Rendering footnotes in tables and lists with FOP

[UPDATE: Meanwhile, FOP-1.0 has been released, which fixes the bug that informed this post. The workaround described below thus is only relevant for users of FOP versions 0.92 to 0.95. For the happiest FOPping experience, stop reading here and grab your copy of FOP-1.0!]

[…or skip the discussion and just download the files]

The reason

During the past couple of years, I’ve gathered some experience working with XML and related standards (XSLT, XSL-FO, XQuery). Part of our professional document production chain involves rendering PDF output from XML sources. I’ve been a big fan of Apache’s open source FOP processor since its now ancient version 0.20.5. Although the FOP code has been substantially revised and improved since then, the versions up to 0.95 were haunted by one serious bug, which kept me from switching to an up-to-date version of FOP: footnotes inside lists or table cells got swallowed in PDF output.

On the other hand, FOP’s XSL-FO compliance rate has risen substantially in the recent versions, prompting me to find a way of dealing with this nasty show-stopper. Of course, I hope the FOP developers will be able to resolve this issue soon. In the meantime, I think I’ve found a way of circumventing (or at least alleviating) the problem at stylesheet level, not at Java code level. Moreover, I think this approach might help other users as well, and other users might help improve this approach where it falls short.

Hence this initial blog post, in a mild blend of self-documentation and altruism. It will be quite technical and specific, but I hope to get the message across. At least I’ll try, by:

  • starting from a stable, simple example. For (my own) convenience’s sake, I’ve crafted a TEI P5 example structure, since that’s where my expertise lies.
  • illustrating intermediate steps with (pointers to) XSL-FO code examples
  • illustrating the results with screenshots of corresponding PDF output of the examples

I’ve categorised this blog post under XML, XSL-FO, and XSLT as well. Although I’ll focus on the (theoretical) XSL-FO side of a solution, I’ll provide a link to a zip package containing sample XML documents and an XSLT stylesheet illustrating the final stage of this solution. However, some problems remain for which I don’t have an immediate solution. Therefore I welcome any comments.

The problem

Simply put, FOP (0.92-0.95) had trouble rendering fo:footnote areas occurring within fo:table or fo:list-block areas, an issue which is formally documented as a bug. At the time of writing this post, the comments in the bug tracker suggested that a) the bug wouldn’t be resolved anytime soon and b) there was no known workaround. Consider, for example, this standard TEI XML list containing a footnote:

<list xmlns="http://www.tei-c.org/ns/1.0">
  <item>Case 1: list[1]/item[1]
    <note>Case 1: list[1]/item[1]/note[1]</note>
  </item>
  <item>Case 1: list[1]/item[2]</item>
  <item>Case 1: list[1]/item[3]</item>
</list>


When transformed to a corresponding standard XSL-FO structure:

<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
    provisional-distance-between-starts="50pt"
    provisional-label-separation="10pt"
    start-indent="from-parent(start-indent) + 5pt">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1: list[1]/item[1]
        <fo:footnote>
          <fo:inline font-size="8pt" vertical-align="super">1</fo:inline>
          <fo:footnote-body
              font-size="10pt" space-after="0.5em" end-indent="0px"
              start-indent="0px" text-align="start"
              font-style="normal" font-weight="normal">
            <fo:list-block>
              <fo:list-item>
                <fo:list-item-label end-indent="label-end()">
                  <fo:block>1</fo:block>
                </fo:list-item-label>
                <fo:list-item-body start-indent="body-start()">
                  <fo:block>Case 1: list[1]/item[1]/note[1]</fo:block>
                </fo:list-item-body>
              </fo:list-item>
            </fo:list-block>
          </fo:footnote-body>
        </fo:footnote>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1: list[1]/item[2]</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1: list[1]/item[3]</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>

…the PDF output will show the footnote marker in the first list item, but NOT the footnote body at the bottom of the page! The same happens when tables are involved:

[Figures: footnote problem in a list; footnote problem in a table]

Of course, this is quite uncomfortable: all goes well as long as the input documents don’t contain any footnotes inside tables or lists. But who can / wants to guarantee that?

Proposal 1: "relative endnotes"

One way to avoid this problem is to generate the fo:footnote formatting objects for those footnotes outside the areas of their containing lists and tables. The key to this approach lies in this characteristic of the fo:footnote formatting object:

The fo:footnote formatting object does not generate any areas. The fo:footnote formatting object returns the areas generated and returned by its child fo:inline formatting object.

This means that a fo:footnote will only produce in-line areas through the contents of its fo:inline footnote marker. Consequently, if this marker is left empty, a fo:footnote will only generate an out-of-line block area at the bottom of the page.

For tables and lists containing footnotes, this suggests the following approach:

  1. Process all contents as usual, except for footnotes. For the latter, instead of outputting a fo:footnote formatting object, just generate a fo:inline "dummy" marker, nothing more.
  2. Create a separate fo:block after the affected table / list, containing all footnotes. For each footnote, generate a complete fo:footnote structure, but leave the fo:inline footnote markers empty.

Applied to the previous example, this would generate the following XSL-FO fragment:

<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
    provisional-distance-between-starts="50pt"
    provisional-label-separation="10pt"
    start-indent="from-parent(start-indent) + 5pt">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1': list[1]/item[1]
        <fo:inline font-size="8pt" vertical-align="super">1</fo:inline>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1': list[1]/item[2]</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1': list[1]/item[3]</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
<fo:block>
  <fo:footnote>
    <fo:inline font-size="8pt" vertical-align="super"/>
    <fo:footnote-body font-size="10pt"
        space-after="0.5em" end-indent="0px"
        start-indent="0px" text-align="start"
        font-style="normal" font-weight="normal">
      <fo:list-block>
        <fo:list-item>
          <fo:list-item-label end-indent="label-end()">
            <fo:block>1</fo:block>
          </fo:list-item-label>
          <fo:list-item-body start-indent="body-start()">
            <fo:block>Case 1': list[1]/item[1]/note[1]</fo:block>
          </fo:list-item-body>
        </fo:list-item>
      </fo:list-block>
    </fo:footnote-body>
  </fo:footnote>
</fo:block>

…and look: when rendered to PDF, this time the ‘footnote’ is present (although technically, it’s rather a "relative endnote" to its containing list)!

[Figures: relative endnote in a list; relative endnote in a table]

Evaluation

This approach, inspired by other solutions to vertical alignment issues, seems quite elegant (simple) and efficient (powerful enough for nested tables / lists). However, it comes with some catches:

  • Theoretically, it works best for short lists or tables that don’t span different pages. For longer ones, however, all footnote bodies will end up after their containing table / list (as a kind of end notes to the table / list).
  • In practice, there is a case where FOP seems to choke: when the number of footnotes in a list or table grows too large, FOP hangs.

Under these provisos, this "relative endnote" approach might strike a fine balance between an efficient solution for most cases, and a sufficient compromise for longer lists or tables. However, the edge case where FOP reveals troubles with long tables / lists containing many footnotes leaves me uneasy.

Proposal 2: "relative footnotes"

This approach takes the reasoning one step further. It starts from 2 observations:

  1. As seen in the "relative endnote" approach, fo:footnote formatting objects with empty fo:inline footnote markers can be inserted without generating any extra whitespace between blocks.
  2. As with other block formatting objects, tables and lists can be stacked under each other without whitespace as if they formed a whole, if they have appropriate space, margin and padding properties.

From this it follows that if we can simulate footnotes in tables / lists via endnotes that are invisible as such (i.e. they don’t generate extra inline areas under the affected tables / lists), it is equally possible to have endnotes per list item / table row. In other words: if tables or lists containing notes can be split up into atomic chunks, each atomic chunk can be followed by the relative endnotes it contains. Since lists and tables are composed of horizontal areas (list items and rows), we could treat each of those separately, create a single-item list or single-row table for each of them, and output the footnotes as relative endnotes to these tables / lists.

For the previous example, this would generate 3 separate lists, each containing 1 single item, of which the first one will be followed by a "relative endnote" block:

<fo:list-block provisional-distance-between-starts="50pt"
    provisional-label-separation="10pt"
    start-indent="from-parent(start-indent) + 5pt">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1': list[1]/item[1]
        <fo:inline font-size="8pt" vertical-align="super">1</fo:inline>
      </fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
<fo:block>
  <fo:footnote>
    <fo:inline font-size="8pt" vertical-align="super"/>
    <fo:footnote-body font-size="10pt" space-after="0.5em"
        end-indent="0px" start-indent="0px" text-align="start"
        font-style="normal" font-weight="normal">
      <fo:list-block>
        <fo:list-item>
          <fo:list-item-label end-indent="label-end()">
            <fo:block>1</fo:block>
          </fo:list-item-label>
          <fo:list-item-body start-indent="body-start()">
            <fo:block>Case 1': list[1]/item[1]/note[1]</fo:block>
          </fo:list-item-body>
        </fo:list-item>
      </fo:list-block>
    </fo:footnote-body>
  </fo:footnote>
</fo:block>
<fo:list-block provisional-distance-between-starts="50pt"
    provisional-label-separation="10pt"
    start-indent="from-parent(start-indent) + 5pt">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1': list[1]/item[2]</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
<fo:list-block provisional-distance-between-starts="50pt"
    provisional-label-separation="10pt"
    start-indent="from-parent(start-indent) + 5pt">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block></fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Case 1': list[1]/item[3]</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
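Output of this shape could be produced by an XSLT 2.0 sketch like the one below. It handles only the single-level case, so it does not rebuild any superstructure for nested lists or tables. The mode name `relative-endnote`, the numbering via `xsl:number`, the explicit `space-before` / `space-after` values and the simplified footnote body are assumptions for illustration, not taken from the accompanying stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Single-level list with notes: one single-item list-block per item,
       each followed by the "relative endnotes" of that item (if any). -->
  <xsl:template match="tei:list[tei:item//tei:note]">
    <xsl:for-each select="tei:item">
      <fo:list-block provisional-distance-between-starts="50pt"
          provisional-label-separation="10pt"
          space-before="0pt" space-after="0pt">
        <fo:list-item>
          <fo:list-item-label end-indent="label-end()">
            <fo:block/>
          </fo:list-item-label>
          <fo:list-item-body start-indent="body-start()">
            <fo:block><xsl:apply-templates/></fo:block>
          </fo:list-item-body>
        </fo:list-item>
      </fo:list-block>
      <xsl:if test=".//tei:note">
        <fo:block>
          <xsl:apply-templates select=".//tei:note" mode="relative-endnote"/>
        </fo:block>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

  <!-- In the running text of a list, a note only leaves a visible marker. -->
  <xsl:template match="tei:list//tei:note">
    <fo:inline font-size="8pt" vertical-align="super">
      <xsl:number level="any" count="tei:note"/>
    </fo:inline>
  </xsl:template>

  <!-- The footnote proper: empty inline marker, numbered body. -->
  <xsl:template match="tei:note" mode="relative-endnote">
    <fo:footnote>
      <fo:inline/>
      <fo:footnote-body font-size="10pt" start-indent="0pt" end-indent="0pt">
        <fo:block>
          <xsl:number level="any" count="tei:note"/>
          <xsl:text> </xsl:text>
          <xsl:apply-templates/>
        </fo:block>
      </fo:footnote-body>
    </fo:footnote>
  </xsl:template>

</xsl:stylesheet>
```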

Of course, this approach is easy enough for single-level lists or tables, but requires further consideration for nested ones. In this case, it is necessary not just to wrap up each list item into its own single-item list, but to mimic its complete superstructure. Otherwise indentation will be completely lost and it will be impossible to get the horizontal and vertical alignment straight.

Therefore, a complete treatment for "relative footnotes" should:

  1. For each table / list with footnotes, defer further processing to their child rows / items.
  2. For each table row / list item, reconstruct the complete superstructure (up to the last table / list) in which it occurs.
  3. For each step in this reconstruction, only output the corresponding XSL-FO structure without text contents. Only for the last step (ie. the row / item under consideration), output the text contents.
  4. If the table row / list item contains footnotes, output these as relative endnotes in a block after the table / list.
  5. If the table row / list item contains further nested tables or lists, repeat steps 1 to 5 for each of these.

This reconstruction can be done more easily than ever with an XSLT 2.0 stylesheet and its native capabilities to assign intermediate node sets to variables. It requires some thought for proper treatment of padding settings and especially table borders, but since that’s mainly an XSLT matter, I won’t go into details here. For brevity’s sake, I won’t give an XSL-FO code example either (things soon get very verbose), but instead provide a screenshot of a complex case:

[Figure: relative footnotes with mixed nesting tables and lists]

Evaluation

From a theoretical angle, this "relative footnote" approach is appealing because it’s more complete than the "relative endnote" approach and avoids the problem FOP seems to have when the number of footnotes in lists or tables grows too large.

On the other hand, it complicates processing substantially, for dealing with nested lists and tables requires careful thinking about padding and border properties. Moreover, it generates a heavy load of XSL-FO structures: each table row / list item will mirror its complete superstructure, which gets proportionally complex and verbose as the nesting level increases. This concern can be lessened by strategies for minimisation:

  1. only applying the "relative footnote" approach to tables and lists containing footnotes, while treating all others regularly
  2. for tables / lists containing footnotes, only apply the "relative footnotes" technique to those rows / items with footnotes, while grouping others in regular tables / lists
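Both minimisation strategies could be sketched with XSLT 2.0 grouping, again for single-level lists only. The plain `tei:list` template covers strategy 1, while `xsl:for-each-group` with `group-adjacent` covers strategy 2 by keeping runs of footnote-free items together in one regular list. The mode name, the numbering and the simplified footnote body are assumptions for illustration; the accompanying stylesheet does not implement this optimisation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Strategy 1: lists without notes are rendered as one regular list. -->
  <xsl:template match="tei:list">
    <fo:list-block provisional-distance-between-starts="50pt"
        provisional-label-separation="10pt">
      <xsl:apply-templates select="tei:item"/>
    </fo:list-block>
  </xsl:template>

  <!-- Strategy 2: in lists with notes, only items that carry notes are
       split off; adjacent note-free items share one regular list. -->
  <xsl:template match="tei:list[tei:item//tei:note]">
    <xsl:for-each-group select="tei:item"
        group-adjacent="exists(.//tei:note)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <!-- Items with notes: single-item list plus relative endnotes. -->
          <xsl:for-each select="current-group()">
            <fo:list-block provisional-distance-between-starts="50pt"
                provisional-label-separation="10pt">
              <xsl:apply-templates select="."/>
            </fo:list-block>
            <fo:block>
              <xsl:apply-templates select=".//tei:note"
                  mode="relative-endnote"/>
            </fo:block>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <!-- Run of note-free items: one ordinary list. -->
          <fo:list-block provisional-distance-between-starts="50pt"
              provisional-label-separation="10pt">
            <xsl:apply-templates select="current-group()"/>
          </fo:list-block>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="tei:item">
    <fo:list-item>
      <fo:list-item-label end-indent="label-end()">
        <fo:block/>
      </fo:list-item-label>
      <fo:list-item-body start-indent="body-start()">
        <fo:block><xsl:apply-templates/></fo:block>
      </fo:list-item-body>
    </fo:list-item>
  </xsl:template>

  <!-- Visible marker in the item text. -->
  <xsl:template match="tei:list//tei:note">
    <fo:inline font-size="8pt" vertical-align="super">
      <xsl:number level="any" count="tei:note"/>
    </fo:inline>
  </xsl:template>

  <!-- Footnote with empty inline marker, output after the item's list. -->
  <xsl:template match="tei:note" mode="relative-endnote">
    <fo:footnote>
      <fo:inline/>
      <fo:footnote-body font-size="10pt" start-indent="0pt" end-indent="0pt">
        <fo:block>
          <xsl:number level="any" count="tei:note"/>
          <xsl:text> </xsl:text>
          <xsl:apply-templates/>
        </fo:block>
      </fo:footnote-body>
    </fo:footnote>
  </xsl:template>

</xsl:stylesheet>
```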

With such optimisations, this approach looks like a promising solution to table/list-related footnote display problems with FOP. However, another problem looms, exposing the line-based approach of this technique as a limitation.

Relative, all too relative

The "relative footnote" approach goes a long way in circumventing the problem, but can’t cope with a further level of complexity. Consider the case where one table row contains multiple nesting tables or lists.

[Figure: parallel nesting tables]

The line-based nature of the "relative footnote" approach won’t be able to cope well when multiple levels of parallel lines are involved, as is the case with parallel nesting tables. In this case, alignment is the problem: within the containing table row (row 2 in this case), both cells align with each other. However, in their nesting tables, rows will align independently of their parallel counterparts, depending on the number of rows and the length of their text contents. A strict line-based approach would force all rows at all levels to align with each other, producing unwanted whitespace as in the following example (mocked up in Word because I’m undecided whether implementing it in the XSLT stylesheet is worth the trouble):

[Figure: parallel nesting tables with improper alignment]

This is where I’m stuck at the moment. I don’t see clearly how parallel nesting tables should be treated. It is possible to treat parallel nesting tables correctly in isolation, but integrating them in a strictly line-based approach will destroy their independent alignment.

Without modification, the "relative footnote" XSLT stylesheet will render the previous example as follows:

[Figure: parallel nesting tables processed sequentially]

While illustrating the correct logic of the stylesheet, the output clearly is sub-optimal: parallel nesting tables are processed sequentially, producing complete superstructure-tables for each row. I don’t have any clear ideas yet, but maybe some position-related properties could be worth investigating further:

  • if the width of the outer tables containing parallel nesting tables would be reduced to the required column width, maybe floating the latter on the right hand side of the former table could reconstruct their juxtaposed nature. Unfortunately, floating properties are not yet supported by FOP.
  • if both tables could be ‘laid over’ each other with absolute positioning, this could reconstruct their juxtaposed nature. My current knowledge about position-absolute doesn’t allow me even to predict if this route is worth investigating.

However undesirable from a theoretical point of view, I currently consider this the end point of my quest, given that tables with footnotes and parallel nesting tables might be rare in the wild. Anyway, I hope to have found a solution for this problem before I have to tackle it (or rather hope that FOP will soon have its footnote treatment properly fixed). In the meantime, I welcome any comments!

Wrapping up: conclusion and files

When processing PDF output from XML documents with FOP 0.92+, a serious drawback is the omission of footnotes occurring in tables or lists. In this post, 2 possible strategies were explored to circumvent the problem:

  1. "relative endnotes" approach: convert footnotes in tables or lists to endnotes to the affected table or list
       pro:    simple
       contra: limited: footnotes appear as relative endnotes
               FOP hangs when the number of footnotes grows too large
  2. "relative footnotes" approach: convert tables or lists with footnotes to stacks of atomic tables or lists with relative endnotes
       pro:    footnotes appear as genuine footnotes
       contra: verbose, complex
               conceptual problem with parallel nesting tables

Get the files

XML and XSLT files illustrating the "relative footnotes" approach can be found here. This zip file contains the following files:

  1. notetest.xml: a sample XML file containing 6 cases with reference solutions
  2. notetest2.xml: a less carefully compiled sample XML file
  3. notetest.xsl: an XSLT stylesheet demonstrating the "relative footnotes" technique

The XSLT stylesheet will only apply the "relative footnotes" technique to tables or lists with footnotes, but currently doesn’t apply any internal optimisation to the latter (as suggested above).

In the XSLT stylesheet, the sub-optimal treatment of parallel nesting tables is kept as a kind of stub for further work.

Any comments are much appreciated!

4 Responses to Rendering footnotes in tables and lists with FOP

  1. This is an interesting workaround, however it won’t work if the table or list-block extends past one page. If I am wrong, let me know. I am looking at implementing markers as a work-around.

    • rvdb says:

      No, I think it works even then: both test files (notetest1.xml and notetest2.xml) contain examples, I believe. Could you expand on what you believe goes wrong? If you have documentation about your markers-approach, I’d be interested to read about it (or link from my post).

      Nevertheless, if you have the chance: do update to FOP-1.0, which has fixed this issue of disappearing footnotes that triggered my post.

      • Hi, I spent a few hours working on the marker approach. I got it working, except for one issue: if there is more than one footnote on the page, the footnote texts bleed into each other. So I scrapped that idea. I think your work-around will work for list items, but tables are my big issue. Since neither 0.95 nor 1.0 supports table-continued, I put almost all my paras in a table and use a system of markers to display “Para Title – Continued”, which is an output requirement.

        As for updating to 1.0: my project is huge, and was originally updated from 0.20. However, updating is not out of the question. But when I tried running my footnotes in my tables, they still didn’t work in 1.0 either. I will give your sample a try and see if I can reproduce the problem. Thx

  2. Ok, so I ran some tests with your examples. I like your approach. I have a number of tricks I use in FOP to get my output. Unfortunately, the way I render my tables with markers would not allow me to use this approach. I see you output a table for each row. This would throw off my markers :(. Nevertheless, an excellent work-around for others.
