Secrets of ePub: Out-of-line XML islands and fallbacks

by Liza Daly

This is the first in a series of posts describing some of the lesser-known features in the ePub spec. Many of them aren’t implemented by any readers yet, but they can provide some interesting functionality.

The first is “fallbacks.” As ePub developers know, only a small handful of content types are supported by the spec: XHTML, DTBook, SVG, and a few binary image formats. Missing from this list are any kinds of video or audio, as well as other types of XML. This restriction leads to the assumption that ePub books can’t support these features.

In fact, any type of content can be included in an ePub, including both video and audio, as long as a “fallback” is included. A fallback is an equivalent representation of that content in one of the supported formats.

Even novice HTML authors are familiar with the fallback concept: the same idea is expressed in the IMG element with “alt” text. Just as the alt attribute specifies a textual representation of the image, a fallback in ePub expresses the content in a natively supported format for users who can’t view the real thing.
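
As a rough sketch of the idea in OPF terms, a video could be declared in the manifest with a fallback attribute naming an XHTML item. The ids and file names here (clip, clip.mp4, clip-fallback.html) are hypothetical, and video/mp4 is just one plausible media type:

    <manifest>
      <!-- Non-core content: a video (hypothetical file name) -->
      <item id="clip"
            href="clip.mp4"
            fallback="clip-fallback"
            media-type="video/mp4"/>

      <!-- Core-type fallback: an XHTML page standing in for the video -->
      <item id="clip-fallback"
            href="clip-fallback.html"
            media-type="application/xhtml+xml"/>
    </manifest>

A reading system that can’t play the video would be expected to present the XHTML page instead.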

A common request of educational publishers is to be able to include MathML, a type of XML, in mathematics books. MathML is not directly supported in ePub (though it is likely to be added in the future). However, it is possible to include a reference to a MathML file in the ePub book, perhaps as an “examples” page, with a fallback to an XHTML equivalent (perhaps one which includes static image examples).

Here’s some sample MathML:

    <math>
      <apply>
        <minus/>
        <apply>
          <times/>
          <ci>x</ci>
          <apply>
            <plus/>
            <apply>
              <divide/>
              <ci>a</ci>
              <ci>b</ci>
            </apply>
            <ci>c</ci>
          </apply>
        </apply>
        <cn>1</cn>
      </apply>
    </math>

In a MathML-capable rendering system this will appear as:

[Screenshot: the rendered expression x(a/b + c) - 1]

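In the OPF, you declare the MathML item and its fallback in the manifest, then reference the MathML item from the spine:

    <manifest>
      <!-- other items, including the NCX with id="ncx", omitted -->

      <!-- MathML; the fallback attribute names the id of the fallback item -->
      <item id="math"
            href="math.xml"
            fallback="math-fallback"
            media-type="application/mathml+xml"/>

      <!-- XHTML 1.1 fallback -->
      <item id="math-fallback"
            href="fallback.html"
            media-type="application/xhtml+xml"/>
    </manifest>

    <!-- Only the MathML item is listed in the spine -->
    <spine toc="ncx">
      <itemref idref="math"/>
    </spine>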
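
For completeness, here is a minimal sketch of what fallback.html might contain, assuming a static image of the expression is used as suggested earlier. The image name math.png is hypothetical and does not appear in the manifest above:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
      "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>MathML example (fallback)</title>
      </head>
      <body>
        <!-- math.png is a hypothetical static rendering of the expression -->
        <p><img src="math.png" alt="x(a/b + c) - 1"/></p>
      </body>
    </html>

If an image like this is used, it would also need its own manifest item with an image media type such as image/png.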

The specification is unclear about how an item with a fallback should be referenced in the NCX file; logically, I should be able to point to the main document and let the reading system handle the fallback:

    <navPoint id="navpoint-1" playOrder="1">
      <navLabel>
        <text>MathML example</text>
      </navLabel>
      <content src="math.xml"/>
    </navPoint>

…but this doesn’t validate in epubcheck:

    ERROR: ../math.epub/OEBPS/toc.ncx(18): hyperlink to non-standard resource 'OEBPS/math.xml' of type 'application/mathml+xml'

I’m not sure if I’m declaring it wrong or if epubcheck should allow this case.

Note that the above will not work in Bookworm, which doesn’t recognize MathML as something it can display (even though, running in Firefox, it could in fact render the markup above). Adobe Digital Editions displays nothing at all: it can’t render the MathML, but it also doesn’t fall back to the XHTML equivalent.

The above is an example of fallbacks used with out-of-line XML islands, but it is also possible to embed external formats as inline XML islands.
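
For comparison, here is a rough sketch of what an inline island can look like inside an XHTML content document, based on the switch mechanism in the OPS 2.0 spec. The id and the trivial expression are placeholders, and the markup is untested against any reading system:

    <html xmlns="http://www.w3.org/1999/xhtml"
          xmlns:ops="http://www.idpf.org/2007/ops">
      <head>
        <title>Inline MathML example</title>
      </head>
      <body>
        <ops:switch id="math-switch">
          <!-- Used only if the reading system understands the MathML namespace -->
          <ops:case required-namespace="http://www.w3.org/1998/Math/MathML">
            <math xmlns="http://www.w3.org/1998/Math/MathML">
              <apply>
                <plus/>
                <ci>a</ci>
                <ci>b</ci>
              </apply>
            </math>
          </ops:case>
          <!-- Core XHTML shown by every other reading system -->
          <ops:default>
            <p>a + b</p>
          </ops:default>
        </ops:switch>
      </body>
    </html>

Here the fallback lives in the same file as the foreign markup, rather than in a separate manifest item.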