Threepress Consulting blog

Threepress job opening: work on Ibis Reader

by Liza Daly

(Update August 1, 2011: This position has been filled but we are always interested in hearing from applicants who meet the requirements below and are interested!)

Threepress Consulting has an immediate opening for a full-time Python and JavaScript software engineer. You will be joining our team to develop our HTML5 ereading platform, Ibis Reader. You will work remotely, though special consideration will be given to candidates in Massachusetts or California.

Ibis Reader was launched in February, 2010 and provides offline ebook reading and cloud syncing on HTML5-capable mobile devices and any web-connected computer or smartphone. The code base is being actively licensed by other parties and we need to grow the team to continue to enhance the platform.

Please send a resume, code sample (Python and/or JavaScript, preferably on a public repo like GitHub) and salary requirements when applying. US citizens only, no recruiters.

Requirements

  1. Strong Python experience
  2. Django
  3. jQuery
  4. Selenium + Nose / a strong commitment to testing
  5. Mercurial
  6. HTML5 / CSS3 (familiarity)
  7. Familiarity with XML, especially via lxml and/or XSLT

Enthusiasm about digital reading, electronic literature, text-based gaming, library technology, and mobile application development are all pluses.

About the company

Threepress is a profitable software consultancy based in Somerville, MA providing services to the digital publishing industry. We are active contributors to several open source projects. Threepress offers competitive salaries, retirement, health care, and an engineering-focused culture.

Contact Info:

Email contact: jobs@threepress.org

EPUB 3 Navigation Document support in Ibis Reader

by Liza Daly

If you’re interested in diving into EPUB 3 right now, Dave Cramer from Hachette Book Group was kind enough to put together a sample EPUB version of Moby Dick featuring an EPUB 3-compatible table of contents (the “EPUB Navigation Document”), some examples of the new metadata, and a media-overlay sample of synced audio and text.

This particular EPUB 3 document is designed to be backwards-compatible, so it will be readable in most EPUB 2 reading systems as well (even ADE!) — though the EPUB 3-only magic requires full reading system support. I suspect we’ll see a lot of these hybrid documents in the ecosystem, and the EPUB Working Group should be commended for making backwards compatibility a core design goal.

Dave’s version of the book will work in the public Ibis Reader because it’s backwards-compatible, but after putting in some support for the new EPUB Navigation Document (END) format, I wanted to be absolutely sure that Ibis could read the new format and not fall back to the NCX.

EPUB 3 document with NCX/EPUB 2 fallback

This is how the hybrid book declares both the new navigational document (“END”) and the old NCX. If you were generating EPUB programmatically, it should be trivial to produce both EPUB navigation documents from the same source.

<manifest>
    <!-- The EPUB Navigation Document is identified by a properties attribute with the value "nav" -->
    <item id="toc" properties="nav" href="toc.xhtml" media-type="application/xhtml+xml"/>

    <!-- The EPUB 2 NCX can also be included for backwards-compatibility -->
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
     ...
</manifest>

<!-- Backwards compatibility also requires the @toc attribute on the spine -->
<spine toc="ncx">
...

EPUB 3 without any EPUB 2 fallbacks

In an EPUB 3-only document you can simply have:

<manifest>
    <!-- The EPUB Navigation Document is identified by a properties attribute with the value "nav" -->
    <item id="toc" properties="nav" href="toc.xhtml" media-type="application/xhtml+xml"/>
     ...
</manifest>
<spine>
...
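To make the fallback behavior concrete, here is a minimal Python sketch of how a reading system might pick a table of contents from either kind of package. It is not Ibis Reader's code. The function name `find_toc_href` and the `HYBRID` sample are made up for illustration, and the sketch assumes the package elements live in the usual OPF namespace.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

def find_toc_href(opf_xml):
    """Return (kind, href) for the table of contents declared in an OPF package."""
    root = ET.fromstring(opf_xml)
    items = root.findall("opf:manifest/opf:item", OPF_NS)
    # Prefer the EPUB 3 Navigation Document; properties is a space-separated list.
    for item in items:
        if "nav" in item.get("properties", "").split():
            return ("nav", item.get("href"))
    # Otherwise fall back to the NCX item named by the spine's toc attribute.
    spine = root.find("opf:spine", OPF_NS)
    ncx_id = spine.get("toc") if spine is not None else None
    for item in items:
        if ncx_id is not None and item.get("id") == ncx_id:
            return ("ncx", item.get("href"))
    return (None, None)

# Hypothetical hybrid package, modeled on the manifest shown above.
HYBRID = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="toc" properties="nav" href="toc.xhtml" media-type="application/xhtml+xml"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx"/>
</package>"""

print(find_toc_href(HYBRID))
```

Checking for the nav item first means a hybrid book uses the new format whenever it is present, and the NCX is only consulted when it is not.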

I took Dave’s document and removed all the NCX references for testing purposes. Now even without the NCX, our internal EPUB 3 development branch can still read the book’s table of contents:

Screenshot of a development version of Ibis Reader showing an EPUB 3 document

It will take some time before we can publicly release EPUB 3 support, but it’s coming!

Welcome Chuck Ha!

by Liza Daly

We’re very happy to have another talented software engineer on board.

Chuck Ha is a Software Engineer with Threepress. He has been building web applications for several years across a variety of platforms and languages, though his favorites include JavaScript and Python. He strives for elegant and well-tested code. Chuck enjoys contributing to open source projects, solving problems, and anything related to the ocean, including how it tastes.

One of our goals for this summer is to rapidly add EPUB 3 support to Ibis Reader; Chuck is helping us clear out some old bugs and then will move on to adding some of the interesting new features of the next revision of the EPUB spec.

Transforming NCX into EPUB 3 Navigation Documents

by Keith Fahlgren

The first version of EPUB used the NCX format to describe an accessible, machine-readable Table of Contents. NCX had come to EPUB from DAISY’s DTBook standard and was a crucial navigational aid. Unfortunately, NCX was rarely understood and is not very human-readable. As part of the alignment with wider web standards, EPUB 3 has dropped the NCX format and encodes the same information in a specialization of XHTML, the EPUB Navigation Document.

Moving away from specialized ebook-only solutions was a big part of EPUB 3, so I am quite interested to see what these new EPUB Navigation Documents look like in the real world. It seemed like the easiest way to create a lot of them was to transform NCX files into the new format, so I’ve written an open-source (BSD) stylesheet to do just that:

ncx2end-0.1.xsl

Note: This ncx2end-0.1.xsl is alpha-quality software in the worst possible way—it probably won’t work correctly on your documents and is hard to use. It does produce apparently-valid output for the 100+ NCX files I had around, but I would not put much faith in that today. Enjoy!

If you find this tool useful enough to discover an error, please submit a bug report and make sure to attach your NCX file to improve the test suite.
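For readers who want a feel for the shape of the transformation, here is a rough Python sketch of the core idea: walk the NCX navMap and emit nested ol/li/a elements inside a nav element. It is not a port of ncx2end-0.1.xsl. The function names are invented for illustration, and the sketch ignores pageList, navList, playOrder and everything else a real converter would need to handle.

```python
import xml.etree.ElementTree as ET

NCX = "{http://www.daisy.org/z3986/2005/ncx/}"
XHTML = "{http://www.w3.org/1999/xhtml}"
EPUB = "{http://www.idpf.org/2007/ops}"

def nav_points_to_ol(parent):
    """Build an <ol> from the navPoint children of parent, recursing into nested points."""
    ol = ET.Element(XHTML + "ol")
    for point in parent.findall(NCX + "navPoint"):
        li = ET.SubElement(ol, XHTML + "li")
        a = ET.SubElement(li, XHTML + "a")
        # Assumes every navPoint has a content element, as well-formed NCX files do.
        a.set("href", point.find(NCX + "content").get("src"))
        a.text = point.findtext(NCX + "navLabel/" + NCX + "text", default="").strip()
        if point.find(NCX + "navPoint") is not None:
            li.append(nav_points_to_ol(point))
    return ol

def ncx_to_nav(ncx_xml):
    """Return a <nav epub:type="toc"> element built from an NCX navMap."""
    nav_map = ET.fromstring(ncx_xml).find(NCX + "navMap")
    nav = ET.Element(XHTML + "nav")
    nav.set(EPUB + "type", "toc")
    nav.append(nav_points_to_ol(nav_map))
    return nav
```

The returned element still has to be wrapped in a complete XHTML document before it can be listed in a manifest.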

Validating EPUB 3 experiments

by Keith Fahlgren

EPUB 3 is tricky to experiment with today. Like any brand-new specification, there aren’t many of the resources we often take for granted, from books to software to validation tools. However, if you’re already comfortable getting your hands dirty you can get meaningful validation for your EPUB 3 documents now. In the future, we’ll probably have a dedicated EPUB 3 validation tool (modeled somewhat on epubcheck, although with quite a few changes, I hope), but I’d like to start working today. This post outlines how.

Note: I’m going to give examples using a number of bare-metal tools available on Mac OS X. These are probably portable to Linux and even Windows if you were motivated, but I’m not going to explain how to install them or set them up (here or in the comments). Google is your friend.

IDPF Digital Book/BEA 2011: Highly-Accessible Interactive EPUB

by Liza Daly

Slides from my talk on creating accessible interactive ebooks with EPUB 3 are available.

Copy Editors in EPUB 3

by Keith Fahlgren

The new flexibility in metadata for EPUB 3 is a strength. Instead of locking down the set of permitted metadata schemes and schemas (as EPUB 2 did), it allows us to use a range of definitions. While this flexibility comes with some dangers (see the discussion in the comments here about what to do when there isn’t yet a good external definition), I’m generally not worried about giving the masses power tools (most folks are smarter than you think).

In particular, I’ve always been annoyed that in EPUB’s metadata I could not give proper, specific credit to the many people that make a contribution to a title, even if they are not on the cover. Here’s the cover of Responsive Web Design:

cover of Responsive Web Design

but here’s the set of contributors in the EPUB’s metadata:

<dc:creator opf:file-as="Marcotte, Ethan">Ethan Marcotte</dc:creator>
<dc:contributor opf:file-as="Zeldman, Jeffrey"
                opf:role="pbl">Jeffrey Zeldman</dc:contributor>
<dc:contributor opf:file-as="Santa Maria, Jason"
                opf:role="bkd">Jason Santa Maria</dc:contributor>
<dc:contributor opf:file-as="Brown, Mandy"
                opf:role="edt">Mandy Brown</dc:contributor>
<dc:contributor opf:file-as="Cederholm, Dan"
                opf:role="edt">Dan Cederholm</dc:contributor>
<dc:contributor opf:file-as="Stevens, Krista"
                opf:role="edt">Krista Stevens</dc:contributor>
<dc:contributor opf:file-as="Egan, Neil"
                opf:role="cmt">Neil Egan</dc:contributor>
<dc:contributor opf:role="mrk">Threepress Consulting Inc.</dc:contributor>

As you can see, there are a lot of people credited in the metadata, and we’ve credited each one as a Dublin Core contributor with a reference to a MARC Relator code. Specifying these contributors using these external definitions (both of which have huge amounts of both consensus and uptake behind them) means that a lot of software can understand it mechanically (including software and people outside of the US or English-speaking world, with some limitations). However, consensus sometimes leaves us impoverished: in this case there are three people all labeled as Editor, but the copyright page from the book itself shows us their true roles:

Publisher: Jeffrey Zeldman
Designer: Jason Santa Maria
Editor: Mandy Brown
Technical Editor: Dan Cederholm
Copyeditor: Krista Stevens
Compositor: Neil Egan

With the ability in EPUB 3 to reference a different source for these editor definitions, we can actually differentiate Dan Cederholm’s role as Technical Editor and Krista Stevens’ as Copy Editor:

<meta property="dcterms:contributor"
      id="technicaleditor">Dan Cederholm</meta>
<meta about="#technicaleditor"
      property="file-as">Cederholm, Dan</meta>
<meta about="#technicaleditor"
      property="role"
      id="technicaleditor-role">technicaleditor</meta>
<meta about="#technicaleditor-role"
      property="scheme"
      datatype="xsd:anyURI">http://www.docbook.org/xml/5.0/rng/docbook.rng</meta>

<meta property="dcterms:contributor"
      id="copyeditor">Krista Stevens</meta>
<meta about="#copyeditor"
      property="file-as">Stevens, Krista</meta>
<meta about="#copyeditor"
      property="role"
      id="copyeditor-role">copyeditor</meta>
<meta about="#copyeditor-role"
      property="scheme"
      datatype="xsd:anyURI">http://www.docbook.org/xml/5.0/rng/docbook.rng</meta>

In the above example, we’ve described a few different things about these folks using the new EPUB 3 metadata syntax. First, we’ve referred to them as a contributor:

<meta property="dcterms:contributor"
      id="technicaleditor">Dan Cederholm</meta>

Next, we’ve specified how to sort their names with a file-as property:

<meta about="#technicaleditor"
      property="file-as">Cederholm, Dan</meta>

Finally, we’ve accomplished our goal of saying that Dan Cederholm’s contribution role was as a Technical Editor, as defined by DocBook 5.0:

<meta about="#technicaleditor"
      property="role"
      id="technicaleditor-role">technicaleditor</meta>
<meta about="#technicaleditor-role"
      property="scheme"
      datatype="xsd:anyURI">http://www.docbook.org/xml/5.0/rng/docbook.rng</meta>

For some reason, Copy Editors and Technical Editors have long been defined by DocBook, but are omitted from the ONIX Contributor codes, the MARC Relator codes, the PRISM codes, the Z39.86-201x Structural Semantics Vocabulary and every other metadata standard I could find.
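The chain of references in this markup (contributor, then role, then scheme) is easier to see when software follows it. Below is a small Python sketch that resolves those links for each contributor. The function names and the `link_attr` parameter are invented here; `link_attr` defaults to the about attribute used in the markup above and is kept as a parameter in case the packages you handle use a different attribute name for the same link.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip any {namespace} prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]

def refinements(metas, target_id, link_attr):
    """Map property name -> meta element for every meta that points at target_id."""
    if not target_id:
        return {}
    return {m.get("property"): m for m in metas if m.get(link_attr) == "#" + target_id}

def contributors_with_roles(metadata_xml, link_attr="about"):
    """List each dcterms:contributor with its file-as value, role and role scheme."""
    root = ET.fromstring(metadata_xml)
    metas = [m for m in root.iter() if local(m.tag) == "meta"]
    result = []
    for m in metas:
        if m.get("property") != "dcterms:contributor":
            continue
        props = refinements(metas, m.get("id"), link_attr)
        role = props.get("role")
        scheme = None
        if role is not None:
            # The scheme hangs off the role meta, one more hop down the chain.
            scheme_meta = refinements(metas, role.get("id"), link_attr).get("scheme")
            scheme = scheme_meta.text if scheme_meta is not None else None
        result.append({
            "name": m.text,
            "file_as": props["file-as"].text if "file-as" in props else None,
            "role": role.text if role is not None else None,
            "scheme": scheme,
        })
    return result

SAMPLE = """<metadata>
  <meta property="dcterms:contributor" id="technicaleditor">Dan Cederholm</meta>
  <meta about="#technicaleditor" property="file-as">Cederholm, Dan</meta>
  <meta about="#technicaleditor" property="role" id="technicaleditor-role">technicaleditor</meta>
  <meta about="#technicaleditor-role" property="scheme"
        datatype="xsd:anyURI">http://www.docbook.org/xml/5.0/rng/docbook.rng</meta>
</metadata>"""

print(contributors_with_roles(SAMPLE))
```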


Note/Warning: If you were going to make an EPUB 3 title today, you’d probably include both the “old” style EPUB metadata for these contributors AND the new EPUB 3 metadata (with the prefer attribute for each set). This would give you the benefit of both backward-compatibility with existing EPUB reading systems and better metadata for the future when EPUB 3 reading systems are more commonplace. Eventually you may just need the EPUB 3 version, and that’s the world I’m talking about here (partially as a small attempt to get us there faster).

Can an author create an EPUB using normal tools? Part 2: Scrivener

by Liza Daly

This is part two of a series on using author-friendly word-processing tools with native EPUB export. Part 1 was about Apple Pages.

Part two: Scrivener

I first became aware of Scrivener’s support for EPUB export in November of 2010, when they released a special edition for NaNoWriMo (and the less said about my novel-writing experiment, the better). EPUB export is now available in the latest major release.

Though Scrivener is primarily a Mac application, there are beta versions for Windows and Linux. I have no idea if they support EPUB export — please comment if you know either way!

The sample document

Scrivener’s interface can be intimidating if you have only worked with relatively straightforward word processors. It’s less of a document editor than a book authoring platform (or the way I would look at it, it’s Visual Studio, not emacs).

Screenshot of Scrivener main screen with numerous panels

For this test I used the “Manuscript” style, which provides many of the utilities that an author of a novel-length book might want, rather than starting with an entirely blank document (analogous to using the EPUB sample document in Pages). For one thing, the Manuscript layout already provides obvious support for structured blocks like chapters and parts. In fact Scrivener segments documents even further into “scenes” which can be relocated easily through the document. Programmers love structured content!

As in the previous post, I set up the sample document to include all of the obvious stylistic elements that I wanted to test. I didn’t attempt to exactly re-create the Pages sample as that would be soul-crushing.

Finding the EPUB export feature can be challenging; it’s actually listed as “Compile.”

Export options

No complaints here. There are dozens of options for EPUB configuration from the menus.

EPUB-specific output options:

Scrivener panel

Select which content segments are included in the book, and in what order:

Scrivener panel

My favorite pane. Look at all that metadata!

Scrivener panel

Output

Scrivener’s output is variable in quality. Some things are nice. It’s valid, for one thing, and I was able to produce this valid output without any special handling — I didn’t change any of the menu items beyond adding the metadata, which indeed trickles down nicely into the OPF file:

        <dc:title>Creating an EPUB sample document</dc:title>
        <dc:identifier id="PrimaryID">urn:uuid:A54DCD13-A6E0-4ACA-8B31-4E0EBA0624EF</dc:identifier>
        <dc:language>en</dc:language>
        <dc:creator opf:role="aut">Liza Daly</dc:creator>
        <dc:subject>Non-fiction</dc:subject>
        <dc:description>An EPUB test of Scrivener's output</dc:description>
        <dc:publisher>Threepress Consulting Inc.</dc:publisher>
        <dc:rights>http://creativecommons.org/licenses/by-sa/2.0/deed.en</dc:rights>
        <dc:date>2011-05-29</dc:date>

Whitespace is exported as significant. While this makes the web developer in me cringe (I’d rather see CSS margins used here), it’s acceptable, especially since whitespace is semantically relevant in fiction, as when used for scene breaks:

<p class="p1"><br /></p>
<p class="p1"><br /></p>
<p class="p2">Hello world!</p>
<p class="p2">I have lists and tables!</p>

Unlike Pages, lists are generated as the correct list elements:

<ol class="ol1">
  <li class="li3">Numbered lists are numerous.</li>
  <li class="li3">We often have more than one item.</li>
</ol>
<ul class="ul1">
  <li class="li3">Make love</li>
  <li class="li3">Not bullets</li>
</ul>

Tables are fine, though I’d like for these tools to have a “table header” style that could ultimately map to th elements. This is a little more readable than the Pages output.

<table cellspacing="0" cellpadding="0" class="t1">
  <tbody>
    <tr>
      <td valign="top" class="td1">
        <p class="p2">Not sure how to make headers</p>
      </td>
      <td valign="top" class="td1">
        <p class="p2">Without just using colors</p>
      </td>
      <td valign="top" class="td1">
        <p class="p2">Not semantic</p>
      </td>
    </tr>

Headers are curiously not optimal:

<h2 style="margin: 0.0px 0.0px 0.0px 0.0px; font: 138% Optima"><b>First, a header.</b></h2>
<h3 style="margin: 0.0px 0.0px 0.0px 0.0px; font: 92% Optima"><b>A subheader.</b></h3>

Why the inline margins and font? Why the b? And like Pages, no apparent way to generate strong and em:

<p class="p2">Back to body text. But with <b>strong</b> and <i>emphasis</i>?</p>
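Until the tools emit semantic inline elements themselves, the renaming can be done as a post-processing step. The following is a minimal Python sketch, assuming well-formed XHTML and assuming that every b really means strong and every i really means em. That second assumption is an editorial judgment the code cannot make for you. The name `semantic_inline` is made up for this example.

```python
import xml.etree.ElementTree as ET

RENAME = {"b": "strong", "i": "em"}

def semantic_inline(xhtml_fragment):
    """Rename b/i elements to strong/em, keeping any namespace prefix intact."""
    root = ET.fromstring(xhtml_fragment)
    for el in root.iter():
        ns, sep, name = el.tag.rpartition("}")
        if name in RENAME:
            el.tag = ns + sep + RENAME[name]
    return ET.tostring(root, encoding="unicode")

print(semantic_inline(
    '<p class="p2">Back to body text. But with <b>strong</b> and <i>emphasis</i>?</p>'
))
```

Run over the header markup above, this would also turn the b inside each h2 and h3 into strong, which is arguably still redundant; stripping it there would be a separate step.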

Drag-and-drop images and internal and external hyperlinking all worked well:

<p class="p2"><span class="s1"><img src="images/droppedImage.png" width="272" alt="Image" /></span><span class="s2">Hello dog.</span></p>
<p class="p4">Inline URL: <a href="http://www.flickr.com/photos/beinecke_library/5166412915/in/set-72157625240109163/">http://www.flickr.com/photos/beinecke_library/5166412915/in/set-72157625240109163/</a></p>
<p class="p4">A hyperlink to <a href="body1.xhtml">Chapter One.</a> A hyperlink to <a href="http://placekitten.com">a website</a>.</p>

I pulled the image out of the Pages EPUB output – Pages renamed it to droppedImage.png, and Scrivener retained the filename, which is a nice touch.

Room for improvement

  • Scrivener seems to generate too many CSS files by default. Uncool! This could potentially generate huge unwieldy EPUB books. There’s no reason to have more than one CSS file in an automatically-converted document, and the styles should be normalized across XHTML files.
  • Like Pages, there was no obvious way to add alt text values to images.
  • Also like Pages, I’d like to be able to customize the elements and class names in the outputted XHTML using simple menus.
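One way to gauge the CSS problem in a given book is to open the EPUB (which is a zip archive) and group its stylesheets by content. Here is a small Python sketch, where `css_report` is a made-up name and the demo archive is built in memory purely for illustration. It only detects byte-identical files, so stylesheets that differ by a single rule still show up as separate groups.

```python
import hashlib
import io
import zipfile

def css_report(epub_file):
    """Group the CSS files in an EPUB (a zip archive) by identical content."""
    groups = {}
    with zipfile.ZipFile(epub_file) as z:
        for name in z.namelist():
            if name.lower().endswith(".css"):
                digest = hashlib.sha256(z.read(name)).hexdigest()
                groups.setdefault(digest, []).append(name)
    return sorted(sorted(names) for names in groups.values())

# Hypothetical archive: two identical stylesheets and one different one.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as z:
    z.writestr("OEBPS/body1.css", "p.p1 { margin: 0 }")
    z.writestr("OEBPS/body2.css", "p.p1 { margin: 0 }")
    z.writestr("OEBPS/cover.css", "img { max-width: 100% }")

for group in css_report(demo):
    print(len(group), group)
```

Any group with more than one name is a candidate for collapsing into a single shared file.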

A pleasant surprise to me was that Scrivener has a lot of options for power-users under the hood, including integration with version control tools like Subversion, use of a text-based markup format for better serialization to other formats, or low-level customization of the CSS or even XSLT: Scrivener advanced topics. This means it’s potentially a better choice for an ambitious digital-only publisher; you can start with the basic WYSIWYG layer, and gradually customize the output by diving into the lower level. This will be especially true once Scrivener is fully cross-platform.

Recommended with reservations (but mainly because of the CSS file issue).

Download the sample EPUB file and Scrivener file (unzip first). (Released under a Creative Commons Attribution license)

Can an author create an EPUB file using normal tools? Part 1: Pages

by Liza Daly

Yes, but it may require a Mac.

The IDPF board met on the last day of the Digital Book 2011 conference at Book Expo America. One of our topics for discussion was what the IDPF as an organization should do to further the adoption of EPUB. I brought up an issue that’s been concerning me for some time: the lack of digital-native authoring tools aimed at authors, not publishers.

If publishers are struggling to produce high-quality EPUB files either via InDesign, XML workflows, or strategic outsourcing, authors are in an even worse place. This is especially true for authors with an ambition to self-publish, or to start a micro-publishing outfit, and yet still retain some creative control over the look of their digital product. InDesign (especially CS5.5) is a great solution for small- to medium-sized publishers who produce both print and digital books, but its feature set is inappropriate for digital-native publishing, and its price and complexity are unsuited for self-publishers.

I’m aware of two document creation tools right now that have native EPUB support and are available for my platform, Mac OS X: Pages, and Scrivener. (The only product I know of on the Windows side is Atlantis. Linux users have to make do with plugins for OpenOffice — judging from the comments in the issue tracker, EPUB export is not a priority, to say the least.)

This post will cover Apple Pages. A subsequent post covers outputting EPUB with Scrivener.

Pages

Apple’s Pages was the first major commercial word processor to include EPUB export. I reviewed the initial EPUB support in August 2010, but it’s been through some updates since then, and I wanted to dive into the semantics of the outputted code more closely.

Screenshot of a sample EPUB file in Apple Pages

The sample document

I started with the Apple-provided EPUB template (more on that later) and added a number of new elements and semantic tests. In particular, I added:

  • Chapters and headings
  • Emphasis and strong text (rendered in Pages as italics and bold)
  • Numbered and unnumbered lists
  • Hyperlinks both internal to the document and external to the web
  • Inline images (by dragging and dropping)
  • A cover page with an image
  • All available metadata in the export pane
  • Tables

In all cases, I used only styles available in the style drawer; I did not change any font sizes or font weight via the toolbar buttons.

The EPUB output

As in my previous test, this produced a valid EPUB 2.0.1 document according to EPUBCheck 1.1. Hooray!

Headers and subheaders

The semantics are much-improved from my first test. Paragraphs are now wrapped in <p> elements, for example, and headers are headers:

  <body>
    <div class="body" style="white-space:pre-wrap">
      <h3>Chapter Two: The Chaptering</h3>
      <p class="s2">This chapter has an introduction. Hello!</p>
      <h4>I’m a subchapter or section under that. </h4>
      <p class="s2">Don’t hold it against me. I just have a lot of things to say.</p>
    </div>
  </body>

However, the white-space:pre-wrap style is curious: the property is meant to specify that whitespace inside the XHTML is significant, meaning that the ereader/web browser should retain it. That is emphatically not a best practice in general text; on the other hand, there was no whitespace in the output at all, so I’m unsure of its purpose. If I were post-processing this EPUB file, I would remove that style.
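That post-processing step could look something like the Python sketch below. It assumes well-formed XHTML and splits declarations naively on semicolons, which is fine for output like the sample above but would trip on values that contain semicolons. The function name `drop_style_declaration` is invented for this example.

```python
import xml.etree.ElementTree as ET

def drop_style_declaration(xhtml, prop="white-space"):
    """Remove one CSS property from every inline style attribute."""
    root = ET.fromstring(xhtml)
    for el in root.iter():
        style = el.get("style")
        if style is None:
            continue
        kept = [d.strip() for d in style.split(";")
                if d.strip() and d.split(":", 1)[0].strip().lower() != prop]
        if kept:
            el.set("style", "; ".join(kept))
        else:
            # Nothing left (or the attribute was empty to begin with): drop it.
            del el.attrib["style"]
    return ET.tostring(root, encoding="unicode")

print(drop_style_declaration(
    '<div class="body" style="white-space:pre-wrap"><p class="s2">Hello!</p></div>'
))
```

As a side effect, style attributes that end up empty are removed entirely rather than left behind as style="".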

I used the “Chapter” style to generate the chapter heading. This header should be an h1 rather than an h3, but at least the subheading is also a header and one step down.

Go boldly

I completely failed to find a way to output strong and em rather than b or i.

Listless

I used the list styles provided in the template, but these are not the lists you’re looking for:

      <h3>Chapter Three: Lists</h3>
      <p class="s2">Reasons why people love lists, in order.</p>
      <p class="s2 s3"><span class="c2">1.</span>Lists are neat.</p>
      <p class="s2 s3"><span class="c2">2.</span>It’s cool to let the computer fill in numbering.</p>
      <p class="s2 s3"><span class="c2">3.</span>Yessir.</p>
      <p class="s2">Other reasons that people like lists, in no particular order:</p>
      <p class="s2 s4"><span class="c3">•</span>Sometimes they have bullets</p>
      <p class="s2 s4"><span class="c3">•</span>Not real bullets.</p>
      <p class="s2 s4"><span class="c3">•</span>Those are scary.</p>

This must be fixed.
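In the meantime, the fake lists are regular enough that a script can rebuild them. This is a rough Python sketch, not a general solution: it assumes the exact pattern shown above (a paragraph whose first child is a span holding either a number followed by a period or a bullet character), assumes the elements are not namespaced, and discards the paragraph classes. The function names are made up.

```python
import re
import xml.etree.ElementTree as ET

def marker_kind(p):
    """Return "ol", "ul", or None depending on the leading marker span of a paragraph."""
    if p.tag != "p" or (p.text or "").strip() or len(p) == 0 or p[0].tag != "span":
        return None
    marker = (p[0].text or "").strip()
    if re.fullmatch(r"\d+\.", marker):
        return "ol"
    if marker == "\u2022":
        return "ul"
    return None

def rebuild_lists(container):
    """Replace runs of marker paragraphs under container with real ol/ul lists."""
    rebuilt = []
    current = None
    for child in list(container):
        kind = marker_kind(child)
        if kind is None:
            current = None
            rebuilt.append(child)
            continue
        if current is None or current.tag != kind:
            current = ET.Element(kind)
            rebuilt.append(current)
        li = ET.SubElement(current, "li")
        li.text = child[0].tail      # the text that followed the marker span
        li.extend(child[1:])         # keep any further inline children
    container[:] = rebuilt
    return container
```

Calling `rebuild_lists` on the parsed div from the sample would leave the heading and the two introductory paragraphs alone and replace the six marker paragraphs with one ol and one ul.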

Tables

A little verbose markup-wise, but basically fine:

      <table class="s5" style="margin-left:0.0px;width:99.8%;border-collapse:collapse">
        <col style="width:33.3%"/>
        <col style="width:33.3%"/>
        <col style="width:33.3%"/>
        <tr style="height:25.0%">
          <td class="s8 s6 s7">
            <h2 class="s9">Reasons why tables are nice</h2>
          </td>
          <td class="s8 s6 s7">
            <h2 class="s9">Who feels this way</h2>
          </td>
          <td class="s8 s6 s7">
            <h2 class="s9">I can’t think of a third thing.</h2>
          </td>
        </tr>
        ....

Images, covers, and links

Creating an image is as easy as dragging it in. I’m not sure if it’s possible to add alt text to the image — I believe document creation tools should prompt users to add descriptive text by default.

Page of sample ebook in iBooks showing an image of a dog

      <p class="s2">
        <img src="images/droppedImage.png" alt="droppedImage.png" style=""/>
      </p>

Only images styled as “inline” will be exported; Pages will warn you that the image was discarded if it had a floating or fixed style. I tried to select a page with an inline image as the cover page but Pages gave me a warning that it was being discarded. Then it actually shows up in iBooks anyway.

It would be nice if the original filename were preserved (it was not “droppedImage.png”), and the empty style attribute should be discarded on output.
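Until the authoring tools prompt for descriptions, a quick check can at least flag images whose alt text is a placeholder. Here is a minimal Python sketch; the name `weak_alt_images` and the set of placeholder values are assumptions made for this example. An empty alt can be legitimate for purely decorative images, so the result is a list to review, not a list of errors.

```python
import posixpath
import xml.etree.ElementTree as ET

def weak_alt_images(xhtml):
    """Return the src of each img whose alt is missing, generic, or just its filename."""
    root = ET.fromstring(xhtml)
    flagged = []
    for el in root.iter():
        if el.tag.rpartition("}")[2] != "img":
            continue
        src = el.get("src", "")
        alt = (el.get("alt") or "").strip().lower()
        filename = posixpath.basename(src).lower()
        stem = posixpath.splitext(filename)[0]
        if not alt or alt in {filename, stem, "image"}:
            flagged.append(src)
    return flagged

print(weak_alt_images(
    '<p class="s2"><img src="images/droppedImage.png" alt="droppedImage.png" style=""/></p>'
))
```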

      <h1><span id="chapter-5-sh1"/>Chapter Five: Hyperlinks</h1>
      <p class="s2">This is an internal link to <a href="chapter-1.xhtml#b1"><span class="c1">chapter one</span></a>. This is an external link to <a href="http://placekitten.com/"><span class="c1">photos of kittens</span></a>.</p>

The empty span here is for the purpose of creating a back-link. A similar one was auto-added to Chapter One. Adding an internal hyperlink requires an initial step of creating a Pages “bookmark”, and then linking to that bookmark, which was a little confusing; I should be able to target any point in the document using the hyperlink feature.

I didn’t test HTML5 video output, but I’ve been told that video can be successfully embedded and output such that the video will work in iBooks (it will use HTML5 tagging).

Metadata

Both the OPF and the NCX were perfectly well-formed. The EPUB export dialog should optionally request richer metadata than the current list of author/title/subject, though.

The dreaded sample document

The EPUB export function is next to useless on large documents unless you start with the sample template, or import its styles later and tediously update yours to match. The EPUB styles are completely opaque: I have no idea why they have magical properties, or what I would do to my own styles to emulate them. The Pages file format is zipped XML, so it may be possible to inspect it directly (thanks Steve!).

The native header/paragraph/list styles in the blank template should output useful semantics in the XHTML. It is unacceptable to force users to import an external document to produce a half-decent EPUB file. At the very least, an EPUB-friendly template should be one of the default choices available when creating a new document.

Improvements

  • The list styles should generate lists. They should be ordered or unordered as appropriate to the style.
  • EPUBs should be importable as well as exportable. It’s understandable that they won’t magically re-constitute into the original Pages document, but a conversion pipeline is entirely reasonable.
  • It should be possible to export chunked EPUBs (with multiple XHTML chapters) without having to use the sample template.
  • It should be possible for a power user to understand how to create styles that will have specific behaviors.
  • It should be possible to customize the XHTML serialization (“I want the style named ’strong’ to output strong elements with the classname ‘foo’”).
  • There should be much more metadata allowable in the OPF file.
  • Images should require or at least prompt for alt attribute values.
  • Bold and italic buttons should output strong and em with the appropriate CSS styling in all cases. I would say this is actually true of any EPUB output tools — it’s unreasonable to ask users to create named styles (as in InDesign) when those tempting bold and italic buttons are available.

I don’t expect Windows/Linux versions of Pages to ever exist, which means that Pages will remain a marginal tool in the publishing ecosystem, but it’s perfectly adequate for an individual Mac-only user.

Download the sample EPUB file and sample Pages document (released under a Creative Commons Attribution license).

Subtitles in EPUB 3

by Keith Fahlgren

Update: I should have been clearer that the “bare” use of subtitle without referencing a scheme is really suboptimal. Using already-defined properties/definitions and referencing them explicitly (like the ONIX codelist example below) is a much better technique, but I did not come up with a compelling property for subtitle, hence the call for better options. See the comments below for a good discussion of those options.

There’s also a Bulgarian translation.

In a previous post, Thomas Rasche asked:

How are book subtitles best dealt with in Epub3? They could be included with a meta data property, but is there a recommended way so that a reader recognizes them, to display them sensibly (eg subtitles display in smaller font under the main title etc)?

EPUB 3’s metadata model is powerful and flexible. It is a big improvement over the limited metadata permissible in EPUB 2, but there are three real costs associated with EPUB 3’s metadata:

  1. Because of backward compatibility with EPUB, metadata in EPUB 3 duplicates at least three required metadata elements in both an old and new style: title, identifier, and language (probably creator too).
  2. Greater flexibility in metadata schemas means that EPUB 3 metadata sometimes seems quite verbose (but you’re generating your OPF using software with robust metadata stores anyway, so this isn’t much of a complaint, right?).
  3. Different metadata schemas describe the same thing with different words, so there are often many ways to accomplish the same thing and we don’t yet know what the convention will be.

That last issue of choosing a schema is the central issue in our subtitle problem. The EPUB Publications 3.0 specification provides a number of examples on how to include specific metadata but doesn’t exactly prescribe how a subtitle should be included. Instead, we’ll include two titles in a row, declare one as the subtitle, and indicate a display order.

Cover of Dreams from My Father: A Story of Race and Inheritance

Here’s the markup, using Dreams from My Father: A Story of Race and Inheritance as the example:

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="title" prefer="dcterms-title">Dreams from My Father</dc:title>
  <meta property="dcterms:title" id="dcterms-title">Dreams from My Father</meta>
  <meta about="#dcterms-title" property="display-seq">1</meta>

  <dc:title id="subtitle" prefer="dcterms-subtitle">A Story of Race and Inheritance</dc:title>
  <meta property="dcterms:title" id="dcterms-subtitle">A Story of Race and Inheritance</meta>
  <meta about="#dcterms-subtitle" property="title-type">subtitle</meta>
  <meta about="#dcterms-subtitle" property="display-seq">2</meta>
  …

What’s going on here?

In this markup, we do a range of things. We state both the title and subtitle using the legacy EPUB markup for backward compatibility and we do it in order, which is important for EPUB, but we also tell EPUB 3 reading systems to use the new syntax with the prefer attribute:

  <dc:title id="title"
            prefer="dcterms-title">Dreams from My Father</dc:title>
  …
  <dc:title id="subtitle"
            prefer="dcterms-subtitle">A Story of Race and Inheritance</dc:title>

We also include the titles using the new markup for EPUB 3:

  <meta property="dcterms:title"
        id="dcterms-title">Dreams from My Father</meta>
  …
  <meta property="dcterms:title"
        id="dcterms-subtitle">A Story of Race and Inheritance</meta>

Next we specify that the subtitle is, in fact, a title-type subtitle. We do this by adding another meta element that refers to the subtitle with the about attribute:

  <meta about="#dcterms-subtitle"
        property="title-type">subtitle</meta>

Finally, we specify the display order of the two titles with meta elements and display-seq:

  <meta about="#dcterms-title"
        property="display-seq">1</meta>
  …
  <meta about="#dcterms-subtitle"
        property="display-seq">2</meta>
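To show how a reading system might consume this markup, here is a minimal Python sketch that collects the dcterms:title entries, attaches their title-type, and sorts them by display-seq. It is only an illustration of the markup suggested in this post. The name `display_titles` and the `link_attr` parameter are invented, and titles without a display-seq are simply sorted last.

```python
import xml.etree.ElementTree as ET

def display_titles(metadata_xml, link_attr="about"):
    """Return (title_type, text) pairs for dcterms:title metas, in display order."""
    root = ET.fromstring(metadata_xml)
    metas = [m for m in root.iter() if m.tag.rpartition("}")[2] == "meta"]
    titles = []
    for m in metas:
        if m.get("property") != "dcterms:title":
            continue
        target = "#" + m.get("id", "")
        # Gather the metas that describe this title (title-type, display-seq).
        props = {r.get("property"): (r.text or "").strip()
                 for r in metas if r.get(link_attr) == target}
        seq = int(props["display-seq"]) if "display-seq" in props else float("inf")
        titles.append((seq, props.get("title-type"), m.text))
    titles.sort(key=lambda t: t[0])
    return [(kind, text) for _, kind, text in titles]

SAMPLE = """<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title id="title" prefer="dcterms-title">Dreams from My Father</dc:title>
  <meta property="dcterms:title" id="dcterms-title">Dreams from My Father</meta>
  <meta about="#dcterms-title" property="display-seq">1</meta>
  <dc:title id="subtitle" prefer="dcterms-subtitle">A Story of Race and Inheritance</dc:title>
  <meta property="dcterms:title" id="dcterms-subtitle">A Story of Race and Inheritance</meta>
  <meta about="#dcterms-subtitle" property="title-type">subtitle</meta>
  <meta about="#dcterms-subtitle" property="display-seq">2</meta>
</metadata>"""

for kind, text in display_titles(SAMPLE):
    print(kind, text)
```

How the subtitle is then rendered (smaller font, second line, and so on) is left entirely to the reading system.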

Other options

EPUB 3 is new enough that we don’t know which practices will become adopted by content creators and reading systems (so we don’t know about a smaller font, like the original question). The subtitle markup suggested above is a decent approach, but other options might be using http://open.vocab.org/terms/subtitle (not widely used in publishing today) or ONIX’s <Subtitle> element (very hard to reference cleanly).

If you’ve got a better suggestion for the markup above please leave a comment and I’ll update the post if a winner emerges.

Colons don’t make subtitles

While it may seem tempting to split title and subtitle on a :, don’t, ever: slide:ology.
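A tiny Python sketch shows why. The helper `naive_split` is hypothetical and exists only to demonstrate the failure: it behaves for the first title and mangles the second.

```python
def naive_split(full_title):
    """Split on the first colon: the tempting heuristic, shown here only to see it fail."""
    title, sep, rest = full_title.partition(":")
    subtitle = rest.strip() if sep else None
    return title.strip(), subtitle or None

for full in ("Dreams from My Father: A Story of Race and Inheritance", "slide:ology"):
    print(naive_split(full))
```

The second call reports a title of "slide" with a subtitle of "ology", which is exactly the kind of mistake explicit metadata avoids.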


If there are specific EPUB 3 questions you have that you’d like to see us write about, please leave a note in the comments—Thanks!


Creative Commons License
Subtitles in EPUB 3 by Keith Fahlgren is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.