Practical Interactivity and Shaping the Future of EPUB
by Keith Fahlgren
The IDPF kicked off the next revision of EPUB with two days of face-to-face meetings in New York last week. I came away from the (lively, well-attended) meetings feeling very optimistic about the work ahead of us, as there was a humbling range of backgrounds and experience present in the room. That said, many of the fourteen Industry Problems that the Working Group is chartered with ameliorating (not solving forever for everyone) will present real challenges.
With the work on those challenges just starting, I was pleased to see that Joseph Pearson, the creator of the Monocle EPUB reader, had taken the time to start writing his thoughts about the challenges of solving the interactivity Problem. Joseph raises three concerns surrounding the work on interactivity: a need for scripting security, the lack of a defined interaction model, and the danger of document modification by reading systems. While I think that he raises two real problems that will need pragmatic solutions (the first two), we can acknowledge both problems and still get started on defining & supporting more interactive content in EPUB.
In particular, it seems unnecessarily pessimistic to point to the current limitations of JavaScript’s security model as an insurmountable issue. Reading systems like iBooks, which hosts its own customized setup of the WebKit rendering engine, already enable a secure, sandboxed JavaScript execution environment. If the IDPF and the Working Group come up with concrete use cases and requirements around scripting and security, those concrete needs may be the motivation that browser-makers require to get started on aligning broader emerging web standards with the needs of ereaders, especially browser-based readers like Ibis Reader and Monocle. Existing work like Eli Grey’s JSandbox is promising in this regard.
As for the issue of defining the interface & events that an interactive EPUB can hook into: yes, we’ll need that. It’s clear that a basic interaction model should be included in the new specification, but that doesn’t seem like an unrealistic goal at this time.
I don’t understand Joseph’s concern with document modification (as a major problem). EPUB Reading Systems that enable interactivity will need to be careful about keeping the EPUB internals consistent enough to be usable. It will also be challenging to ensure that the content inside interactive elements is structured and provided in a way that keeps it accessible to all readers. With luck, work from the WAI-ARIA folks will help guide us.
I’d like to explicitly encourage Joseph and anyone interested in contributing to the future of EPUB (and digital reading in general) to join the IDPF and contribute to the Working Group. In particular, I’d love to hear more voices from people creating digital content, especially outside of North America.
Explicitly: This is purely my own opinion. Liza has her own more informed and interesting thoughts on the topic and she’ll be actively participating in the Interactivity Subgroup and the Working Group in general.
Comments
A very quick and interesting response, Keith.
Let me trade volleys on a few points:
Well like you, I am a web developer by trade. This experience has trained me out of waiting for anything from standards bodies. I don’t think this tiny niche, ereading, is or should be on the radar of any browser maker (although obviously I’m curious about the Editions/Chrome future).
More than that, I just don’t know what a compromise on JavaScript would look like. What would it look like? JSandbox is based on web workers, and web workers have no access to the DOM. I don’t see how it gets us anywhere.
As far as I can see, all of JavaScript is dangerous. “while (true) {}” is dangerous. “window.open()” is dangerous. At least with malformed EPUBs, HTML or CSS I can handle and route around problems.
When there’s a viable JS security solution, of course I’ll be at the head of the queue. But what do we do in the meantime?
Hmm. I had a very basic but functional reading system before I even picked up the EPUB spec. If that spec had prescribed an interaction model, I would have rolled my eyes and put it aside.
Decreeing even a basic interaction model would lock EPUB-compliant readers into the contemporary set of UI practices that human beings, potential readers of digital books, have so far mostly rejected. Innovation in ereader interaction models must be unhindered, because so far it’s all pretty bad.
There’s an alternative which I’d like to write up one day, that sits happily outside any spec, because it’s not EPUB-specific. That’s the idea of a programmable reading system.
I don’t want to discourage publisher innovation — at least, not much. But I would like to find the right place for it.
Good point. Some time ago I put it to my business partner that whenever we make our first $1000 from this ebooks lark, we might blow it all on an IDPF membership. Or an iPad. We’ll see, I guess — so far she tells me we haven’t made a cent. And the IDPF appropriately takes a dim view of hobbyists. Not to worry, one can good-naturedly snipe from the sidelines!
Joseph,
I’d like to see the IDPF dues scale be more accommodating to small businesses, and will be working on that in my capacity as a Board member. No promises though!
joseph said:
> I had a very basic but functional reading system
> before I even picked up the EPUB spec.
> If that spec had prescribed an interaction model,
> I would have rolled my eyes and put it aside.
bingo. that’s a real programmer talking there…
-bowerbird
You’re right, of course, that we need to be realistic about exactly how dependent EPUB 2.1 should be on the innovations of browser manufacturers. I am personally quite curious how much cross-pollination there will be between iBooks & WebKit and Google Editions & WebKit. The Apple folks at the meetings in New York were quite clear that they were prepared to communicate with the WebKit team directly about the ongoing EPUB work (no promises, of course). I’m not expecting too much, but it is a better channel than any other ereader maker has ever had….
I was only trying to suggest that there’s current experimentation
I searched through a few technical thesauri before settling on “interaction model,” but it doesn’t really capture what we’re both thinking of. There haven’t yet been any concrete proposals on this topic, so I can only go off of the internal discussions Liza and I have had and the casual discussions with others.
A defined set of events for hooking onto?
A vague notion of some sort of interface (in the least? repellent sense?)?
An interface sounds distinctly repellent… but yes, I see what you mean.
You know, I’m not sure our thinking is all that dissimilar. My idea of a programmable reader is this, in short:
* no scripting whatsoever in the document
* semantic markup conventions for interactive structures/content – a table, a footnote, a caption, a glossary, a book-specific term, a blank value to be filled, etc
* reading systems that can recognise these conventions and render them in the best way possible to the user — within the constraints of their own interaction model
* ideally, reading systems that can be extended to recognise conventions local to a particular imprint, or even a particular book
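To make the four points above a little more concrete, here is a minimal sketch in TypeScript. Everything in it is an assumption made for illustration: the `Fragment` shape, the `ReadingSystem` class, and the convention names `footnote` and `imprint-character` do not come from any spec or from Monocle. It only shows the idea that the document carries inert markup, each reading system decides how to present a convention, and unknown conventions fall back to plain text.

```typescript
// Hypothetical model of a marked-up fragment: a convention name plus its text.
interface Fragment {
  convention: string; // e.g. "footnote"; these names are illustrative assumptions
  text: string;
}

type Renderer = (fragment: Fragment) => string;

class ReadingSystem {
  name: string;
  private renderers = new Map<string, Renderer>();

  constructor(name: string) {
    this.name = name;
  }

  // Register (or override) how this reading system presents a convention.
  // This is also the extension point for imprint- or book-specific conventions.
  register(convention: string, renderer: Renderer): void {
    this.renderers.set(convention, renderer);
  }

  // Unknown conventions degrade to the plain text; nothing from the book is executed.
  render(fragment: Fragment): string {
    const renderer = this.renderers.get(fragment.convention);
    return renderer ? renderer(fragment) : fragment.text;
  }
}

const tablet = new ReadingSystem("tablet");
tablet.register("footnote", (f) => `[popover] ${f.text}`);
tablet.register("imprint-character", (f) => `[character card] ${f.text}`);

const basic = new ReadingSystem("basic"); // registers nothing at all

const note: Fragment = { convention: "footnote", text: "See chapter 2." };
console.log(tablet.render(note));
console.log(basic.render(note));
```

The design choice worth noticing is that the book never supplies behaviour, only labels; all behaviour lives in the reading system.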
If this was 2005, we might be using language like “microformats” and “unobtrusive”. Indeed, imagine if a publisher marked up a reference to Central Park, NYC with the Geo[1] microformat. Ibis Reader for iPad might choose to display a Google map within a popover. Ibis Reader for iPhone might display a full-screen Google map when the reference is tapped. iBooks might use, I dunno, Yahoo maps or something. Kobo Reader might show nothing more than the text itself.
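A rough sketch of the Central Park case, in TypeScript. The `geo`, `latitude` and `longitude` class names are the ones the Geo microformat defines; the coordinates are approximate, and the `extractGeo` and `present` functions, the `Device` names, and the regex-based parsing are all assumptions made to keep the example self-contained. A real reading system would walk the parsed DOM rather than match strings.

```typescript
interface GeoPoint {
  latitude: number;
  longitude: number;
}

// Naive extraction for illustration only; not a robust microformat parser.
function extractGeo(markup: string): GeoPoint | null {
  const lat = /class="latitude"[^>]*>\s*(-?\d+(?:\.\d+)?)\s*</.exec(markup);
  const lon = /class="longitude"[^>]*>\s*(-?\d+(?:\.\d+)?)\s*</.exec(markup);
  if (!lat || !lon) return null;
  return { latitude: Number(lat[1]), longitude: Number(lon[1]) };
}

// Hypothetical device classes standing in for different reading systems.
type Device = "tablet" | "phone" | "plain";

// Each reading system decides for itself what to do with the same markup.
function present(markup: string, device: Device): string {
  const point = extractGeo(markup);
  if (!point || device === "plain") return "text only";
  const where = `${point.latitude},${point.longitude}`;
  return device === "tablet"
    ? `popover map at ${where}`
    : `full-screen map at ${where}`;
}

// Approximate coordinates for Central Park, used purely as sample data.
const sample =
  'a walk in <span class="geo">Central Park (' +
  '<span class="latitude">40.78</span>, ' +
  '<span class="longitude">-73.97</span>)</span>';

console.log(present(sample, "tablet"));
console.log(present(sample, "phone"));
console.log(present(sample, "plain"));
```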
Or differently: imagine if Tor took the clown-like Kevin Rose[2] seriously regarding his “Character Info” proposal. They could develop their own little microformat for that. Then they could embed an instance of Monocle behind a paywall, and write a few lines of code to inspect the DOM of components on load, adding click event handlers to character names in the content.
Other reading systems might provide specific support for the Tor Character Info markup, or they might provide a plugin system, or they might just ignore it. But they control their security, their interaction models and they retain the freedom to modify the DOM as much or as little as they like.
My best example of this, I think, is for tables of contents. In one fell swoop you’d address much of the unwieldy NCX guff and the duplicate ToCs problem.
The value of this is that EPUB remains what it should be: a longevous data format. Components are not executable vessels. Instead, they are quietly marked up using conventions that provide opportunities for now and future reading systems to imaginatively deform them.
[1]: http://microformats.org/wiki/geo
[2]: http://www.youtube.com/watch?v=odQfE48wM_M (Actually I’m still trying to get to the end of that video. Keep pausing it to chuckle or decry.)
Note: I’m not speaking here in my usual online role as an interactive media and ebook researcher + enthusiast but from the perspective of my dayjob, where I work for an antivirus/anti-malware software company. Not speaking as a company representative but as somebody who’s been spending quite some time catching up on what our researchers have been writing on javascript.
There has been a lot of research, a lot of talk and a lot of thinking on the basic insecurities of javascript, namely that the problem doesn’t really lie in the javascript but in the basic, fundamental definition of the DOM.
You can’t make a user safe from js that has full access to the DOM. This isn’t as much of a problem online as you’d think, because rolling out updates to a site and to the client is simple, but it is a major issue with something like an ebook.
There are only three ways of tackling the insecurity of javascript:
1. Sandbox and limit access to the DOM (jsandbox).
2. Enforce a safe subset of the language (caja and ADsafe).
3. Ban all network access of any kind (use js purely for developing client-side behaviour and interactivity).
There aren’t any other ways of using js safely, especially when it comes to persistently installed software like ebooks.
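As a sketch of what the first option could look like from the reading system's side, here is a TypeScript outline in which a book script is handed a narrow object instead of the DOM. The `BookApi` interface, `runBookScript`, and the method names are hypothetical. Note the limits: calling a function in the same context, as this does, is not isolation, since the script could still reach globals and a `while (true) {}` would still hang the reader. Real enforcement would need a separate execution context, such as the web workers that JSandbox builds on. The sketch shows only the shape of the restricted surface.

```typescript
// The only surface a book script is handed (all names are hypothetical).
interface BookApi {
  showNote(text: string): void;
  getTerm(id: string): string | undefined;
}

type BookScript = (api: BookApi) => void;

// Runs a script against the restricted API and returns the notes it asked to show.
// No DOM node and no network function is ever passed in.
function runBookScript(script: BookScript, terms: Map<string, string>): string[] {
  const shown: string[] = [];
  const api: BookApi = Object.freeze({
    showNote: (text: string) => {
      shown.push(String(text));
    },
    getTerm: (id: string) => terms.get(id),
  });
  try {
    script(api);
  } catch {
    // A failing book script should not take the reading system down with it.
  }
  return shown;
}

const terms = new Map<string, string>([["epub", "an ebook format"]]);
const notes = runBookScript((api) => {
  api.showNote(`EPUB: ${api.getTerm("epub")}`);
}, terms);
console.log(notes);
```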
What are the prospects for the EPUB standard shifting from XHTML to HTML 5, an emerging standard itself?
It seems fairly likely that EPUB 3.0 will include at least some amount of what is in the (huge) HTML5 spec, but it’s not clear exactly how much. Part of that will depend on how much HTML5 matures before EPUB 3.0 needs to be finalized late this year.