epubcheck service updated to epubcheck 1.0.3

Wednesday, November 26th, 2008

This is a significant update to the threepress.org epubcheck validation service: flaws in previous versions of Adobe’s epubcheck left several critical file types completely unvalidated.

I recommend that anyone relying on epubcheck, whether as a standalone library or through the web form, revalidate at least a subset of their ePubs.

tei2epub 1.0 release candidate, and docbook2epub released

Monday, November 17th, 2008

Two updates to the epub-tools Python code libraries:

  1. tei2epub has been updated to version 1.0b2 as a release candidate.
  2. docbook2epub has been released.

The two libraries share common code which is automatically included in any ZIP bundle to handle general ePub tasks, including packing the ZIP file correctly. This required some significant updates to tei2epub, which no longer includes the epubcheck Java library. If the library is downloaded separately then both applications will perform validation after the ePub is built.
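For readers wondering what "packing the ZIP file correctly" involves, here is a minimal sketch using only Python's standard library. It is not the epub-tools code. The function name `pack_epub` and the sample file contents are made up for illustration. The one rule it demonstrates comes from the ePub container format: the `mimetype` entry has to be the first file in the archive and has to be stored uncompressed.

```python
import io
import zipfile

MIMETYPE = "application/epub+zip"


def pack_epub(files, out):
    """Write `files` (archive path -> bytes) to `out` as an ePub ZIP.

    `pack_epub` is a hypothetical helper for illustration only;
    it is not part of epub-tools.
    """
    with zipfile.ZipFile(out, "w") as zf:
        # The mimetype entry goes first and is stored, not deflated.
        zf.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            if name == "mimetype":
                continue  # already written above
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


# Placeholder contents; a real ePub needs valid container and package files.
sample = {
    "META-INF/container.xml": b"<container/>",
    "OEBPS/content.opf": b"<package/>",
}
buf = io.BytesIO()
pack_epub(sample, buf)

with zipfile.ZipFile(buf) as zf:
    entries = zf.infolist()
    first_name = entries[0].filename
    first_is_stored = entries[0].compress_type == zipfile.ZIP_STORED
    mimetype_text = zf.read("mimetype").decode("ascii")
```

Running general-purpose ZIP tools over a directory usually compresses every file and orders them arbitrarily, which is why a dedicated packing step is worth sharing between the two libraries.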

docbook2epub is very similar to the db2epub Ruby application which is included with the DocBook XSL. It doesn’t offer any significant features over db2epub; it just happens to be in Python.

Threepress is now an IDPF member

Friday, November 14th, 2008

I’m very pleased to announce that Threepress Consulting Inc. is now an official member of the International Digital Publishing Forum.

I look forward to supporting and participating in the further development of the ePub standard.

Slides from “What publishers need to know about digitization”

Thursday, November 13th, 2008

O’Reilly Media will be posting a complete recording of the presentation, but in the meantime I’ve posted the slides from the webcast “What publishers need to know about digitization” on Slideshare.

Thanks to everyone who attended and especially to those who asked so many excellent questions.

ePub production growing fast

Friday, November 7th, 2008

More extrapolation from usage statistics on the threepress.org ePub validation service, which uses Adobe’s epubcheck:

This report, current as of today, tracks visits to the validator. Blue represents visits in the current month; green is the comparison with the previous month.

When I segment by country I get some interesting new results:

India: +115%
US: +66%
Russia: +2,500%
UK: +52%
Canada: +155%
Germany: -25%
New Zealand: +100%
Ukraine: +500%
France: -25%
Philippines: +100%

I’ve highlighted those countries which are essentially new to the report. (The absolute numbers here are still very low by web traffic standards, so a 2,500% gain is not as huge as it sounds.)
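To make the point about small absolute numbers concrete, here is a short sketch of the month-over-month arithmetic. The visit counts are invented for illustration and are not taken from the actual report; `percent_change` is just a helper name chosen for this example.

```python
def percent_change(previous, current):
    """Return the month-over-month change as a percentage."""
    if previous == 0:
        raise ValueError("percent change is undefined when the previous value is 0")
    return (current - previous) * 100 / previous


# Hypothetical counts: a tiny base makes a modest gain look enormous.
small_base = percent_change(2, 52)        # 50 extra visits
large_base = percent_change(2000, 3320)   # 1,320 extra visits

print(f"{small_base:+,.0f}%")  # +2,500%
print(f"{large_base:+,.0f}%")  # +66%
```

Under these made-up figures the "+2,500%" country added 50 visits while the "+66%" country added 1,320, so the percentages are best read alongside the raw counts.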

India, of course, continues to be the major user of the validator. Nevertheless, it’s nice to see ePubs starting to come out of new places.

Bookworm now has full-text search and DTBook support

Wednesday, November 5th, 2008

ePubs added to Bookworm are now fully searchable.

When you add a book to your library, its text is automatically scanned and indexed in the correct language. You can search across all of your books from anywhere in the site.

Bookworm search

Results are returned in relevance order. Bookworm supports many advanced search features, such as stemming and boolean operators, through the use of the Xapian open-source search engine.

Bookworm results

More about Bookworm’s full-text ePub search.

DTBook support

ePubs that use DTBook rather than XHTML can now be viewed and searched just like XHTML ePubs. DTBook ePubs are automatically converted to XHTML using the DAISY pipeline. The original ePub can always be downloaded with its DTBook content intact.

More about Bookworm’s ePub support.