Threepress Consulting blog

Threepress creates software for publishers, educators and authors.

Category: tools

EPUBGen from Adobe now a part of epub-tools

Want to convert from Word to ePub? These tools aren’t a magic bullet, but they should be helpful to digital publishing developers.
Paul Norton from Adobe has contributed a new software suite to the epub-tools project hosted on Google Code. The Java library provides tools to convert from an impressive number of formats: [...]

docbook2epub 1.0.1 released

Two updates to the docbook2epub tool that converts DocBook into ePub:

docbook2epub is now installable with setuptools.
HTML entities in DocBook source are now supported.

Direct download of version 1.0.1.
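
The post does not show how docbook2epub handles HTML entities, so the following is only a sketch of one common approach, not the tool's actual code: rewrite named HTML entities as numeric character references before parsing, since an XML parser without a DTD rejects names like `&eacute;`. The function name `html_entities_to_numeric` and the regular expression are illustrative assumptions.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities predefined by XML itself are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_entities_to_numeric(text):
    """Replace named HTML entities with numeric character references.

    Hypothetical helper: unknown names and XML's own entities pass through.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#{};".format(name2codepoint[name])

    return ENTITY_RE.sub(replace, text)


if __name__ == "__main__":
    cleaned = html_entities_to_numeric("<para>caf&eacute;</para>")
    # The cleaned string now parses with a plain XML parser.
    print(ET.fromstring(cleaned).text)
```

This is a plain text substitution, so it would also rewrite entity-like strings inside CDATA sections; a real converter may need to be more careful.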

Introducing epubjs

To celebrate TOC, I’m announcing an early prototype of epubjs: a pure JavaScript ePub reader. The entire application is only 11K (plus 53K for jQuery 1.3).
This is a pretty rough release, still very messy, but I’m hoping to evolve it into a lightweight reader that authors or publishers could add to their websites with minimal [...]

epubcheck service updated to epubcheck 1.0.3

This is actually a significant update to the threepress.org epubcheck validation service: flaws in previous versions of Adobe’s epubcheck left several critical file types completely unvalidated.
I recommend that anyone relying on epubcheck, whether as a standalone library or through the web form, revalidate at least a subset of their ePubs.

tei2epub 1.0 release candidate, and docbook2epub released

Two updates to the epub-tools Python code libraries:

tei2epub has been updated to version 1.0b2 as a release candidate.
docbook2epub has been released.

The two libraries share common code which is automatically included in any ZIP bundle to handle general ePub tasks, including packing the ZIP file correctly. This required some significant updates to tei2epub, which no longer [...]
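
As an illustration of what "packing the ZIP file correctly" involves, here is a minimal sketch using Python's standard `zipfile` module. It is not the shared code from tei2epub or docbook2epub. It relies on the ePub container rule that the `mimetype` entry comes first in the archive and is stored uncompressed; the function name `pack_epub` and the sample file contents are hypothetical.

```python
import io
import zipfile


def pack_epub(files, out):
    """Write an ePub container to `out` (a path or file-like object).

    `files` maps archive paths to bytes, e.g. "META-INF/container.xml".
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(out, "w") as zf:
        # The mimetype entry is written first and stored without compression.
        zf.writestr(
            zipfile.ZipInfo("mimetype"),
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else can be compressed normally.
        for name, data in files.items():
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)


if __name__ == "__main__":
    buffer = io.BytesIO()
    pack_epub({"META-INF/container.xml": b"<container/>"}, buffer)
    print(zipfile.ZipFile(buffer).namelist())
```

A file packed this way still needs valid contents (container.xml, the OPF package file, and so on) to pass epubcheck; the sketch covers only the archive layout.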

Python and XML (and Google!) in publishing applications

IBM developerWorks has just released an article of mine on High-Performance XML Parsing in Python. Although there is nothing publishing-centric about the article itself, it was based on my own experience in dealing with large XML datasets in academic publishing.

Massive XML files are uncommon in the general web development world, where the primary roles of [...]
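
The article's own code is not reproduced in this excerpt. As a general illustration of handling large XML files, here is a minimal sketch of incremental parsing with the standard library's `xml.etree.ElementTree.iterparse`, which is an assumption on my part and may differ from the library and techniques the article covers. The function name `count_elements` and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Count `tag` elements in `source` without keeping their content."""
    count = 0
    # "end" events fire once an element and its children are fully parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            # Drop the element's children, text and attributes once handled.
            elem.clear()
    return count


if __name__ == "__main__":
    sample = b"<records><record><title>A</title></record></records>"
    print(count_elements(io.BytesIO(sample), "record"))
```

Clearing each element after use keeps memory growth small, although the emptied elements stay attached to their parent until parsing finishes.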

TEI + Python + lxml + Dutch = Corpus Toneelkritiek Interbellum

I was pleased to be able to assist with the Corpus Toneelkritiek Interbellum project, which allows reading, browsing and searching of early 20th-century Dutch theater reviews. I can’t read Dutch, but Google’s automated translation tells me that the review of Hamlet mentions a “long modern clown,” which sounds disturbing enough that I’ll leave the [...]

epubcheck service now updated to use version 1.0.0

The web-based method to validate ePub files against epubcheck has been updated to use epubcheck version 1.0.0, the first official release.
Additionally, the tei2epub library, which bundles epubcheck, has also been updated to include 1.0.0.

Better technical book reviewing with Subversion

I just received my copy of Python for Unix and Linux System Administration by Noah Gift and Jeremy Jones, for which I was a technical reviewer. I’ve done several tech reviews for O’Reilly in the past, on both Python and CSS, and the least enjoyable part of the process has been the actual method [...]

Updates to epubcheck web service

The web-based method to validate ePub files against epubcheck has been updated in a number of ways:

The underlying version of epubcheck has been upgraded to 1.0 RC.
The timeout for file uploads has been increased to 5 minutes, to allow processing of larger files.
A bug has been fixed which prevented the service from displaying [...]