Threepress Consulting blog

Threepress creates software for publishers, educators and authors.

Tag: tei

The real Internet Archive

My attention was caught by this quote from Clay Shirky on the excellent ReadWriteWeb blog:
“Back in 1974, when the Internet was a fraction of what it is now, the acorn to an oak, there were really only two applications,” said Shirky, “Telnet, and FTP.”
Surely he’s wrong, I thought.  Those protocols aren’t that old.
But I was [...]

TEI + Python + lxml + Dutch = Corpus Toneelkritiek Interbellum

I was pleased to be able to assist with the Corpus Toneelkritiek Interbellum project, which allows reading, browsing and searching of early 20th-century Dutch theater reviews. I can’t read Dutch, but Google’s automated translation tells me that the review of Hamlet mentions a “long modern clown,” which sounds disturbing enough that I’ll leave the [...]

Seven new books added

The last set of Gutenberg HTML books planned for demonstration on threepress has been added.  As usual, data-loading took more time and uncovered more problems than expected, which is always a reason to add as many samples as possible.  This set includes one non-fiction book (On the Origin of Species) and one [...]

Convert TEI to epub

The most useful standalone tool in threepress right now is tei2epub, which the system uses to convert its internal source XML to the emerging e-book standard format epub.
TEI is the Text Encoding Initiative, and is one of the most popular markup formats for printed works (especially in academia). All of the content on threepress [...]
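
To give a feel for the kind of transformation involved, here is a minimal sketch of turning TEI chapters into XHTML pages, which is the content format an epub package wraps. It is not the tei2epub code: the function name, the sample document and the assumption that each chapter is a top-level div with a head and plain p children are all made up for illustration. It uses the standard library's ElementTree so it runs anywhere; lxml's etree offers a largely compatible API.

```python
import xml.etree.ElementTree as ET

# TEI P5 documents use this namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample input, invented for this sketch.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Work</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter">
        <head>Chapter 1</head>
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </div>
    </body>
  </text>
</TEI>"""


def tei_chapters_to_xhtml(tei_source):
    """Return one XHTML string per top-level <div> in the TEI body.

    Hypothetical helper: assumes each div has a <head> and <p> children.
    """
    root = ET.fromstring(tei_source)
    title = root.findtext(
        ".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS
    )
    pages = []
    for div in root.iterfind(".//tei:body/tei:div", TEI_NS):
        html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
        head = ET.SubElement(html, "head")
        ET.SubElement(head, "title").text = title
        body = ET.SubElement(html, "body")
        chapter_title = div.findtext("tei:head", default="", namespaces=TEI_NS)
        ET.SubElement(body, "h1").text = chapter_title
        for p in div.iterfind("tei:p", TEI_NS):
            # itertext() flattens any inline markup inside the paragraph.
            ET.SubElement(body, "p").text = "".join(p.itertext())
        pages.append(ET.tostring(html, encoding="unicode"))
    return pages


pages = tei_chapters_to_xhtml(SAMPLE_TEI)
```

A real converter would also need to write the epub container itself (the zip archive, the package file and the table of contents), and to map inline TEI elements to XHTML rather than flattening them.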