Building a better web-based book: talk at TOC

Wednesday, December 10th, 2008

I’ll be presenting as part of a panel at the 2009 Tools of Change conference:

CJ Rayhill (Safari Books Online), Travis Alber (BookGlutton), Aaron Miller (BookGlutton.com), Liza Daly (Threepress Consulting Inc.), Ben Vershbow (Institute for the Future of the Book), Dave Gray (XPLANE)

Developing a digital reading interface raises social, aesthetic and technical challenges. Panelists in this session will talk about interface development (web-based vs. client-based), technical decisions, community requirements and intellectual property issues.

Because it’s a panel, I’m not sure yet where the discussion will go, but I think the mix of designers and technologists should make for a good talk.

Slides from “What publishers need to know about digitization”

Thursday, November 13th, 2008

O’Reilly Media will be posting a complete recording of the presentation, but in the meantime I’ve posted the slides from the webcast, “What publishers need to know about digitization,” on Slideshare.

Thanks to everyone who attended and especially to those who asked so many excellent questions.

The analog hole, and a seminar on digitization

Thursday, October 23rd, 2008

Over on Tools of Change there’s a post of mine discussing the so-called “analog hole” as it applies to digital books. It was a fun article to write, especially the hands-on part. I used Google’s OCRopus open-source OCR software, which was a little impenetrable to someone outside of the machine-learning community but did a good job once I fumbled around with it for a while.

Also on that page at the moment is a giant photo of my head advertising What Publishers Need to Know About Digitization, a web seminar I’ll be hosting with O’Reilly Media on November 12. It will be a very high-level, introductory overview aimed at non-technical staff in publishing who are considering a digitization project.

Coming full circle, I wonder if there would be interest in a simple web-based OCR service where publishers could upload a scanned document to see how well bare-bones OCR performs on an image-only PDF or JPEG scan. I imagine it might help them predict the complexity of a digitization project and understand some of the challenges inherent in the process.