Better technical book reviewing with Subversion

Friday, August 29th, 2008

I just received my copy of Python for Unix and Linux System Administration by Noah Gift and Jeremy Jones, for which I was a technical reviewer. I’ve done several tech reviews for O’Reilly in the past, on both Python and CSS, and the least enjoyable part of the process has been the actual method of providing feedback.

At my previous job we routinely used Word (or OpenOffice) with Track Changes for collaborative editing, and as Word-based tools go I felt that worked well. For whatever reason, though, most of the pre-release books I’ve received have been in PDF, which is limiting in several ways:

  1. Cut and paste from PDF, especially of source code, often does not work properly. To test the code, a technical reviewer needs to reproduce exactly what is in the book.
  2. There is no way to comment in-line on particular words or phrases.
  3. The copy of the text I’m reading may be days or even weeks out of date, depending on when the author did the PDF conversion.

Python for Unix and Linux System Administration was different: the authors elected to use the source code control system Subversion to manage the writing. The text was composed in DocBook XML rather than Word or some other non-text format. While I’m sure this was done entirely to facilitate collaboration between the authors, it had the downstream effect of making it supremely easy for me to review:

  1. Code samples were in plain text, and if they were formatted incorrectly, that was useful feedback to be able to give (especially in a language that is sensitive to whitespace, as Python is).
  2. While I was told I would be able to “commit” my changes back to the authors inside of the source text, I still chose to use an external file to provide my comments.  I did this only because I wasn’t sure that the authors would be able to manage multiple commits coming in from technical reviewers, and because we hadn’t decided on a common tagging framework.  With more editorial guidance, being able to commit my comments directly into the source could be very useful (including the ability to potentially see other reviewers’ comments, and avoid repeating myself).
  3. Each time I went to work on the book, I was able to get a fresh copy of the text.  I didn’t go back and re-check old sections, but it did mean that any section I worked on was always up-to-date.
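
Having the code samples as plain text inside DocBook also means a reviewer could check them mechanically. What follows is only a minimal sketch of that idea, not something I or the authors actually used: it assumes the samples live in un-namespaced `<programlisting>` elements, and the `check_listings` helper and the sample chapter are made up for illustration. It only tests each listing against the Python interpreter running the script, so samples written for a different Python version may be flagged incorrectly.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment: the second listing has lost its indentation.
SAMPLE = """<chapter>
<programlisting>def greet(name):
    return "Hello, " + name
</programlisting>
<programlisting>def broken(name):
return name
</programlisting>
</chapter>"""


def check_listings(docbook_xml):
    """Return (listing number, error description) pairs for listings that fail to compile."""
    root = ET.fromstring(docbook_xml)
    problems = []
    for index, listing in enumerate(root.iter("programlisting"), start=1):
        # Join all text inside the element, including text inside inline markup.
        source = "".join(listing.itertext())
        try:
            compile(source, "<listing %d>" % index, "exec")
        except SyntaxError as exc:
            # IndentationError is a subclass of SyntaxError, so whitespace
            # problems are reported here as well.
            problems.append((index, "%s: %s" % (type(exc).__name__, exc.msg)))
    return problems


if __name__ == "__main__":
    for number, message in check_listings(SAMPLE):
        print("Listing %d: %s" % (number, message))
```

Compiling a listing does not run it, so this catches formatting and syntax slips but says nothing about whether the code does what the text claims.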

When used with friendly front-end software like TortoiseSVN, Subversion isn’t even very difficult.  It’s certainly no more arcane than many professional content management systems.  Although it works best when managing text content (which could be Office-supported XML formats), it would still provide value with binary formats. It’s worth considering for any publisher that has to manage multiple, distributed editors or authors and wants to improve the process using entirely free software.

For more on the subject, Rachel Greenham has a nice tutorial explaining how authors can use Subversion with OS X. The definitive word is the Subversion book.

(I highly recommend Python for Unix and Linux System Administration as well, even for Python programmers who aren’t system administrators. It collects an impressive breadth of information in one place and showed me how to automate processes I hadn’t even realized needed automating.)

On TOC: Read anything on the Kindle

Tuesday, August 26th, 2008

As part of an on-going series on exploring the hidden corners of the Kindle, a post on using an undocumented image browsing feature to read complex PDFs or image-based documents:  How to Read any Type of File on the Kindle (Almost).

Of course, going from text to scanned images is exactly backwards from the way things ought to be.

The lazy, social, anti-DRM pattern for digital books

Saturday, August 23rd, 2008

It’s 2am, and I’ve just finished a great novel.  My significant other went to sleep hours ago. My best friend, who lives across the country, would love this book, so I make a mental note to tell him about it.  If we talk in a day or so I might remember, and it’s possible he’ll be intrigued enough to buy it.  If not, maybe in five months I’ll get it for his birthday, if something else hasn’t come along by then.

In all likelihood, this was a one-shot deal: I bought a book, I liked it, I’ll tell a few people, and at best there might be one additional sale.

Now imagine it’s 2am and I’ve read this book on my second-generation networked digital reader, maybe the Kindle 2.0.  As soon as I’ve finished the book, the device prompts me to rate it (4 stars!).  It also knows about my social connections.  It asks me if I’d recommend it to my friend, who has enjoyed similar books, and I say yes.

The next morning my friend wakes up and picks up his e-reader.  There’s a recommendation from me — and a 20% discount to purchase this book immediately. This $5 digital book is now just four bucks, and it’s instantly on his device.

Once he accepts, I get 20% off my next purchase too, and a “karma point” in my profile for a successful recommendation.

Social

People overwhelmingly buy books based on personal recommendations.  Reading is normally a solitary activity; the only way to share the experience of a book is to urge friends to read it too.  It’s curious that Amazon.com has hardly any social component, whereas Netflix (which loses money every time I rent a movie) has a very useful but underpromoted “Friends” area. I rent movies directly off my friends’ queues all the time, but I still buy books from Amazon after speaking with someone or reading anonymous reviews.

The combination of social networking and instant media transmission on devices like the Kindle can revolutionize this experience, by motivating readers at the moment they’ve read the book, and pushing high-value content directly at other consumers.

(Social patterns do not need to be two-way. Twitter has established the convention that people can “follow” others without the expectation of being “friended” back.  So while I might “friend” people I know, I may also want to “follow” the reading habits of favorite authors, or books promoted on The Daily Show, or books disproportionately read by people in my geographic community.)

Anti-DRM

I call this an “anti-DRM” pattern because DRM is unnecessary here. Libraries are full of free books and yet books are still purchased. A lot of that is convenience. The more convenient a service is, the more value it has. Even if it were possible for me to grab that digital book off the device and email it to my friend for free, would I bother? Most likely I’d forget before I ever got around to it. My own discount is a nice bonus, but the primary motivator would be the desire to share the experience combined with negligible personal effort.

Lazy

And let’s suppose that people did send around free digital books.  If I didn’t have an e-ink reader, what would I do with them?  After I got a few freebies from friends I’d probably go buy a Kindle, and then that seductive “share this book” button would take hold.  The existence of some free books is an incentive to move up to a specialized device.  They create the necessary ecosystem and will ultimately motivate, not destroy, publishing sales.

High-volume readers are not the same demographic as high-volume music consumers.  They are older, they are well-educated, they have better things to do with their time than email free books.  (Not to mention that most readers probably know a writer; few teenagers know a rock star.)  Nearly everyone who gets a Kindle says that they make more purchases, and the current Kindle store is technologically and psychologically primitive.  To compete in a networked world, digital books need to come alive, and enlist readers to promote them.

Bookworm feature update: remember where I left off

Thursday, August 14th, 2008

Bookworm will now remember and display the last-read chapter of each book, allowing you to jump right to where you finished reading. This feature applies to both the web and mobile versions of the site.

In addition, a new setting in the Profile page allows you to configure the site to always link the book’s title in this list to the last-read page.  This is especially useful when using the mobile version.


Note that, per the ePub specification, opening a new book will always go to the initial page as defined in the ebook’s OPF file.
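
For readers curious about what “the initial page as defined in the OPF file” can look like in practice, here is a minimal sketch. It is not Bookworm’s actual code: it assumes the initial page is simply the first `itemref` in the OPF spine, resolved through the manifest, and the `first_spine_href` helper and sample package document are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by OPF 2.0 package documents.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical, stripped-down OPF file (metadata omitted for brevity).
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch1" href="chapter1.html" media-type="application/xhtml+xml"/>
    <item id="cover" href="cover.html" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="cover"/>
    <itemref idref="ch1"/>
  </spine>
</package>"""


def first_spine_href(opf_xml):
    """Return the href of the first spine item, or None if it cannot be resolved."""
    root = ET.fromstring(opf_xml)
    # Map each manifest item id to the file it points at.
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    # The spine lists item ids in reading order; take the first entry.
    first = root.find("opf:spine/opf:itemref", OPF_NS)
    if first is None:
        return None
    return hrefs.get(first.get("idref"))


if __name__ == "__main__":
    print(first_spine_href(SAMPLE_OPF))
```

Note that the manifest order is irrelevant here; only the spine decides which document comes first.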

Comments or suggestions for improvements on this and other Bookworm features are always welcome.

Recent posts to the O’Reilly TOC blog

Wednesday, August 13th, 2008

On the O’Reilly Tools of Change blog recently:

  1. Processing the deep backlist at the New York Times, a report from OSCON
  2. Optimizing web content for the Kindle, using Bookworm screenshots

The latter is part of a series of Kindle articles that I’ll be putting out in the coming weeks, including those on getting inside the device’s operating system (based on Igor Skochinsky’s amazing work).

(You can also read earlier posts by me on TOC.)

I’m also happy to announce that I will be on this year’s TOC Conference program committee.  Proposals for the 2009 conference are due August 25th.

Updates to epubcheck web service

Friday, August 8th, 2008

The web-based service for validating ePub files with epubcheck has been updated in a number of ways:

  1. The underlying version of epubcheck has been upgraded to 1.0 RC
  2. The timeout for large file uploads has been increased to 5 minutes, to allow for processing of larger files
  3. A bug has been fixed which prevented the service from displaying large numbers of errors

The tei2epub library, which bundles epubcheck, has also been updated to include 1.0 RC.

Two likely future developments:

  1. A RESTful web service to allow documents to be validated by other programs
  2. A native Python port of epubcheck, to allow it to be directly embedded in other Python applications rather than calling out to Java
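
To make “calling out to Java” concrete, here is a rough sketch of how a Python program might shell out to epubcheck today. It is not the code behind the web service: the helper names and the jar path are placeholders, the 300-second default simply echoes the five-minute figure above, and I make no assumption about what epubcheck’s exit code means, so the function just hands back the exit code and the raw output.

```python
import subprocess


def build_epubcheck_command(jar_path, epub_path):
    """Build the argument list for running epubcheck as an executable jar."""
    return ["java", "-jar", jar_path, epub_path]


def run_epubcheck(jar_path, epub_path, timeout_seconds=300):
    """Run epubcheck on one file and return (exit code, combined output).

    Requires a Java runtime on the PATH. Raises subprocess.TimeoutExpired
    if the check takes longer than timeout_seconds.
    """
    completed = subprocess.run(
        build_epubcheck_command(jar_path, epub_path),
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
    )
    return completed.returncode, completed.stdout + completed.stderr


if __name__ == "__main__":
    # Placeholder paths; substitute the jar and book you actually have.
    print(build_epubcheck_command("epubcheck.jar", "book.epub"))
```

Starting a JVM for every file is the overhead a native Python port would avoid.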