Worldwide Digital Libraries
January 30, 2008

Along with Google Books and the Internet Archive/Open Content Alliance, there is a lesser-known collaboration between Carnegie Mellon University Libraries and three international institutions (Zhejiang University 浙江大学, the Indian Institute of Science, and the Bibliotheca Alexandrina) to put “a collection the size of a large university library” on the Web for free.

A million and a half books – for now, mostly in Chinese and English – have already been scanned and are (well, 15% of them anyway) accessible through the website at http://www.ulib.org/.

A few months ago, I decided to take a look at the Universal Digital Library (UDL). A quick search for “pickwick” brought up 19 records for various versions of Dickens’ The Pickwick Papers, under many different titles and authors, including one “Charles Dicknes”, one “CHARLES DICKENS”, and one “Scott Russell”. The title for that last one was rendered as “Dickenss Posthumous Papers of the Pickwick Club (1912)”. Multi-volume copies had their titles rendered variously as “Vol I”, “Vol. Ii”, “Volume Ii”, etc. Many titles simply said “Book currently unavailable” when clicked. For those that did show a detailed record, the tables of contents had been typed in haphazardly as well. One record had every chapter title entered, many with misspellings, all in lowercase letters.

Some versions rendered the Subject as “Unknown”, while others had Subjects ranging from “Fiction” to “Language, Linguistics, Literature” to “Social Science” to (curiously enough) “Biology”. The “Biology” record appears to be a 1936 book by Logan Clendening that shows up in WorldCat as “A handbook to Pickwick Papers”. Here the title is rendered as “A Hand Book Of Pickwick Papers” and the Publisher as “Alfred .A. Knopf”.
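For the curious, here is roughly what a first cleanup pass over records like these might look like. This is only a sketch in Python, not anything the UDL actually runs. The function names (normalize_volume, author_key, closest_author) and the 0.85 similarity cutoff are my own assumptions, chosen for illustration. It collapses the “Vol I” / “Vol. Ii” / “Volume Ii” variants into one form, ignores case and spacing when comparing authors, and uses fuzzy matching to catch a transposition like “Dicknes”.

```python
import difflib
import re

# Values of the Roman numeral letters we expect in volume numbers.
_ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100}

# "Vol I", "Vol. Ii", "Volume Ii", "Vol.2": the word, then a period
# and/or whitespace, then a Roman numeral or digits.
_VOLUME = re.compile(
    r"\bvol(?:ume)?(?:\.\s*|\s+)([ivxlc]+|\d+)\b", re.IGNORECASE
)


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'Ii' or 'iv' to an integer."""
    values = [_ROMAN[ch] for ch in numeral.lower()]
    total = 0
    for idx, value in enumerate(values):
        # A smaller value before a larger one is subtracted (iv = 4).
        if idx + 1 < len(values) and value < values[idx + 1]:
            total -= value
        else:
            total += value
    return total


def normalize_volume(title):
    """Rewrite every volume marker in a title as 'Vol. <number>'."""
    def replace(match):
        token = match.group(1)
        number = int(token) if token.isdigit() else roman_to_int(token)
        return f"Vol. {number}"
    return _VOLUME.sub(replace, title)


def author_key(name):
    """Case-insensitive, whitespace-collapsed key for comparing authors."""
    return " ".join(name.casefold().split())


def closest_author(name, known_authors, cutoff=0.85):
    """Return the known author most similar to name, or None.

    The cutoff is an arbitrary illustrative threshold, not a tuned value.
    """
    keys = {author_key(author): author for author in known_authors}
    hits = difflib.get_close_matches(
        author_key(name), list(keys), n=1, cutoff=cutoff
    )
    return keys[hits[0]] if hits else None
```

With something like this, “Vol. Ii” and “Volume Ii” both end up filed as “Vol. 2”, and closest_author("Charles Dicknes", ["Charles Dickens"]) points back to the correctly spelled name. It would do nothing for “Scott Russell”, of course. That kind of error needs a human, or at least a lookup against an authority file.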

We visited the site again at the ASIS&T meeting on Monday the 28th. There was a prominent notice that metadata errors had come to their attention and were being cleaned up. Sure enough, the Pickwick Papers records seem to be in much better shape now. (And to be fair, Google Books lists one “Posthumous Papers OR The Pickwick Club” in its results.)

If you visit some of these sites, take a moment to compare interfaces, and also look at some digitization projects hosted by individual libraries (particularly Special Collections and Rare Books). A contrast that quickly becomes apparent is the book-as-object metaphor (the British Library’s incredible Turning The Pages, for example) vs. the book-as-searchable-content metaphor (Google’s scans, which eliminate covers, endpapers, etc. in favor of focusing strictly on the page). Post your feedback here in the Comments. I’d love to know what you think.

