Archive for the ‘Reference’ Category.

W00t, There It Is: The Word of the Year

Merriam-Webster has announced the newest Word of the Year. Last year it was “truthiness” — this year it’s w00t. Yes, with two zeroes. I’m not sure if I believe in numbers as parts of words — perhaps it should be called the String of the Year.

You can get a list of the other Words of the Year at http://www.merriam-webster.com/info/07words.htm . Some of them I found surprising (“facebook”, as a verb), some were rather pedestrian (“apathetic”) and some I just want to go around saying all day long. Sardoodledom! Sardoodledom! Sardoodledom! Sardoodledom!

2007 was not a good year for my geek vocabulary, but there were two words that I thought I would see on the list: “teh” (a misspelling of “the” used in an ironic sense) and FTW (“For the Win,” as in “ISBN/Wikipedia lookup mashed through a Google Spreadsheet and output as an RSS feed FTW!”)

Speaking of * of the year, now’s the time of year when I like to set up one of my temporary information traps. I go to Google News or some other news search and run sample searches like “annual list” “of the year”. If you go to Google News and run that search right now, you’ll find out about the ten worst celebrity ads, the ten best engines, the most gracious autograph givers, and the top ten movies and wines. Or try “the top * of 2007” and get linked to the top games, gadgets, and incidents of celebrities falling down.

If you want to slant your search somewhat, you can also try “the worst * of 2007” as a query. Replace worst with your adjective of choice; stupidest gets you all of two results.
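If you want to try a batch of adjectives at once, the query phrases are easy to generate. This is only a small sketch: the function name list_queries is made up for illustration, and it does nothing but build the quoted search strings, which you would then paste into Google News or whatever news search you prefer.

```python
def list_queries(adjectives, year):
    """Build quoted wildcard phrases like "the worst * of 2007"."""
    # The straight double quotes ask the search engine for an exact phrase;
    # the asterisk is the wildcard standing in for "games", "gadgets", etc.
    return [f'"the {adj} * of {year}"' for adj in adjectives]


for query in list_queries(["top", "worst", "stupidest"], 2007):
    print(query)
```

Swap in a different year or adjective list to reuse the same trap next December.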

I’ll generally run these for four weeks just to see what I pick up, and get an idea of what trends are going on in different parts of the world. It’s a very strange crash course in pop culture and industry-specific popularity.

OCLC Hooks Up With Wikipedia

You know what WorldCat is, right? It’s a Web site that allows you to search over a billion items in over 10,000 libraries around the world. It’s at http://worldcat.org/ .

Do you know what the WorldCat xISBN service is? I didn’t either. It’s a service WorldCat offers that allows you to enter an ISBN and get related ISBNs. You access the data by building a URL. For example, say I wanted to use the xISBN service for the book Huckleberry Finn. As you might imagine there are about a zillion editions of that book available. If I take the ISBN 1580495834 and add it to a URL like this:

http://xisbn.worldcat.org/webservices/xid/isbn/1580495834?method=getEditions&format=xml

I’ll get an XML-formatted list of other ISBNs for other editions of Huckleberry Finn. Make sense? (If you don’t like the format of the result, you can change that &format=xml to &format=html. It’ll give you a plain list of ISBNs.) You can also use this same service to list metadata about an ISBN or convert a 10-digit ISBN to a 13-digit one and vice versa; see http://xisbn.worldcat.org/xisbnadmin/doc/api.htm for details.
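As an aside, the 10-digit to 13-digit conversion can also be done locally, since it is just the standard ISBN arithmetic. The sketch below is not the xISBN service's own code, and the function name isbn10_to_isbn13 is made up for illustration. It assumes the usual rule: drop the old check digit, prepend 978, and compute a new check digit with alternating weights of 1 and 3.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a 10-digit ISBN to its 13-digit form using the 978 prefix."""
    core = isbn10.replace("-", "").strip()
    if len(core) != 10 or not core[:9].isdigit():
        raise ValueError(f"not a 10-digit ISBN: {isbn10!r}")
    # Drop the old check digit (the 10th character) and prepend 978.
    body = "978" + core[:9]
    # ISBN-13 check digit: weights alternate 1, 3 across the first 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)


# The Huckleberry Finn edition used in the example URL above.
print(isbn10_to_isbn13("1580495834"))  # prints 9781580495837
```

Note that this sketch does not verify the incoming ISBN-10 check digit; it only recomputes the new one.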

Now the xISBN service has been hooked up with Wikipedia! That means you can enter an ISBN and have xISBN generate a list of related ISBNs, and then check those ISBNs against the ISBNs on Wikipedia. Here’s the base URL to use:

http://xisbn.worldcat.org/webservices/xid/isbn/ISBNGOESHERE?method=getEditions&format=xml&library=wikipedia&fl=*

Replace ISBNGOESHERE with the ISBN in which you’re interested. For example, let’s say I was looking for citations of Harry Potter and the Deathly Hallows. Here’s the URL:

http://xisbn.worldcat.org/webservices/xid/isbn/0545010225?method=getEditions&format=xml&library=wikipedia&fl=*

What you’ll get is a list of several different affiliated ISBNs on at least three different Wikipedia pages.
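If you have more than a couple of ISBNs to check, both URL forms shown above (the plain getEditions lookup and the Wikipedia variant) can be assembled with a few lines of code. This is a minimal sketch: the function name xisbn_url is made up for illustration, the parameters are only the ones that appear in the URLs in this post, and the code builds the URL strings without contacting the service, so it makes no promises about what the endpoint returns.

```python
from urllib.parse import urlencode

# Base path copied from the example URLs in this post.
BASE = "http://xisbn.worldcat.org/webservices/xid/isbn/"


def xisbn_url(isbn, fmt="xml", library=None):
    """Build a getEditions URL; pass library="wikipedia" for the Wikipedia check."""
    params = [("method", "getEditions"), ("format", fmt)]
    if library is not None:
        # The Wikipedia example adds library=... and fl=* to the query string.
        params += [("library", library), ("fl", "*")]
    # safe="*" keeps the literal asterisk in fl=* instead of encoding it as %2A.
    return f"{BASE}{isbn}?{urlencode(params, safe='*')}"


print(xisbn_url("1580495834"))
print(xisbn_url("0545010225", library="wikipedia"))
```

The two printed URLs should match the Huckleberry Finn and Deathly Hallows examples character for character.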

Now why in the world would you want to do this? It would be interesting to know if a particular book is quoted or referred to in Wikipedia for better or for worse. If I were a college student and wanted to get a meta-view of my textbooks, I might take each ISBN and enter it in this service to see what any affiliated Wikipedia pages had to say. If I wanted to do a subject overview, I could go to Amazon, pick the ISBN of a top-seller in a fairly narrow field, and run it through this service to see what Wikipedia pages I might want to browse. (Try this with Golub and Van Loan’s Matrix Computations, ISBN 0801854148.)

You’ll notice that the URL you use to build this query uses &library=wikipedia to specify searching Wikipedia. I can’t wait to see what other libraries are added to this search!

Million Book Project Explodes a Milestone

The Million Book Project, a university-led initiative to digitize books and make them available online, will have to change its name. The project has digitized over 1.5 million books, which the project says represents 1 percent of all the world’s books (that seems kind of high to me) and 20 of the world’s languages.

The books are now available at a single Web site, www.ulib.org, which must be flooded with traffic because unfortunately it’s not responding quickly at all. Be patient and it’ll load. The very green front page invites you to do a title search (advanced searching and browsing are also available) and warns you that you’ll need the DjVu or the TIFF plugin to view books.

I did a search for irrigation. I got 176 books returned with over 20,000 pages. The books ranged from Budget Estimate Of Revenue And Expenditure Under Major Heads (from India) to The Economics Of Irrigation (from China). The book list is on the left, while the right side of the results page provides information on each book like date of publication, number of pages, table of contents, etc. Not all the information is available for all books, and not all books are completely available either. I saw a couple of “Book temporarily unavailable” messages, and at least one “15% limited access”.

(For which I am not going to overly ding them. Good grief, the project is to put a million books online, and the structure’s been put together for at least 1.5 million books. It’s an amazing amount of international cooperation and coordination. It would be like slamming the Wright brothers for not having honey-roasted peanuts at Kitty Hawk.)

Click on the title on the right side of the screen and you’ll get the book’s page in a new window. What you get depends on the site where you land. I looked at a couple of books that were apparently hosted in China and had a hard time paging through them. For one of them I got a 404 error. I had the best luck with the books hosted in Egypt. (Here’s an example.) In the case of this book I could look at it via DjVu or download a PDF file. I occasionally got a site timeout, which I suspect is this site still being very busy.

You can do some advanced searching in addition to basic keyword searching at the main site. You can search by subject and language as well as hosting country and span of years. I wish that you could search by copyright status and percentage available, but I suppose you could get a certain amount of that done by searching for books copyrighted before 1923.

I am not going to run to this site the next time I need to do some research. I think this is more about possibilities — rapidly developing possibilities — than perfect implementation. And this is apparently not all of it — the project site has information on other projects including the Newspapers Digital Library and the Spoken Language Digital Library.