ResearchBuzz!
Search Engine News and More Since 1998

October 03, 2005

Yahoo Announces Open Content Alliance

Yahoo has teamed up with several companies including Adobe, the Internet Archive, National Archives (UK), Hewlett Packard Labs, and O'Reilly Media to form the Open Content Alliance ( http://opencontentalliance.org/ ), which is described as "a global consortium focused on providing open access to content while respecting the rights of copyright holders."

I talked to Yahoo about this on the phone last week. And normally I am a terrible person to give a phone briefing to because I don't say anything. I just sit there. Sometimes I have questions but usually I just want to play with the resource a while. Then I go away and digest everything in my stomach and come back and write down my impressions.

However, this time I got two hours' advance notice of the topic. My first reaction was "KEWL!" My second reaction was, "So what took them so long?" And by the time the actual briefing rolled around I had so many questions it was hard for me to be polite. I talked to Aaron Ferstman and David Mandelbrot, VP of Search Content, who were very patient with me spitting out questions as fast as they popped into my head.

Okay, so it's a global consortium of blah de blah. What exactly does that mean? It means that the companies have come together to create a large archive of both public domain and copyrighted work. However, the copyrighted work will only be made available on the site with the permission of the copyright holder.

From the press release: "In order to respect the rights of copyright holders, content under copyright will be made available through the OCA only with the copyright holders authorization. At the option of the copyright holder, copyrighted content may be distributed through a Creative Commons license. Creative Commons is a non-profit organization whose licensing encourages personal use, reuse and repurposing of digital content. Content that is made available on the OCA website will be available in PDF and other widely adopted formats. This approach enables mass media and independent publishers to expand their reach by submitting content that spans categories, file formats and languages while retaining their copyrights."

Yahoo will fund the first project: the digitization and hosting of works from the University of California (they didn't give a dollar amount, just said it wasn't a particularly large or small amount of money). The first books are expected to be online in the next couple of months, with more available by the end of the year.

My first question was "What about all the public domain stuff that's already out there?" The home economics archive from Cornell? The Oak Knoll Press collection? The Math Book collection? And probably lots of other ones I don't know about? David Mandelbrot said, "Part of this announcement is to encourage other organizations to participate in the OCA's efforts. We're trying to make a large collection available. We want to a variety of content and make it available in a way that's easily spiderable by search engines." Part of that effort will go toward educating archivists and other collection-keepers about digitizing content and making it available online in this more open, visible way.

All kinds of content? I asked. Yes, they said. Even corporate archives? Yes, though the emphasis would be on culturally-important corporate archives, commercials and so forth. Even multimedia? Absolutely. Even newspapers and periodicals? Yes. Even small and self-publishers, like those who might use the services of Lulu.com? Yes, said Mandelbrot. "We want to work with all kinds of commercial publishers; initially we're working with O'Reilly Media. But this project is open to all kinds of commercial publishers who want to participate, both small and large."

Yahoo's going to handle the search structure of this new and hopefully giant archive, so naturally I wondered how far it would be integrated into Yahoo's other search properties. For example, would the books and collections digitized by the OCA effort make it into Yahoo's directory? Unfortunately, the answer to that is "not initially." Eventually the books, collections, and institutional subsets that make up the OCA will be part of Yahoo's directory, but not right off the bat. Would there be access to this collection via Yahoo's other efforts like Y!Q and the Yahoo API? "I don't know if we've explored that far down the line yet," said Mandelbrot. "Our main focus is just getting the materials online and making them available more generally."

I can see and appreciate that, though from a searcher's standpoint I want 'em to hurry up. I mean, they've announced a huge archive of credible content from credible institutions. Of course I want access to as many tools as possible to access and distribute the information!

There WILL be a separate page on Yahoo to search the subset of content generated by the OCA. It's too early to tell how much you'll be able to break it down, whether you'll be able to create sets of books, etc. It depends on how much is indexed and what kind of material is indexed.

There is no goal for how much material gets indexed, either. "We don't have a specific number goal; we're focused on quality," said Mandelbrot. "Our general goal is to make this the richest library of cultural materials available online."

It seems like some people are seeing this announcement as a slap at Google and stopping there, but I think there's much more to it. While there's no denying that the Search Engine Wars are back (whoopee!) and Yahoo and Google are not for an instant going to forget each other's existence, and the press release does exude the faintest whiff of Eau De Meowwwww, those are all small facets. This project is breathtaking in its scope. No matter what happens with the Google Print project, the OCA is building something complementary that could be tremendously useful without, perhaps, detracting from Google Print at all.

But it's early in the game. We have at the moment one announcement, many companies, lots of breathless quotes, and zero content. Let's see where they are with the momentum -- and how many publishers have climbed on board -- at the end of the year.

Posted to Search Engines-Yahoo