ResearchBuzz!
Search Engine News and More Since 1998


March 25, 2004

Historical ODP Data Available

The Open Directory Project is now archiving its old data. The back archive is not complete (some material from 2000 is available), but an ODP rep tells me that the archive will be updated regularly from now on. It's available at http://rdf.dmoz.org/rdf/archive/.

There are three types of material here, with a folder generated for each RDF update: profiles.rdf.u8.gz (information on editors; this file is no longer generated), structure.rdf.u8.gz (category hierarchy information), and, what most of the archive geeks will be interested in, content.rdf.u8.gz, which contains site content, links, and descriptions.

Now obviously there's historical interest to this, but my first question was, "Has someone written a Perl script that would transform the links in the content files to direct links to the Internet Archive?" The ODP rep said no, that wasn't available, but if anybody wanted to write that script or any other script that used RDF data, he was pretty sure he could get them a link from rdf.dmoz.org. So if you're doing anything fun with RDF data, drop me an e-mail and I'll hook you up.
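For anyone who wants a starting point, here is a rough sketch of that idea in Python rather than Perl. It is not an ODP tool, just an illustration under a few assumptions: that each listed site appears in the content file as an `<ExternalPage about="...">` element, that the file is UTF-8 (as the `u8` in the name suggests), and that the Internet Archive's Wayback Machine accepts links of the form `https://web.archive.org/web/<timestamp>/<url>`. The function names are made up for this example.

```python
import gzip
import html
import re
from typing import Iterable, Iterator

# Assumption: each listed site shows up as <ExternalPage about="URL"> in the dump.
EXTERNAL_PAGE = re.compile(r'<ExternalPage\s+about="([^"]+)"')


def wayback_url(url: str, timestamp: str) -> str:
    """Build a Wayback Machine link for `url` near `timestamp` (e.g. "20040325")."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def archive_links(lines: Iterable[str], timestamp: str) -> Iterator[str]:
    """Yield a Wayback Machine link for every ExternalPage URL found in `lines`."""
    for line in lines:
        for match in EXTERNAL_PAGE.finditer(line):
            # Attribute values are XML-escaped (&amp; and friends), so unescape them.
            yield wayback_url(html.unescape(match.group(1)), timestamp)


def archive_links_from_dump(path: str, timestamp: str) -> Iterator[str]:
    """Stream a gzipped content dump line by line without unpacking it to disk."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from archive_links(handle, timestamp)
```

You would call `archive_links_from_dump("content.rdf.u8.gz", "20040325")` and loop over the results, using the date of the archive folder as the timestamp. The script scans line by line with a regular expression instead of parsing the RDF properly because the content files are large; a real tool would probably want a streaming XML parser.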

Posted to Search Engines-DMOZ

