Archive for December 2007

Clustering Search Results With Carrot

When I think of search engines that cluster results, I don’t think of root vegetables. But I like Carrot. I like its clusters. Some of them make me laugh. You may like it too if you try it; it’s at http://demo.carrot2.org/demo-stable/main . (Carrot.org is apparently some anti-spam thing that’s launching in January 2008.)

Clustering search engines, if you don’t remember, gather search engine results into different topics instead of providing them in one huge pile. Searching for chips, for example, might give you topics for chocolate chips, poker chips, computer chips, wood chips, etc.
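If you want a feel for what "gathering results into topics" means, here is a toy sketch in Python. It is emphatically not how Carrot does it (its Lingo and STC options are far more sophisticated); it simply labels each result title with the word that comes right before the query term. The function name `cluster_by_label` and the sample titles are made up for illustration.

```python
from collections import defaultdict


def cluster_by_label(titles, query):
    """Group titles under "<preceding word> <query>", or "other" if none."""
    clusters = defaultdict(list)
    for title in titles:
        words = title.lower().split()
        label = "other"
        for i, word in enumerate(words):
            # Use the word just before the query term as a crude topic label.
            if word == query and i > 0:
                label = f"{words[i - 1]} {query}"
                break
        clusters[label].append(title)
    return dict(clusters)


# Invented sample titles, standing in for real search results.
titles = [
    "Best chocolate chips cookies",
    "Poker chips buying guide",
    "How computer chips are made",
    "Chocolate chips nutrition facts",
    "Chips",
]

clusters = cluster_by_label(titles, "chips")
for label, members in clusters.items():
    print(label, "->", members)
```

A real clustering engine has to cope with synonyms, phrases, and results that fit several topics at once, which is why the quality of the labels varies so much from one algorithm to the next.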

I searched Carrot for carrot. I got the expected Carrot Cake and Carrot Juice, but also results for Carrot Museum and Carrot Top Records, which is apparently a record label in Chicago. Only 98 results were returned, as Carrot appears to metasearch five search engines and provide the top 20 results from each. A third tab shows you the sources from Carrot’s search, and an additional tab shows the site suffixes that the results are coming from. My carrot search grabbed a surprising number of results from .uk and .gov.
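Five engines times 20 results should be 100, not 98. One plausible explanation (my guess, not something Carrot confirms) is that a couple of results showed up in more than one engine and were merged. A minimal Python sketch of that kind of metasearch merge, with invented engine names and URLs, might look like this:

```python
def metasearch(results_by_engine, per_engine=20):
    """Take the top results from each engine and drop repeated URLs."""
    merged = []
    seen = set()
    for engine, urls in results_by_engine.items():
        for url in urls[:per_engine]:
            # Assumption: URLs differing only by a trailing slash are the same page.
            key = url.rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append((url, engine))
    return merged


# Hypothetical data: five engines, 25 results each, two overlaps with engine0.
engines = {
    f"engine{n}": [f"http://example.com/e{n}/page{i}" for i in range(25)]
    for n in range(5)
}
engines["engine1"][0] = engines["engine0"][0]
engines["engine2"][3] = engines["engine0"][5] + "/"

merged = metasearch(engines)
print(len(merged))  # 98: 100 candidates minus 2 duplicates
```

Keeping the engine name alongside each URL is what would let a "sources" view show where each result came from.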

I did find some of Carrot’s clusterings not particularly useful; “Easy” and “News” aren’t much in the way of topics. Carrot does, however, offer different clustering options. Click on “Show Search Options” and you’ll get a drop-down menu with six different ways to cluster your search results. “Lingo” appears to be the default, but I liked STC as an option as well (it’s the last item on the drop-down menu).

Are you less interested in HOW Carrot is clustering than WHAT it’s clustering? In addition to Web searches, Carrot also clusters Yahoo News search, Wikipedia, and PubMed, among others. I am intrigued by the ability to try different clusterings on the fly; I’m going to use this site again.

National Archives of Ireland Releases Dublin 1911 Census

I mentioned this way back in March and now it’s finally arrived! The National Archives of Ireland has released the 1911 census for Dublin, now available at http://www.census.nationalarchives.ie/ .

You can search by first/last name, approximate age (within 5 years), gender, and DED (District Electoral Division). I did a search for Smith and got over 2400 results. Results are sorted by surname by default, and are in tables which include name, townland/street, age, sex, and DED. Click on a name and you’ll get the information on the entire family living at that address. On the family page you’ll also find links to census images (which are in PDF format, be warned) that include the basic household form but also forms for buildings, outbuildings, and other forms unique to households or situations.
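The "approximate age (within 5 years)" option is the kind of filter that is easy to picture in code. Here is a rough Python sketch of a search like the one described above. The `Person` record, the `search_census` function, the sample people, and the DED names are all invented for illustration, and I am assuming the five-year window is inclusive; the site’s actual implementation is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Person:
    first_name: str
    surname: str
    age: int
    sex: str
    ded: str


def search_census(records, surname=None, age=None, sex=None, ded=None, age_window=5):
    """Return matching records, sorted by surname as the site does by default."""
    def matches(p):
        if surname is not None and p.surname.lower() != surname.lower():
            return False
        # "Approximate age": accept anyone within age_window years either way.
        if age is not None and abs(p.age - age) > age_window:
            return False
        if sex is not None and p.sex != sex:
            return False
        if ded is not None and p.ded != ded:
            return False
        return True

    return sorted((p for p in records if matches(p)), key=lambda p: p.surname)


# Made-up records, not real census entries.
records = [
    Person("Mary", "Smith", 34, "F", "Example DED 1"),
    Person("John", "Smith", 41, "M", "Example DED 1"),
    Person("Anne", "Byrne", 29, "F", "Example DED 2"),
    Person("Peter", "Smith", 26, "M", "Example DED 2"),
]

hits = search_census(records, surname="Smith", age=30)
print([(p.first_name, p.age) for p in hits])
```

With these sample records, a search for Smith aged about 30 keeps Mary (34) and Peter (26) and drops John (41), who falls outside the window.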

I am not a big fan of PDF census forms, but these were pretty easy to read/magnify/manipulate. Information on these forms included the religion of the family, whether each member could read/write, marital status, where they were born, whether they spoke only “Irish” or both “Irish and English”, and any conditions that the census especially took note of (“idiot”, “lunatic”, etc.).

You can also BROWSE the census, which is pretty interesting! You can browse by DED, which will take you through the townlands/streets of Dublin and list houses and families as well as significant buildings in the area. To get a sense of what the city was like in 1911, the site also has several articles and photographs covering a variety of topics including religion, transportation, poverty, and law.

Be sure to check out the future plans page, which shows the order in which census records will be made available. 1911 will be done first, then 1901. Something to look forward to!

ResearchBuzz Roundup 120907

Ask busts out its top 10 searches for 2007 (in various categories).

Exalead for the iPhone.

Microsoft acknowledges eight-year-old exploit. Good grief.

Kevin Kelly does a brief blog post on how to know you know. It’s funny because it’s true.

Google Blogoscoped has a roundup of mission statements. My mission statement is not to fall off the pallet jack. Again. >cough<

GameTap is looking for beta testers.

New Seamonkey.

Google adds more cowbell ^H^H^H^H^H^H^H^H sitelinks to search results.