ResearchBuzz!
Search Engine News and More Since 1998

September 06, 2003

The Domain Census

Okay, what is the domain census? It searches for the percentage of pages in seven TLDs (biz, net, org, info, gov, com, and edu) which contain a given keyword. Why? I dunno why. I just thought of it after thinking about Rael Dornfest's Googleshare idea and seeing something on WebmasterWorld about whether or not Google indexes the less-popular TLDs (it does).

Please use your own key! Every time this script runs it burns 14 key-uses (two queries for each of the seven TLDs). If you get a blank page or an error when you run this hack, the key's probably run out. If you get straight 0% results and the response that domain:fred has the highest percentage, you searched on a "stop word"; put a + in front of it.
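Here's a rough Python sketch of that stop-word check, in case it helps. It is not the code behind this hack; the function names are made up for illustration, and it assumes you already have the per-TLD percentages in a dictionary and that the query is a single word.

```python
from typing import Dict


def looks_like_stop_word(percentages: Dict[str, float]) -> bool:
    """True when every TLD came back at 0%, the symptom described above."""
    # An empty result set is treated as "no data", not as a stop word.
    return bool(percentages) and all(p == 0 for p in percentages.values())


def force_term(query: str) -> str:
    """Put a + in front of the query unless it already has one."""
    return query if query.startswith("+") else "+" + query
```

So if looks_like_stop_word() comes back true, rerunning with force_term("the") would search on +the instead. Keep in mind the rerun burns another 14 key-uses.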

This hack is not meant to be scientific, only amusing.
Some notes:

1. The program gets the first number -- the total number of pages for a given TLD -- by running the query site:$tld -sghagwegxx. This is inaccurate, but it's the only way I know to get a number. (Well, you can do site:$tld inurl:$tld, but it provides much lower numbers and doesn't work for com and edu domains.) That's why this is amusing and not scientific. :->

The program then searches for the query term in the same domain ($query site:$tld), calculates the percentage ($querytermnumber / $firstnumber * 100), and Bjorn Stronginthearm's your uncle.

2. If you run the search for queries that have only a couple of results for a domain, the percentage is small enough that it prints in scientific notation, so you'll get odd-looking result percentages, like this:

1.17302052785924e-05%

The final calculation that determines the TLD with the highest percentage still works, though.

3. Cultural queries are more fun. For example, 0.000178372352285396% of gov pages returned by this search method contain the phrase "all your base are belong to us." Some words are clearly domain-specific; 7.75956284153005% of edu pages returned by this search method contain the word syllabus.
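For the curious, the steps in the notes above can be sketched in Python roughly like this. It is not the actual script behind the hack. The get_count argument is a hypothetical stand-in for whatever returns the estimated result count for a search string, and the function names are invented for illustration. The format_percent helper is one possible way to sidestep the scientific-notation output mentioned in note 2.

```python
from typing import Callable, Dict

TLDS = ["biz", "net", "org", "info", "gov", "com", "edu"]

# Nonsense term excluded from the baseline query, as in note 1.
BASELINE_EXCLUDE = "sghagwegxx"


def domain_census(query: str, get_count: Callable[[str], int]) -> Dict[str, float]:
    """Return the percentage of pages in each TLD that match the query.

    get_count is a hypothetical callable (search string -> result count).
    It is called twice per TLD, which is where the 14 key-uses come from.
    """
    percentages = {}
    for tld in TLDS:
        total = get_count(f"site:{tld} -{BASELINE_EXCLUDE}")  # $firstnumber
        hits = get_count(f"{query} site:{tld}")               # $querytermnumber
        # Guard against an empty baseline instead of dividing by zero.
        percentages[tld] = hits / total * 100 if total else 0.0
    return percentages


def highest_tld(percentages: Dict[str, float]) -> str:
    """Name of the TLD with the highest percentage."""
    return max(percentages, key=percentages.get)


def format_percent(value: float, places: int = 12) -> str:
    """Fixed-point formatting, so tiny values don't print as 1.17e-05."""
    return f"{value:.{places}f}".rstrip("0").rstrip(".") + "%"
```

With this, format_percent(1.17302052785924e-05) gives 0.000011730205%, which is easier on the eyes. The ranking step works on the raw numbers, so the formatting doesn't affect which TLD comes out on top.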

I'm still goofing around with this. I think I'm going to add an option to search English-speaking TLDs (us, uk, ca, etc.) and some sets of worldwide TLDs. Send me feedback if something doesn't work.

Posted to Google Hacks