ResearchBuzz
Search Engine News and More Since 1998


August 19, 2004

Yahoo's No-Limit Query Limit As Opposed to Google's -- So What?

I've gotten several mails about Four Things Yahoo Can Do That Google Can't. Many people asked about the query limit, mostly two things:

1) Is there any query limit?

2) So what? Why would you want to search for so many search words at a time?

Wondering about this got my curiosity up so I started doing some experimenting. Several fizzy potions and a lot of ominous crackling noises later, I've got some conclusions.

EXPERIMENT #1: Can you search an arbitrarily large group of sites for information?

I created a search modifier designed to search the Web sites of all 50 US states (and DC too, of course). You may remember that you can get to a state's Web site by using the URL http://www.state.yy.us, substituting the postal code of the state of your choice for yy. (DC's URL is slightly different; it's http://www.dc.gov.) The query modifier for searching all fifty states and DC is:

(site:ak.us OR site:al.us OR site:ar.us OR site:az.us OR site:ca.us OR site:co.us OR site:ct.us OR site:dc.gov OR site:de.us OR site:fl.us OR site:ga.us OR site:hi.us OR site:ia.us OR site:id.us OR site:il.us OR site:in.us OR site:ks.us OR site:ky.us OR site:la.us OR site:ma.us OR site:md.us OR site:me.us OR site:mi.us OR site:mn.us OR site:mo.us OR site:ms.us OR site:mt.us OR site:nc.us OR site:nd.us OR site:ne.us OR site:nh.us OR site:nj.us OR site:nm.us OR site:nv.us OR site:ny.us OR site:ok.us OR site:oh.us OR site:or.us OR site:pa.us OR site:ri.us OR site:sc.us OR site:sd.us OR site:tn.us OR site:tx.us OR site:ut.us OR site:va.us OR site:vt.us OR site:wa.us OR site:wi.us OR site:wv.us OR site:wy.us)
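If you'd rather not type all that by hand, a modifier like this can be generated from a list of postal codes. What follows is only a minimal sketch: the names STATE_CODES and state_site_modifier are made up for illustration, and it lists the states alphabetically with DC tacked on the end, so the order differs slightly from the modifier above.

```python
# Postal codes for the 50 states. DC is handled separately because its
# site lives at dc.gov rather than under .us.
STATE_CODES = """
ak al ar az ca co ct de fl ga hi ia id il in ks ky la ma md me mi mn mo ms mt
nc nd ne nh nj nm nv ny oh ok or pa ri sc sd tn tx ut va vt wa wi wv wy
""".split()


def state_site_modifier(codes=STATE_CODES):
    """Return one parenthesized group of site: terms joined by uppercase OR."""
    terms = [f"site:{code}.us" for code in codes]
    terms.append("site:dc.gov")
    return "(" + " OR ".join(terms) + ")"


modifier = state_site_modifier()
print(modifier.count("site:"))  # 51: fifty states plus DC
```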

Lessons learned with this experiment:

Lesson #1: Parens are very important in creating complex queries in Yahoo.

Lesson #2: Capitalizing special syntax makes Yahoo throw up. (Site:al.us doesn't get the same results as site:al.us.)

Lesson #3: Yahoo treats uppercase OR as a Boolean operator but treats lowercase or as an ordinary search term. (In other words, capitalize OR if you're trying to use it as a Boolean operator.)
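The three lessons can be folded into one small helper so they're hard to forget. This is a sketch, not anything Yahoo provides; or_group is a hypothetical name, and all it does is assemble a string.

```python
def or_group(operator, values):
    """Build a group such as '(site:al.us OR site:ak.us)'."""
    op = operator.lower()  # Lesson 2: special syntax stays lowercase
    terms = [f"{op}:{value}" for value in values]
    # Lesson 1: wrap the whole group in parens.
    # Lesson 3: join the terms with an uppercase OR.
    return "(" + " OR ".join(terms) + ")"


print(or_group("Site", ["al.us", "ak.us"]))
```

Passing "Site" with a capital S still produces site: in the output, since the helper lowercases the operator before building the terms.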

I pasted in a couple of searches. While Yahoo took a little longer than usual to spit out results, the searches worked. So I built a search form. Give it a whirl:

[Search form: Searching Yahoo for Content on State Sites, with a single box for entering your query]


You'll notice when you run this search that Yahoo takes a smidge longer to generate results. Also check out the URLs that are returned. I did a little testing and it looks like all the states are included in the search.

So Yahoo can search a query of at least fifty-one terms with no problems.
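A form like this one only has to glue the visitor's query onto the modifier and hand the result to Yahoo. Here's a rough sketch of that step. The endpoint and the p parameter name are my assumptions about Yahoo's search URL, search_url is a hypothetical helper, and a two-state modifier stands in for the full 51-term one to keep the example short.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: Yahoo's search endpoint and its "p" query parameter.
YAHOO_SEARCH = "http://search.yahoo.com/search"


def search_url(user_query, modifier):
    """Append the site modifier to the user's query and URL-encode it."""
    full_query = f"{user_query} {modifier}"
    return YAHOO_SEARCH + "?" + urlencode({"p": full_query})


url = search_url("recycling", "(site:nc.us OR site:va.us)")
print(url)

# Decode the URL again to confirm the query survived the round trip.
print(parse_qs(urlparse(url).query)["p"][0])
```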

EXPERIMENT #2: Can you add additional syntax?

If you've spent any time wandering around state sites you may have noticed that there are some conventions for the way URLs are constructed. City sites have CI in the URL. County sites have CO in the URL.

So I checked to see if I could add an inurl: search to the query, and I could. Want to search county sites or city sites? No problem:

[Search form: Searching Yahoo for Content on City or County Sites (Search The Web Sites of US States Using One Big Yahoo). It takes a query ("election board", recycling, and "parks and recreation" are all possibilities) and a choice of which kind of locations to search.]

Okay, so you can add another search or couple of searches here and all's well.
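The city-or-county choice boils down to picking one inurl: term and slipping it in next to the site group. A sketch, assuming the choices map to inurl:ci and inurl:co; LOCATION_SYNTAX and local_query are hypothetical names, and the two-state modifier is just a stand-in for the full one.

```python
# Assumption: each location choice maps to one inurl: fragment.
LOCATION_SYNTAX = {"city": "inurl:ci", "county": "inurl:co"}


def local_query(user_query, location, site_modifier):
    """Combine the query, the chosen inurl: term, and the site group."""
    return f"{user_query} {LOCATION_SYNTAX[location]} {site_modifier}"


print(local_query('"election board"', "county", "(site:nc.us OR site:va.us)"))
```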

EXPERIMENT #3: Can I add another set of parens to my search?

CI and CO are fairly standard parts of state government Web site URLs. However, I'm a research nerd, so I want to find the information that's available on the library parts of these sites. A few searches seem to indicate that there are no standard abbreviations for library subdirectories of Web sites, so we'll try lib or library. And we'll throw in archive for good measure. I'll add this to my original 51-item query:

(inurl:lib OR inurl:library OR inurl:archive)

This will involve a second set of parens. Let's try it out:

[Search form: Delve into Archive and Library Sites for US states, with a single query box]

Yahoo handles this search with aplomb. Actually I like this form; I get some nice results with it.
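Stacking a second parenthesized group onto the query is just more string assembly. The sketch below uses a hypothetical combine helper that refuses any group missing its parens, since the parens are what keep each set of ORs together. The site group is again a two-state stand-in.

```python
def combine(user_query, *groups):
    """Join a query with any number of parenthesized OR groups."""
    for group in groups:
        if not (group.startswith("(") and group.endswith(")")):
            raise ValueError(f"group is not parenthesized: {group!r}")
    return " ".join([user_query, *groups])


library_group = "(inurl:lib OR inurl:library OR inurl:archive)"
site_group = "(site:nc.us OR site:va.us)"  # stand-in for the 51-item group
print(combine("genealogy", library_group, site_group))
```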

Okay, what have I learned doing these experiments? I learned you can seriously load Yahoo with site: listings without overloading it.

The kind of searching I've done here is fairly crude; I'm creating large groups to search while taking advantage of some pre-existing structure norms like co and ci. Where there aren't norms I can fake it by using OR (lib, library, or archive). However, with a little bit of effort you could use Yahoo's non-limit to do some fun stuff.

How about a custom search form that searches the world broken out by continent? How about a search of the top 50 cities by population in the US? How about a search of all the NFL franchise Web sites? How about a search of the top 50 print periodicals? How about a search that uses the inurl: syntax to isolate Movable Type weblog content at all the universities in California? (perhaps inurl:mt and inurl:cgi-bin as your search modifier?)

Obviously there's not as much power here as you'd get from, say, an API. But the lack of a low query limit makes some interesting custom form results possible.

Posted in the following categories: Search Engines-Yahoo