Posts Tagged ‘search engine’

Google Custom Search Engine Just for Hospital Web Sites

August 27th, 2009

WOW. This must have taken some serious work. Ed Bennett has recently announced a Google Custom Search Engine (Google CSE) that searches over 2800 US hospital Web sites. The search engine’s available at http://ebennett.org/hospitalsearch/.

I did a search for biopsy and while the CSE did not give me a result count, I was able to page through the search results. I stopped after confirming there were at least 450 search results.

The results themselves were what you’d expect of a CSE that’s been handmade: very clean, no junk. There were results for things like “strawberry shortcake”, but those were for recipes. As a matter of fact, that’s an interesting use for this search engine. Search for a health condition and then the word recipes. For example, diabetes recipes or heart disease recipes.

Kudos to Ed Bennett. Those searching for medical information, even peripheral information like healthy living ideas, will find this CSE very useful.

Categories: News

New Search Engine Wants to Give You Just the FAQ

February 10th, 2009

Want a new search engine to try out? Check out SnappyFingers, a Q&A search engine that has a data set of over 13 million questions. It limits its data pool to just FAQs. You can try it at http://www.snappyfingers.com.

And good grief, in this economy don’t we all wish we had some answers to some questions? At the moment I have a headache. I went to SnappyFingers and asked it, “Why do I have a headache?” The first answer was from the SLA Convention Wiki for Denver and describes headaches based on altitude. Other answers came from CPAP machine FAQs, wine companies, coffee companies, and a site at Angelfire which isn’t afraid to use profanity.

The full question and answer is reproduced in the search result, which is handy. However, to get to the actual page from whence the result came you’ll need to click on the “Source” link. The “Source” link shows only the domain. I think it would be better if it showed you the entire URL. You’d get even more context that way.

Figuring that asking about my headache was too general, I asked SnappyFingers “Is Pluto a planet or not?” This got me a much better and more focused set of results than the first question, with a giant answer from NASA as the first result. Beyond Pluto, other questions speculated on a tenth planet, the definition of planet, whether the moon is a planet, and my personal favorite, “What are vegans, and which planet are they from?” So even this question ranged a bit afield.

I got all Tech Support on SnappyFingers and asked it several questions including “Why is my Roomba beeping?”, “How can I boot Windows faster?” and “Where’s the Any Key?” The first question had no answer, the second question had two answers (only the first one was for Windows 98), and the third question had LOTS of answers but the first one answered the question. (I did not know there was a Wikipedia page for this.)

The answers to “What happens after I die?” made me do an iced-tea spit take; sober responses to questions about insurance were interspersed almost exactly equally with snarky responses about death in online games. Oddly, religion-based answers to the question were fairly rare in the group of results I looked at.

That group of results brought it home to me that there needs to be more categorization of the sources here, or failing that some really rudimentary search filters could be instituted — like limiting sources to top level domain (.edu or .uk instead of .com, for example.) Maybe visitors could participate in tagging sources? There should also be a way to keep the focus strictly on the question. When I ask “Should I get a cat or a dog?” and get an answer to “The lady I write to asked me to send her some money. What should I do?” then there is a lack of precision that is not going to make it easy for the searcher.
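The top-level-domain filter suggested above could be sketched roughly like this in Python. This is only an illustration of the idea: the function name filter_by_tld and the sample URLs are made up for the example, and nothing here reflects how SnappyFingers actually works.

```python
from urllib.parse import urlparse


def filter_by_tld(urls, allowed_tlds):
    """Keep only URLs whose hostname ends in one of the allowed TLDs.

    Hypothetical helper for illustration; not part of SnappyFingers.
    """
    # Normalize "edu" and ".edu" to the same ".edu" suffix.
    allowed = tuple("." + tld.lower().lstrip(".") for tld in allowed_tlds)
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(allowed):
            kept.append(url)
    return kept


# Made-up source URLs standing in for search results.
sources = [
    "http://www.example.edu/faq.html",
    "http://www.example.com/faq.html",
    "http://www.example.co.uk/help/faq",
]
print(filter_by_tld(sources, ["edu", "uk"]))
```

Matching on the hostname rather than the whole URL keeps a path like /edu/ on a .com site from slipping through.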

I like the breadth of sources that this search engine covers, and 13 million questions is a good start. But there needs to be a way to get more focused.

Categories: News

Information Trapping and Twitter

January 5th, 2009

Ohai, I’m back.

Yeah, gone for a while. That darned meatspace. All kinds of stuff can happen and the next thing you know you haven’t put anything on your Web site in six months and people are e-mailing you asking if you’ve lost the keyboard.

But one of my 2009 resolutions was to get back here, since I like doing ResearchBuzz, I’m still crazy about search engines, and I missed y’all. There’s probably nobody left (who hangs around an empty RSS feed for months and months?) but if you’re still out there, I did miss you and I’m glad to be back.

I have been doing my Tech Talk thing over at WRAL, so I have been keeping up with my information trapping to a certain extent. But I had not yet delved into Twitter as a way to trap news and information about online search resources. I’ve been playing with it some this evening and wanted to share some conclusions. The Twitter search interface is available at http://search.twitter.com/; I’m Twittering at http://twitter.com/researchbuzz.

The basic Twitter search is simple keyword with the ability to use phrases, exclude words, etc. I tried a couple of sample searches and looked at the RSS feeds, and was struck first by how little overlap there is between Twitter and my more traditional news sources. There’s some, but far less than I expected. The second thing I noticed is that I think I’ll be excluding more words than I include; the ability to quickly post and the apparent 15-item limit for Twitter RSS feeds means you really have to work to clamp down the flow and narrow down the kinds of search results you get.

The first thing I figured out is to always add -RT to your search, so you don’t get piles of retweets. You’ll still get a few, but it gets a lot of noise out of your feed.

The second thing I noticed is that I can take advantage of the patterns of Twittering ‘bots. Searching Twitter for “online library” gets a lot of results from one particular ‘bot, but they’re mostly formatted in the same way so they’re easy to remove.
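If you pull the results into a script of your own, one way to take advantage of that consistent formatting is a pattern match. Here is a minimal Python sketch, assuming (purely for illustration) that the bot starts every post with “New book:”. The pattern, the function name drop_bot_posts, and the sample posts are all made up.

```python
import re

# Hypothetical pattern: assume the bot's posts all start with "New book:".
BOT_PATTERN = re.compile(r"^New book:", re.IGNORECASE)


def drop_bot_posts(tweets, pattern=BOT_PATTERN):
    """Return only the posts that do not match the bot's formatting."""
    return [t for t in tweets if not pattern.search(t)]


# Made-up posts standing in for one page of search results.
sample = [
    "New book: Gardening Basics now in our online library",
    "Anyone know a good online library for old maps?",
    "new book: Knitting 101 now in our online library",
]
print(drop_bot_posts(sample))
```

A real bot’s pattern would have to be worked out by eyeballing a page of its posts first.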

Third is that you can get a good idea of vocabulary even from one page of search results, and Twitter is tolerant of long queries. So if I want to get news about search engines but not necessarily SEO or rankings and placement, I’m going to have very little luck with “search engine”. I will however do much better with this:

“search engine” -rt -marketing -rankings -myths -optimisation -optimization -visibility -placement

… and even one page of those results only goes back about six hours. But do you see what I mean about excluding more words than I include?
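Since the exclusion list tends to grow, it can be easier to keep the words in a list and assemble the query from it. A small Python sketch follows; the function name build_query is my own, and it only builds the search string, it doesn’t talk to Twitter.

```python
def build_query(phrase, excluded):
    """Build a search string: a quoted phrase followed by -word exclusions."""
    parts = ['"{}"'.format(phrase)]
    parts.extend("-" + word for word in excluded)
    return " ".join(parts)


# The exclusion words from the query above.
excluded = ["rt", "marketing", "rankings", "myths", "optimisation",
            "optimization", "visibility", "placement"]
print(build_query("search engine", excluded))
```

Adding another word to shut out is then a one-item change to the list.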

After I’d gotten a feel for what the keyword searches could do I went and took a look at the advanced search options, available at http://search.twitter.com/advanced. The geographic options are cool, though unfortunately not so useful in the kind of stuff I want to search for. On the other hand, the ability to limit Tweets to only those which have links is very nice (see the checkbox down at the bottom.) There is probably a way to take advantage of the emoticon search but I haven’t figured it out yet. You can also limit the Twitters to those which ask questions.

Searching Twitter is completely backwards from searching a full-text search engine, especially with an eye toward getting a usable and constant flow of information. On a regular search engine, you want to use as many search terms as possible to narrow down your results from a vast ocean of data. In Twitter, there’s still a vast ocean of data, but it’s divided into trillions of drops of water. It’s possible (and from what I’m seeing probably even a better idea) to narrow down with what you DON’T want, instead of trying to guess the right set of keywords from no more than 140 characters at a time.

I’ll be doing more experiments. In the meantime you can check out http://search.twitter.com/operators for a list of Twitter operators and special syntax.