Page:Untangling the Web.pdf/51


DOCID: 4046925

UNCLASSIFIED//FOR OFFICIAL USE ONLY


Search Savvy—Mastering the Art of Search


While directories and virtual libraries contain information selected by people, search engine databases are mostly unfiltered; that is, no human being looks at the data being indexed to determine its value, authenticity, or reliability. Search engines are where the researcher's experience, knowledge, judgment, and intuition really come into play. Because of their vast scope and size, search engines are the heart and soul of Internet search and research. No other resource reaches as far, as wide, or as quickly as a search engine. A researcher must learn to use search engines to their fullest extent despite their limitations.

Individual search engines have some very important advantages over directories, metasearch, and megasearch sites. Foremost among these is that they have much larger databases of indexed sites. However, no single search engine is best. Each has its own advantages and drawbacks. Furthermore, there is a remarkable lack of overlap among search engine databases, so it is vital that you train yourself to use more than one search engine.

Greg Notess ran an interesting little experiment that demonstrated the need to use more than one search engine.[1] He was looking for the real name behind an AOL screen name, a piece of information that is often hard to find. Only one search engine (Yahoo, in his example) found the name, while Google, Live, Gigablast, Ask, and Exalead all failed to locate the information. It could have been any search engine, not just Yahoo, that provided the data, but the point is clear: you must try multiple search engines, especially when looking for obscure or hard-to-find information.
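The comparison behind an experiment like this can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `overlap_summary`, the engine labels, and the result identifiers are all hypothetical, the data is invented, and nothing here reflects how Notess actually ran his test. The sketch simply counts how many engines returned each result.

```python
from collections import Counter


def overlap_summary(results_by_engine):
    """Summarize how much the engines' result lists overlap.

    results_by_engine maps an engine label to the list of results
    (for example, URLs) that the engine returned for one query.
    """
    counts = Counter()
    for urls in results_by_engine.values():
        # Use a set so a result repeated within one engine's list
        # is counted only once for that engine.
        counts.update(set(urls))

    n_engines = len(results_by_engine)
    return {
        "total_distinct": len(counts),
        "unique_to_one_engine": sum(1 for c in counts.values() if c == 1),
        "on_all_engines": sum(1 for c in counts.values() if c == n_engines),
    }


# Hypothetical data, invented purely for illustration.
sample = {
    "engine_a": ["u1", "u2", "u3"],
    "engine_b": ["u1", "u4"],
    "engine_c": ["u1", "u5", "u2"],
}
summary = overlap_summary(sample)
print(summary)
```

When most results fall under `unique_to_one_engine`, as they do in this made-up sample, a researcher who stops after one engine misses most of what is available.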

On a larger scale, the metasearch engine Dogpile touted the results of a 2005 study it conducted in collaboration with researchers from the University of Pittsburgh and Pennsylvania State University, which showed a lack of duplication in the top results of the major search engines.

"When the researchers ran 12,570 different queries through search engines at Yahoo, Google, MSN and Ask Jeeves, they found that only 1.1 percent of the results appeared on all four engines, while 84.9 percent of the top results were

  1. Greg Notess, "Overlap Showdown: Only 1 of 6," Search Engine Showdown, 28 December 2006, <http://www.searchengineshowdown.com/blog/2006/12/overlap_shutdown_only_at_1_of_1.shtml> (16 January 2007).