|
How To Use Web Search Engines

How to get the most from *search engines like *AltaVista, *Infoseek, *Excite, *Webcrawler, *Lycos, HotBot, *Open Text and the *Yahoo Directory. (Starred links are Hyper-Live connections to our Zoom-Inform demo, which provides in-depth information about companies, products and technical terms.)

Page 6--In-Depth Analysis of Popular Search Engines

Spidap Tidbits--Did you know this? When AltaVista tried to register its URL, there was already a company on the Web using altavista.com, forcing AltaVista to use the URL altavista.digital.com. As the popularity of AltaVista grew, the original altavista.com site (which designs web sites) started getting a lot of hits from people looking for the search engine. So they added AltaVista's search form to their page, and started selling ads! Now the real AltaVista is suing!

Alta Vista is a fast, powerful search engine with enough bells and whistles to do an extremely complex search, but first you have to master all its options. If you're serious about Web searching, however, mastering Alta Vista is a wise policy.

Type of search: Keyword
Search options: Simple or Advanced search
Domains searched: Web, Usenet
Search refining: Boolean "AND," "OR" and "NOT," plus the proximal locator "NEAR." Allows wildcards and "backwards" searching (i.e., you can find all the other web sites that link to a given page). You can decide how search terms should be weighted, and where in the document to look for them. Powerful search refining tools, and the more refining you do, the better your results are.
Relevance ranking: Ranks according to how many of your search terms a page contains, where in the document they appear, and how close to one another the search terms are.
Results presented as: First several lines of document. "Detailed" summaries don't appear any more detailed than "standard" ones.
User interface: Reasonably good (nice mountains!), but not very friendly to the casual user.
Advanced query now allows you to further refine your search at the end of each results page.
Help files: Complete, but confusing. Too much is thrown at you at once. More clarity and more explanation of options would be appreciated!
Good points: Fast searches, capitalization and proper nouns recognized, largest database; finds things others don't. Alta Vista searches both the Web and Usenet. It will search on both words and phrases, including names and titles. You can even search to discover how many people have linked their site to yours.
Bad points: Some curious relevancy rankings, especially on Simple search.
Overall Rating: A-

Spidap Tidbits--Did you know this? America Online has just made a deal with Excite, giving AOL a share in the company and making Excite AOL's partner and official search engine.

Excite bills itself as the "intelligent" search engine because of its concept-based indexing. While "intelligent" is an exaggeration (the apparent intelligence comes from the clever use of statistics, not from a sudden advance in artificial intelligence), Excite is one of our favorite search tools.

Type of search: Both concept and keyword

Open Text is now touting its intranet search abilities, since it's obviously been outclassed on the internet.

Type of search: Keyword
Search options: Simple, Power
Domains searched: Web, Usenet
Search refining: Word or phrase only on Simple Search. More flexible on Power Search, including Boolean terms and proximal locators (near, followed by). Uses a full-text index--every word on every page. Recognizes proper nouns and phrases with stop words like "a," "an," and "the."
Relevance ranking: Numerical rankings, derivation unclear.
Results presented as: Short summaries based on the first hundred words in the document.
User interface: Reasonably easy to use, with pull-down forms to help you make your choices. Open Text now includes a Japanese search form, if you want to use that language for your Web searches.
Help files: Despite a re-design, the Open Text site lacks the excellent help files that were previously posted there. Current help files are denser and more technical--intimidating to the casual user.
Good points: Open Text allows you to search on phrases of any length, and to specify whether you want URLs, titles, summaries or entire documents searched. You can view results with your search term highlighted in bold text in the returned documents.
Bad points: Despite having full-text indexing, Open Text doesn't necessarily figure out the relevant parts of the page. The database isn't as up-to-date or as complete as those of most of Open Text's competitors.
Overall Rating: C-

Type of search: Keyword
Search options: Simple, but powerful (see comments below). Infoseek now uses the Ultraseek engine, which really zips along. The site has added an extensive catalogue section for subject-oriented searching. You can also cross-reference your search terms with similar catalogue subject items, and searches come back with subjects automatically appended. You can also search images, which seems to be popular suddenly.
Domains searched: Web, Usenet, Usenet FAQs, Reviews, Topics.
Search refining: Phrases, capitalization, no Boolean operators, but uses + and - instead (similar to AND and NOT).
Relevance ranking: Gives numerical scores based on frequency and comparison to words already in their database.
Results presented as: First 30-100 words of the page
User interface: Good, easy to use, clear. Infoseek is also now allowing free searches of some of its extensive databases (stock quotes, company information, e-mail addresses, various reference works like dictionaries and zip code directories).
Help files: Good, useful.
Good points: Fast, flexible, reliable searching. Good output, which gives the URL, the size of the document and the relevancy score. Allows you to see similar pages (based on topic information about the pages). Full-text indexing, allows capital letters and phrases.
Bad points: We're sure Infoseek has some bad points, but we really can't think of any offhand!
Overall Rating: A-

Type of search: Keyword, but Lycos is gradually becoming less of a search engine, it seems, and more of a Yahoo-like subject index. Has recently had a cool graphical facelift. Proud of its new ability to search on image and sound files.
Search options: Basic or compound
Domains searched: Web, Gopher and FTP sites
Search refining: "Any," "Or," or "All" terms can be matched, in addition to the rather unclear "loose match, fair match, good match, strong match, or close match." (Huh??)
Relevance ranking: Provided on all searches, with a relevancy percentile.
Results presented as: First 100 or so words in simple search; in advanced search you choose a summary, full results or a short version.
User interface: Good--a little confusing on the refined search options.
Help files: Good, informative; graphical help screens are easy to understand.
Good points: Large database. Comprehensive results given--i.e., the date of the document, its size, etc. Lycos indexes the frequency with which documents are linked to by other documents to make sure the most popular web sites are found and indexed before the less popular ones.
Bad points: Not enough options to refine the search. No complex Boolean operators (this is really a severe limitation, especially considering the lengths to which most of Lycos' competitors go on search refinement). Cannot search on phrases or on capitalization.
Overall Rating: B-

Spidap Tidbits--Did You Know This? AOL owns Webcrawler, but AOL's new deal with Excite means that the Webcrawler search engine and directory will be incorporated into Excite.

Type of search: Keyword
Search options: Simple, refined
Domains searched: Web, Usenet
Search refining: Uses either "and" or "any." Webcrawler has added full Boolean search term capability, including AND, OR, AND NOT, ADJ (adjacent) and NEAR.
Relevance ranking: Yes--frequency calculated. Webcrawler computes the total number of times your keywords appear in the document and divides it by the total number of words in the document. Webcrawler returns surprisingly relevant results.
Results presented as: Lists of hyperlinks or summaries, as the user chooses.
User interface: Good--easy and fun to use
Help files: Useful tips and FAQ.
Good points: Easy to use. Popular on the Web because it belongs to AOL and there are a lot of websurfers who sign on from AOL. Publishes usage statistics on its site. Also provides a service by which you can check to see whether a particular URL is in its index, and, if so, when it was last visited by its "spider." There is also some fascinating information about how Webcrawler's search strategy works.
Bad points: Speed seems to be slowing down a little recently. Its previous weakness--no way to refine search--has been eliminated with the addition of Boolean operators.
Overall Rating: B-
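
The frequency calculation described for Webcrawler (keyword occurrences divided by total words in the document) can be illustrated with a short Python sketch. This is only an illustration of that ratio, not Webcrawler's actual code: the function names (tokenize, frequency_score, rank) are hypothetical, and the lowercase, alphanumeric-only word splitting is our own assumption about how "words" are counted.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words (an assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def frequency_score(keywords, document):
    """Number of keyword occurrences divided by total words in the document."""
    words = tokenize(document)
    if not words:
        return 0.0  # an empty document cannot match anything
    wanted = {k.lower() for k in keywords}
    hits = sum(1 for w in words if w in wanted)
    return hits / len(words)


def rank(keywords, documents):
    """Return (score, name) pairs, highest score first.

    `documents` is a hypothetical mapping of page name to page text.
    """
    scored = [(frequency_score(keywords, text), name)
              for name, text in documents.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


pages = {
    "short": "spider web spider",
    "long": "the spider spins a web in the garden every single morning",
}
for score, name in rank(["spider", "web"], pages):
    print(f"{name}: {score:.3f}")
```

Because the count is divided by document length, a short page made up mostly of the keywords outranks a long page that mentions them only in passing.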

Type of search: Keyword
Search options: Simple, Modified, Expert
Domains searched: Web
Search refining: Multiple types, including by phrase, person and Boolean-like choices in pull-down boxes. No proximal operators at present. In Expert searches you can search by date and even by different media types (Java, Javascript, Shockwave, VRML, etc.).
Relevance ranking: Yes. Methods used--search terms in the title are ranked higher than search terms in the text. Frequency also counts, and results in higher rankings when search terms appear frequently in short documents than when they appear frequently in very long documents. (This sounds sensible and useful.)
Results presented as: Relevancy score and URL
User interface: Very cool and lively. Some users have complained about the bright green background, but we kinda like it.
Help files: A FAQ that answers users' questions, but not a lot of serious help files.
Good points: Claims to be fast because of the use of parallel processing, which distributes the load of queries as well as the database over several workstations.
Bad points: Some limitations still on Boolean operators, and the help files still aren't very good.
Overall Rating: B

Although not precisely a search engine site, Yahoo is an important Web resource. It works as a hierarchical subject index, allowing you to drill down from the general to the specific. Yahoo is an attempt to organize and catalogue the Web. Yahoo also has search capabilities. You can search the Yahoo index (note: when you do this you are not searching the entire Web). If your query gets no hits in this manner, Yahoo offers you the option of searching Alta Vista, which does search the entire Web. Yahoo will also automatically feed your query into the other major search engine sites if you so desire. Thus, Yahoo has the capacity to act as a kind of meta-search engine.

Type of search: Keyword
Search options: Simple, Advanced
Domains searched: Yahoo's index, Usenet, E-mail addresses.
Yahoo searches titles, URLs and the brief comments or descriptions of the Web sites Yahoo indexes.
Search refining: Boolean AND and OR. Yahoo is case insensitive.
Relevance ranking: Since Yahoo returns relatively few hits (it will never return more than 100), it's not clear how results are ranked.
Results presented as: Yahoo tells you the category where a hit is found, then gives you a two-line description of the site.
User interface: Excellent, easy-to-use
Help files: Not very complete, but since there aren't a lot of search options, detailed help files are not necessary.
Good points: Easy-to-navigate subject catalogue. If you know what you want to find, Yahoo should be your first stop on the Web.
Bad points: Only a small portion of the Web has actually been catalogued by Yahoo.
Overall rating: A (This rating refers simply to Yahoo's quality as a directory--searches of the entire Web are not possible.)

The Spider's Apprentice was conceived and written by Linda Barlow, who maintains this site for Monash Information Services. Copyright 1996-7. All rights reserved. Updated: 2/12/97 |