Right - you've spent some time wandering around the Internet in a pretty aimless fashion, you've got bored with looking at people's home pages, and you've had enough of looking for pop groups or football teams; it's time to start doing some work.
What's the best place to start? There are a large number of different search engines out there on the Internet, and you need to know what they do, how they do it and which is the best for the purpose you have in mind.
There are essentially five types of search engine available to you. Each of these works very differently, and you need to be aware of when it's best to use one, and when it's best to use another.
All search engines have what are called 'robots' or 'spiders', which spend their time going from link to link across the Internet. When they find a new site, or an updated site, they copy some information about the site back to their home database. It is this database which is interrogated when you run a search. People can register their web pages with search engines, which means that the pages usually get listed much more quickly than if they waited for the spiders to come across them.
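As a rough illustration of that process, here is a minimal Python sketch of a spider that follows links, copies words back to a database, and answers searches from that database alone. Everything in it is an assumption made for the example: the toy web, the example.org addresses and the function names `crawl` and `search` are invented, and real search engines are far more elaborate.

```python
from collections import deque

# A made-up, in-memory "web": URL -> (page text, outgoing links).
# These pages and addresses are purely illustrative.
TOY_WEB = {
    "http://example.org/": (
        "Welcome to the home page",
        ["http://example.org/football", "http://example.org/music"],
    ),
    "http://example.org/football": ("Everton football results", ["http://example.org/"]),
    "http://example.org/music": ("Pop groups and music news", []),
    "http://example.org/unlinked": ("A page nobody links to about football", []),
}


def crawl(seeds, web):
    """Follow links breadth-first from the seed URLs and build the database.

    Returns a dict mapping each lower-cased word to the set of URLs containing it.
    """
    database = {}
    queue = deque(seeds)
    seen = set(seeds)
    while queue:
        url = queue.popleft()
        if url not in web:  # dead link: nothing to copy back
            continue
        text, links = web[url]
        for word in text.lower().split():
            database.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return database


def search(database, word):
    """Interrogate the database (not the live web) for a single word."""
    return sorted(database.get(word.lower(), set()))


# The spider only finds pages it can reach by following links.
db = crawl(["http://example.org/"], TOY_WEB)
print(search(db, "football"))

# Registering a page by hand adds it as an extra starting point.
db_registered = crawl(["http://example.org/", "http://example.org/unlinked"], TOY_WEB)
print(search(db_registered, "football"))
```

In this sketch the unlinked page only turns up in results once it has been registered as a starting point, which is the effect of submitting a page to a search engine rather than waiting to be found.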
Generally speaking, yes, search engines are free to use. They make their money in one of two ways: either by promoting software or hardware, as Altavista does, or by selling advertising space on their systems, which you'll get to see when you go to the search page. One or two search engines do charge for their services, but personally I see little value in paying money to search the Internet when it can be done for free.
If you are used to online database searching, or using CD-ROM products, you'll find that these engines are an annoying mix of the very skillful and the very primitive. Altavista will, for example, search through a very large database in a matter of a few seconds, and while it is reasonably sophisticated, it does not stand up well in comparison to some of the advanced features offered by CD-ROM publishers.
Don't forget that there is more to searching the Internet than just looking at WWW pages. An increasing number of search engines will allow you to use their facilities to search newsgroups, or perhaps people's email addresses.
The databases which the search engines use may well be out of date; new material may have been added, other material may have been deleted. You will only discover this when you actually click to go to a particular site.
If you don't find what you want from one, try using another.
Use a variety of different engines, appropriate to your needs. If for example you just want information on UK or European resources, it makes more sense to go to a search engine which focuses on that region, rather than use a search engine which is global in approach.
There are a great many search engines, but for a list of those which I find particularly useful, please visit my Search Engines page (URL http://www.venus.co.uk/philb/engines.htm).
| Altavista | Ask Jeeves | Lycos | Webcrawler | Hotbot | Northern Light |
||
| WWW | YES | YES | YES | YES | YES | YES | YES |
| Usenet | YES | YES | NO | NO | NO | NO | NO |
| URL | YES | YES | NO | YES | YES | NO | YES |
| Languages? | 25 | 26 | NO | 25 | NO | 9 | 5 |
| Images? | YES | YES | NO | YES | NO | YES | NO |
| News? | YES | NO | NO | YES | YES | NO | YES |
| Audio/Visual files? | YES | NO | NO | YES | NO | YES | NO |
| .pdf files? | NO | YES | NO | NO | NO | NO | NO |
| Boolean | YES | NO | n/a | YES | YES | YES | YES |
| Proximity | YES | NO | n/a | NO | YES | NO | NO |
| Wildcards | YES | YES | n/a | YES | NO | NO | NO |
| Implied OR | YES | NO | n/a | NO | YES | YES | YES |
| Search fields | YES | YES | n/a | YES | YES | YES | YES |
| Capitalisation? | NO | NO | n/a | NO | NO | NO | NO |
| Altavista | Ask Jeeves | Lycos | WebCrawler | Hotbot | Northern Light |
||
| Truncation | YES | YES | n/a | YES | NO | YES | YES |
| Phrase Search | YES | YES | YES | YES | YES | YES | YES |
| Relevance Rank | YES | YES | n/a | YES | YES | YES | YES |
| Preview Doc. size | NO | YES | NO | NO | NO | YES | NO |
| Date doc updated | NO | NO | NO | NO | NO | YES | YES |
| Summary of doc | YES | YES | YES | YES | YES | YES | YES |
| Refine searches? | YES | NO | YES | NO | NO | YES | YES |
| Group results? | YES | NO | NO | NO | NO | NO | YES |
| Frequency of database update | 1 day - 1 month | 1 day - 1 month | n/a | 2-3 weeks | n/a | n/a | n/a |
| Indexed pages | 423,000,000 | 1,346,966,000 | n/a | n/a | n/a | 160,000,000 | 321,797,381 |
| Portal? | YES | NO | NO | YES | YES | NO | NO |
| Geographic Specific? | YES | YES | YES | YES | NO | YES | YES |
| "Phil Bradley" | 1,266 | 2,800 | n/a | 1,890 | 2,612,931 | 464 | 1,508 |
| Everton | 153,938 | 381,000 | n/a | 205,356 | 9,245 | 1,000+ | 124,799 |
| Internet | 93,292,047 | 80,900,00 | n/a | 56,267,784 | 24,085,435 | 1,000+ | 44,963,506 |
Having compared the different search engines, I thought it might be a bit of fun to rank them. This is very unscientific (I mainly added up the YES entries and took account of updating and the size of the database), but I think it's reasonably accurate and reflects my own experiences. So, for what it's worth, here's the league table, with points assigned to give you an indication of how they match up. It's not really a fair test, and because Ask Jeeves is such a different type of engine it did badly through no fault of its own, but then I never said the test was scientific!
| 27 | |
| Altavista | 26 |
| Lycos | 21 |
| Northern Light | 17 |
| Hotbot | 16 |
| Webcrawler | 10 |
| Ask Jeeves | 5 |
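Counting the YES entries, the first step of the ranking described above, could be sketched roughly like this in Python. It is only an illustration under stated assumptions: it uses just three rows copied from the comparison table, refers to the engines by column position rather than by name, and ignores the adjustments for update frequency and database size, so it will not reproduce the points in the league table. The names `ROWS` and `count_yes` are invented for the example.

```python
# Three feature rows copied from the comparison table, one value per
# engine column, in the order the table gives them.
ROWS = {
    "Boolean":   ["YES", "NO", "n/a", "YES", "YES", "YES", "YES"],
    "Proximity": ["YES", "NO", "n/a", "NO", "YES", "NO", "NO"],
    "Wildcards": ["YES", "YES", "n/a", "YES", "NO", "NO", "NO"],
}


def count_yes(rows):
    """Return one score per column: the number of YES cells in that column."""
    columns = zip(*rows.values())
    return [sum(cell.upper() == "YES" for cell in column) for column in columns]


scores = count_yes(ROWS)
print(scores)  # [3, 1, 0, 2, 2, 1, 1]
```

Entries such as NO and n/a simply score nothing here, which is one reason an engine that works very differently from the others ends up near the bottom.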
If you're interested, you can compare the current ranking with the previous version of this article written a few years ago.
How does this work?
Search engines take a number of different things into account when assessing the relevance of a particular web page to your search criteria. These are as follows:
Now that you've had a chance to look at some single search engines, you might want to have a look at some multi-search engines by going to http://www.philb.com/msengine.htm