How Do Search Engines Function – Web Crawlers & Spiders

Search engines are what ultimately bring your website to a potential customer's notice, so it is worth understanding how they actually work and how they present information to someone who runs a search.

There are two basic kinds of search engines. The first is powered by robots called crawlers or spiders; the second is powered by human submissions compiled into directories. This article focuses on the first kind.

Search engines use spiders to index websites. When you submit your website's pages to a search engine by completing its required submission page, the search engine's spider will index your entire site. A spider is an automated program run by the search engine system. It visits a website, reads the content on the actual pages and the site's meta tags, and follows the links the site connects to. The spider then returns all of that information to a central repository, where the data is indexed. It will also visit each link you have on your website and index those sites as well. Some spiders will only index a given number of pages on your site, so don't try to create a site with 500 pages!
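To make the crawl step concrete, here is a minimal sketch of a crawler written in Python, using only the standard library. It fetches a page, pulls out its meta tags, visible text and links, stores everything in a simple in-memory "repository", and then follows the links it found up to a small page limit. The starting URL, the page limit and the data layout are assumptions made for illustration; a real search engine spider is far more sophisticated.

```python
# A minimal, illustrative crawler sketch using only the Python standard library.
# The starting URL, page limit, and repository layout are assumptions for this
# example, not how any particular search engine is implemented.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects meta tags, visible text, and outgoing links from one page."""

    def __init__(self):
        super().__init__()
        self.meta = {}      # e.g. {"description": "...", "keywords": "..."}
        self.links = []     # href values found on the page
        self.text = []      # fragments of visible text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs and "content" in attrs:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


def crawl(start_url, max_pages=10):
    """Visit pages breadth-first, returning a {url: page data} repository."""
    repository = {}
    queue = [start_url]
    while queue and len(repository) < max_pages:
        url = queue.pop(0)
        if url in repository:
            continue
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", errors="ignore")
        except Exception:
            continue  # skip pages that cannot be fetched
        parser = PageParser()
        parser.feed(html)
        repository[url] = {
            "meta": parser.meta,
            "text": " ".join(parser.text),
            "links": [urljoin(url, href) for href in parser.links],
        }
        queue.extend(repository[url]["links"])  # follow the links the page connects to
    return repository


if __name__ == "__main__":
    pages = crawl("https://example.com", max_pages=5)
    for url, data in pages.items():
        print(url, "->", len(data["links"]), "links found")
```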

The spider returns to indexed sites periodically to check for any information that has changed; how often this happens is determined by the search engine's operators.
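How a spider decides when to come back varies from engine to engine, but one common-sense policy is to revisit pages that change often more frequently. The sketch below assumes a simple rule of thumb, halving or doubling the revisit interval based on a content hash; the intervals and the change check are illustrative, not any engine's actual schedule.

```python
# A minimal sketch of one possible revisit policy: pages that change often are
# re-crawled sooner. The intervals and the change check (a content hash) are
# illustrative assumptions, not any search engine's actual schedule.
import hashlib
from datetime import datetime, timedelta


def content_hash(text: str) -> str:
    """Fingerprint of a page's content, used to detect changes between visits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def next_visit(last_hash: str, new_text: str, current_interval: timedelta):
    """Shorten the revisit interval if the page changed, lengthen it if it did not."""
    new_hash = content_hash(new_text)
    if new_hash != last_hash:
        interval = max(current_interval / 2, timedelta(hours=1))   # changed: come back sooner
    else:
        interval = min(current_interval * 2, timedelta(days=30))   # unchanged: back off
    return new_hash, interval, datetime.now() + interval


if __name__ == "__main__":
    h, interval, when = next_visit("", "hello world", timedelta(days=1))
    print(f"changed content -> revisit in {interval}, at {when:%Y-%m-%d %H:%M}")
```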

The index a spider builds is a bit like a book: it contains a table of contents, the actual content, and the links and references for every website the spider finds during its search, and a spider may index up to a million pages a day.

Examples: Excite, Lycos, AltaVista and Google.

When you ask a search engine to locate information, it actually searches through the index it has created rather than searching the Web itself. Different search engines produce different rankings because each one uses its own algorithm to search its index.
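The sketch below shows why that matters: the index is built once, at crawl time, and a query is answered entirely from that index without fetching a single page. The tiny document set, the tokenizer and the "every query word must match" rule are assumptions made for this example.

```python
# A toy inverted index, illustrating why a query searches the engine's index
# rather than the live Web. The documents and tokenizer are made up for the example.
import re
from collections import defaultdict

documents = {
    "https://example.com/a": "web crawlers index pages for search engines",
    "https://example.com/b": "spiders follow links and read meta tags",
    "https://example.com/c": "search engines rank pages with their own algorithms",
}

# Build the index once, at "crawl time": word -> set of URLs containing it.
index = defaultdict(set)
for url, text in documents.items():
    for word in re.findall(r"[a-z]+", text.lower()):
        index[word].add(url)


def search(query: str) -> set:
    """Answer a query from the prebuilt index; no page is fetched at query time."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    results = index[words[0]].copy()
    for word in words[1:]:
        results &= index[word]          # keep only pages containing every query word
    return results


print(search("search engines"))         # the two pages that contain both query words
```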

One of the things a search engine algorithm scans for is the frequency and location of keywords on a web page, although it can also detect artificial keyword stuffing, known as spamdexing. The algorithms also analyze the way pages link to other pages on the Web. By checking how pages link to each other, an engine can determine both what a page is about and whether the keywords of the linked pages are similar to the keywords on the original page.
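As a rough illustration of scoring by keyword frequency and location, the sketch below counts a keyword found in the page title more heavily than one found in the body, and refuses to credit a page whose keyword density looks like stuffing. The weights and the 10% density threshold are made-up values, not any engine's real algorithm, and the link-analysis part is not shown here.

```python
# A rough sketch of keyword scoring by frequency and location, with a crude
# keyword-stuffing check. The weights and the 10% density threshold are
# illustrative assumptions, not any engine's real algorithm.
import re


def keyword_score(keyword: str, title: str, body: str) -> float:
    """Score a page for one keyword; occurrences in the title count for more."""
    kw = keyword.lower()
    title_words = re.findall(r"[a-z]+", title.lower())
    body_words = re.findall(r"[a-z]+", body.lower())

    title_hits = title_words.count(kw)
    body_hits = body_words.count(kw)

    # Crude spamdexing check: if the keyword makes up too much of the body,
    # treat the page as stuffed and give it no credit at all.
    density = body_hits / max(len(body_words), 1)
    if density > 0.10:
        return 0.0

    return 3.0 * title_hits + 1.0 * body_hits   # location matters: title > body


print(keyword_score(
    "crawler",
    title="What is a web crawler?",
    body="A crawler, also called a spider, visits a site, reads its content "
         "and meta tags, and follows the links it finds.",
))
```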


Martin Redford is contributing editor at WebDesignArticles.net. This article may be reproduced provided that its complete content, links and author byline are kept intact and unchanged. No additional links permitted. Hyperlinks and/or URLs must remain both human clickable and search engine spiderable.