Robots and search engines

This is a follow-up question to the thread about which pages get hit by a web search.

I think I now understand that a robot typically scours the web for pages and compiles a ‘list’. This ‘list’ is then scanned by a search engine whenever a query is submitted. If I am not mistaken, this second step doesn’t take too long, often less than a second.
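To make sure I've got the two-step picture right, here's a toy sketch of how I imagine it (the page contents and URLs are made up, and real engines obviously use far more elaborate data structures and ranking):

```python
# Toy sketch: the robot builds an inverted index ahead of time,
# and the search engine answers a query with a fast lookup.
from collections import defaultdict

# Hypothetical pages the robot has already fetched.
pages = {
    "page1.html": "robots crawl the web",
    "page2.html": "search engines answer queries",
    "page3.html": "robots and search engines",
}

# Step 1 (the robot, done ahead of time): map each word to the
# set of pages that contain it -- the 'list' described above.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

# Step 2 (the search engine, done per query): just a dictionary
# lookup, which is why a query can return in under a second.
def search(word):
    return sorted(index.get(word, set()))

print(search("robots"))
```

The expensive work happens once, up front; each query only touches the prebuilt index.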

My question, then, is how long does it take the robot to tour the web? Especially if the servers it's visiting are on slow networks, wouldn't this take the robot ages? I guess not, given that they're out there. But how long does it take?

Thanks.

No robot has looked at the entire web because it would take a really, really, really, really, really long time. Also, the web is constantly changing. New pages are put up, old ones are taken down or moved. The content on an existing page may change every minute.

Generally a robot starts with a list of several thousand known sites and starts up a process for each site. When it finds a link, it starts a separate process for each link, until a maximum number of concurrent processes (say 500,000) is running. Since all these processes run in parallel, it doesn't matter if some networks are slow; some process somewhere is always fetching and indexing pages.
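The strategy above can be sketched roughly like this. This is a simulation, not a real crawler: the link graph is made up, `fetch` just sleeps instead of making HTTP requests, and `MAX_WORKERS` stands in for the concurrent-process cap (tiny here, not 500,000):

```python
# Sketch of the crawl strategy: start from seed sites, follow
# links, and fetch in parallel up to a fixed worker cap so slow
# servers don't stall the whole crawl.
from concurrent.futures import ThreadPoolExecutor
import time

# Hypothetical link graph standing in for the web.
LINKS = {
    "siteA": ["siteA/page1", "siteB"],
    "siteB": ["siteB/page1", "siteC"],
    "siteC": [],
    "siteA/page1": [],
    "siteB/page1": [],
}

MAX_WORKERS = 4   # the concurrency cap (500,000 in the post)
visited = set()
index = []        # pages "indexed" so far

def fetch(url):
    time.sleep(0.01)           # simulate a slow network
    return LINKS.get(url, [])  # return the links found on the page

def crawl(seeds):
    frontier = list(seeds)
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        while frontier:
            batch = [u for u in frontier if u not in visited]
            visited.update(batch)
            frontier = []
            # Fetch the whole batch in parallel: while one slow
            # server stalls, other workers keep indexing pages.
            for url, links in zip(batch, pool.map(fetch, batch)):
                index.append(url)
                frontier.extend(links)

crawl(["siteA"])
print(sorted(index))
```

Because the workers overlap their waiting, total crawl time is dominated by how many pages there are, not by any one slow server; that's why the answer to "how long" is measured in days or weeks of continuous crawling rather than a serial sum of every site's response time.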