Alltheweb.com vs Google

It appears that alltheweb is catching up to Google in the number of webpages it searches. Is the web still growing at the same rate each year? Is the “deep web” growing at a greater rate than the “visible” web?

Let’s try a little comparison of hit counts:



Search term   cruithne  polyphenol  coacervate  angiotensin  noetherian  "ray davies"
Google           7,720      28,700       1,590      420,000      17,100        34,100
Alltheweb        7,776      52,668       1,219      180,353      10,690        80,357


There are quite a few pages that one engine hits and the other misses. So both search sites are still missing big hunks of the visible web.
These folks ( http://www.completeplanet.com/index.asp ) claim in their FAQs that “The deep Web is the fastest growing category of new information on the Internet. All signs point to the deep Web as the dominant paradigm for the next-generation Internet.”
It might even be true.

Raw numbers aren’t as impressive as a coherent ranking system based on relevance beyond mere keyword counts. After all, even a small search will return more than a thousand hits, too many for someone to manually search through in a few minutes. I think Google, with its PageRank system, is still the winner here, despite the tactics that can be used to confound it.

So while the OP’s question points to alltheweb as possibly the best by one measure, I think the OP asked the wrong question.
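For anyone wondering what "ranking beyond keyword counts" looks like in practice, here is a minimal sketch of the textbook power-iteration form of PageRank. It is not Google's actual implementation; the toy link graph, the `pagerank` function name, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over a dict of page -> outgoing links."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by three other pages.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
toy_ranks = pagerank(toy_links)
for page, score in sorted(toy_ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.3f}")
```

The point is that a page's score comes from who links to it, not from how often a keyword appears on it, which is why raw hit counts say little about which results show up first.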

Absolutely, but the hit counts give a relative measure of the size and emphasis of the two databases. Over the terms I tested, there’s no huge disparity in total hits, but there does seem to be a difference in emphasis.
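To put rough numbers on "total hits" versus "emphasis", here is a small sketch that sums the hit counts quoted in the table above and prints the Alltheweb-to-Google ratio for each term. The counts are copied from the table; the variable names are just illustrative.

```python
# Hit counts copied from the comparison table earlier in the thread.
terms = ["cruithne", "polyphenol", "coacervate",
         "angiotensin", "noetherian", '"ray davies"']
google = [7720, 28700, 1590, 420000, 17100, 34100]
alltheweb = [7776, 52668, 1219, 180353, 10690, 80357]

total_google = sum(google)
total_alltheweb = sum(alltheweb)

# Ratio > 1 means Alltheweb reported more hits for that term than Google did.
ratios = {t: a / g for t, g, a in zip(terms, google, alltheweb)}

print(f"Google total:    {total_google:,}")
print(f"Alltheweb total: {total_alltheweb:,}")
for term, ratio in ratios.items():
    print(f"{term:>14}: {ratio:.2f}")
```

With only six terms, one large count (angiotensin) dominates the totals, so the per-term ratios are probably the more telling figure.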

Squink, I have never heard about the “deep web” before; what's that about?

The term “deep web” usually refers to the vast quantities of data effectively hidden from most search engines: data held in databases, in unusual document formats, and so on. There’s a good overview at this site.

The winner is Google!

Search term: (“polyphenol” and “cruithne”) or (“coacervate” and “noetherian”) or (“angiotensin” and “ray davies”)

Alltheweb: 6
Google: huh? what’s boolean?
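For the curious, a boolean query like the one above boils down to set intersections (and) and unions (or) over the documents that match each term. A minimal sketch, assuming a made-up five-document collection and a hypothetical `matching` helper that does a plain substring check; neither engine's real query syntax or index is shown here.

```python
# Made-up documents, purely for illustration.
docs = {
    1: "polyphenol content of green tea",
    2: "cruithne and polyphenol trivia",
    3: "coacervate droplets in solution",
    4: "noetherian rings and coacervate models",
    5: "ray davies talks about angiotensin",
}


def matching(term):
    """Return the ids of documents whose text contains the term or phrase."""
    return {doc_id for doc_id, text in docs.items() if term in text.lower()}


# (A and B) or (C and D) or (E and F): '&' is set intersection, '|' is union.
hits = (
    (matching("polyphenol") & matching("cruithne"))
    | (matching("coacervate") & matching("noetherian"))
    | (matching("angiotensin") & matching("ray davies"))
)
print(sorted(hits))
```

Document 1 matches "polyphenol" but not "cruithne", so the first clause drops it; that is the whole difference between a boolean and and a plain keyword match.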