What percentage of the internet does Google search?

What about Yahoo and the others?

Does it vary according to the search topic?

If I were to type in a random word to Google, what percentage of all the internet’s occurrences of that word would show up on the results?

You can’t pinpoint the exact percentage, but here is an interesting site… it’s not all that up to date, but you get the idea.

We have no idea how big the Internet is. None. We can make guesses, but we cannot actually count every web page, Usenet post, IRC message, et cetera that comprise the Internet, in all its Awful Glory.

Remember, kiddies: The Internet is a lot bigger than just the World Wide Web. It comprises a double-handful of different protocols, all built on the dirt-simple TCP/IP protocols, each of which can shuttle massive amounts of data between machines. The HyperText Transfer Protocol (HTTP) the Web is based on is only one of many. Some of those protocols, like ssh, are encrypted, while others, like NNTP (the NetNews Transfer Protocol used in Usenet), are wide open to the world.

So, we have no clue how much of the Internet Google can see. But I’m sure you can get a few wild guesses.

Well, as I am writing this, Google claims to search 3,083,324,652 web pages. This figure comes from the bottom of the page at google.com.

You can draw several different figures from that and the ones from the link Dolomite21 provided, depending on whose estimates and what definitions you are using. (link fixed, BTW) So your guess is as good as mine. In reality, though, Google only searches a small percentage of the entire internet.

Mind boggling stuff.

Which, as Derleth points out, is a fraction of the Internet and probably a fraction of what Google actually indexes. They have an enormous index of Usenet posts. Do they include that number in their “web page” count? Judging by the number, I’d guess not.

I’d imagine Google archives a sizeable amount of Usenet’s history, seeing as how it inherited DejaNews’s archives all the way back to the 1980s. Add to that impressive store the thousands (at a guess) of new messages every day and pretty soon you begin to wonder how many freaking disk drives they must have…

Sorry. :smiley:

By the way, Google does not archive all of Usenet. It completely ignores the newsgroups dedicated to promulgating binary files, such as images and programs. The text-only Usenet is huge, but it isn’t everything anymore.

A thread a while back discussed the “deep Internet” or “deep web” (google on them). Even if we confine the discussion to HTTP pages (not the entire Internet), there are still a lot of pages that search engines won’t see because they’re dynamic: things like Java programs that generate a price quote on demand, sites you have to log in to before you can see anything, etc.

Interesting. I knew nothing about this subject, and I figured that Google only browsed a small amount of the internet. But I had no idea that the size of the internet was such a mysterious, unanswerable question. The Java sites, Usenet and the log-in sites were good points.
dolomite, your link does not work.

Side question: If we were to try to guess the size of the internet, how would we quantify our guess? In terms of web sites, or pages, or number of characters, etc.?

I’ll search for the deep web when I get a chance.

Thanks.

Fuel, I’d try to count the total number of bytes accessible to the average person. This would include all data available on all publicly accessible machines, including the data transferred on chat networks and Usenet. Remember, again, that the Internet is much, much larger than just the World Wide Web, so counting web pages is only counting a fraction of the Internet’s total size.

Not that we’d have a chance in hell of coming up with a good figure, but that’s how I’d attempt to quantify my results.

Wow; if the ‘facts and figures’ article linked to (type it in, or use mudcrutch’s correction) is correct, and Google’s web page statistics are correct, Google searches less than 0.00000008% of even web pages!
That, or they don’t know how to label a chart properly. . .
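
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It takes the page count quoted from google.com earlier in the thread and the percentage above at face value, and works backwards to the total the chart would have to be claiming. The function names are made up for illustration, and the chart's own figure isn't reproduced here, so treat this as a sanity check, not a measurement.

```python
# Page count quoted earlier in the thread from the bottom of google.com.
GOOGLE_INDEXED_PAGES = 3_083_324_652

# Percentage claimed in the post above, taken at face value (an assumption).
CLAIMED_PERCENT = 0.00000008


def implied_total(indexed: int, percent: float) -> float:
    """Return the total of which `indexed` would be `percent` percent."""
    return indexed / (percent / 100.0)


def percent_of(indexed: int, total: float) -> float:
    """Return `indexed` as a percentage of `total`."""
    return 100.0 * indexed / total


total = implied_total(GOOGLE_INDEXED_PAGES, CLAIMED_PERCENT)
print(f"Implied total: {total:.3e}")
print(f"Round trip:    {percent_of(GOOGLE_INDEXED_PAGES, total):.8f}%")
```

The implied total comes out on the order of 10^18, which is a lot easier to believe as a count of bytes than as a count of web pages. That fits the mislabeled-chart theory.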

Ah… took me a while to figure out just what the heck you were talking about, panama. Nice catch.

Not only that, but many users prevent their posts from being archived by setting the X-No-Archive header to Yes.
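
To make that concrete: X-No-Archive is just a line in the post's headers, so an archiver that wants to honor it only has to look for it. Below is a minimal sketch in Python of what such a check might look like. The function name and the sample post are invented for illustration, and nothing here is meant to describe how Google's archive actually handles it.

```python
from email import message_from_string


def may_archive(raw_post: str) -> bool:
    """Return False if the post asks not to be archived.

    Assumption: only an "X-No-Archive: Yes" header (any letter case)
    counts as an opt-out; everything else is treated as archivable.
    """
    msg = message_from_string(raw_post)
    value = str(msg.get("X-No-Archive", ""))
    return value.strip().lower() != "yes"


# A made-up Usenet-style post used only for this example.
sample_post = (
    "From: someone@example.invalid\n"
    "Newsgroups: misc.test\n"
    "Subject: please forget this\n"
    "X-No-Archive: Yes\n"
    "\n"
    "This post should not be kept.\n"
)

print(may_archive(sample_post))
```

Since the header is purely a request, it only works if whoever runs the archive chooses to respect it.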