The Straight Dope

  #1  
Old 04-24-2003, 12:47 AM
Fuel Fuel is offline
Guest
 
Join Date: Jul 2002
What percentage of the internet does Google search?

What about Yahoo and the others?

Does it vary according to the search topic?

If I were to type in a random word to Google, what percentage of all the internet's occurrences of that word would show up on the results?
  #2  
Old 04-24-2003, 01:10 AM
Dolomite21 Dolomite21 is offline
Guest
 
Join Date: Mar 2003
You can't pinpoint the exact percentage, but here is an interesting site... it is not all that up to date, but you get the idea.
  #3  
Old 04-24-2003, 02:09 AM
Derleth Derleth is offline
Guest
 
Join Date: Apr 2000
We have no idea how big the Internet is. None. We can make guesses, but we cannot actually count every web page, Usenet post, IRC message, et cetera that comprise the Internet, in all its Awful Glory.

Remember, kiddies: The Internet is a lot bigger than just the World Wide Web. It comprises a double-handful of different protocols, all built on the dirt-simple TCP/IP protocols, each of which can shuttle around massive amounts of data between machines. The HyperText Transfer Protocol (HTTP) the Web is based on is one of many. Some of those protocols, like ssh, are encrypted, while others, like NNTP (the NetNews Transfer Protocol used in Usenet), are wide open to the world.

So, we have no clue how much of the Internet Google can see. But I'm sure you can get a few wild guesses.
__________________
"Ridicule is the only weapon that can be used against unintelligible propositions. Ideas must be distinct before reason can act upon them."
If you don't stop to analyze the snot spray, you are missing that which is best in life. - Miller
I'm not sure why this is, but I actually find this idea grosser than cannibalism. - Excalibre, after reading one of my surefire million-seller business plans.
  #4  
Old 04-24-2003, 02:45 AM
mudcrutch mudcrutch is offline
Guest
 
Join Date: May 2002
Well, as I am writing this, Google claims to search 3,083,324,652 web pages. This figure comes from the bottom of the page at google.com.

You can draw several different figures from that and the ones from the link Dolomite21 provided, depending on whose estimates and what definitions you are using. (link fixed, BTW) So your guess is as good as mine. In reality, though, Google only searches a small percentage of the entire internet.

Mind boggling stuff.
  #5  
Old 04-24-2003, 07:04 AM
micco micco is offline
Guest
 
Join Date: Apr 2001
Quote:
Originally posted by mudcrutch
Well, as I am writing this, Google claims to search 3,083,324,652 web pages.
Which, as Derleth points out, is a fraction of the Internet and probably a fraction of what Google actually indexes. They have an enormous index of Usenet posts. Do they include that number in their "web page" count? Judging by the number, I'd guess not.
  #6  
Old 04-24-2003, 07:15 AM
Derleth Derleth is offline
Guest
 
Join Date: Apr 2000
I'd imagine Google archives a sizeable amount of Usenet's history, seeing as how it inherited DejaNews's archives all the way back to the 1980s. Add to that impressive store the thousands (at a guess) of new messages every day and pretty soon you begin to wonder how many freaking disk drives they must have...

Sorry.

By the way, Google does not archive all of Usenet. It completely ignores the newsgroups dedicated to promulgating binary files, such as images and programs. The text-only Usenet is huge, but it isn't everything anymore.
  #7  
Old 04-24-2003, 10:02 AM
CookingWithGas CookingWithGas is offline
Charter Member
 
Join Date: Mar 1999
Location: Tysons Corner, VA, USA
Posts: 9,775
A thread a while back discussed the "deep Internet" or "deep web" (google on them). Even if we confine the discussion to HTTP pages (not the entire Internet), there are still a lot of pages that search engines won't see because they're dynamic: things like Java programs that generate a price quote on demand, sites you have to log in to before you can see anything, etc.
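To make the point concrete, here is a rough Python sketch of why a simple link-following crawler misses that kind of content. The page markup, the `/quote` and `/login` paths, and the `LinkAndFormCounter` class are all made up for illustration; this is not how any particular search engine works.

```python
from html.parser import HTMLParser


class LinkAndFormCounter(HTMLParser):
    """Collects plain hyperlinks and counts forms on a page."""

    def __init__(self):
        super().__init__()
        self.links = []   # URLs a link-following crawler could visit
        self.forms = 0    # entry points that need user input first

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "form":
            self.forms += 1


# Hypothetical page: one static link, one quote form, one login form.
SAMPLE_PAGE = """
<html><body>
<a href="/about.html">About us</a>
<form action="/quote" method="post">
<input name="zip"><input type="submit" value="Get a price quote">
</form>
<form action="/login" method="post">
<input name="user"><input name="password" type="password">
</form>
</body></html>
"""


def crawlable_links(page):
    """Return (links a crawler can follow, number of forms it cannot fill in)."""
    parser = LinkAndFormCounter()
    parser.feed(page)
    return parser.links, parser.forms


links, forms = crawlable_links(SAMPLE_PAGE)
print(links, forms)
```

The crawler can queue up the one static link, but whatever sits behind the two forms only exists after someone types something in, so it never lands in the index.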
  #8  
Old 04-24-2003, 11:14 AM
Fuel Fuel is offline
Guest
 
Join Date: Jul 2002
Interesting. I knew nothing about this subject, and I figured that Google only browsed a small amount of the internet. But I had no idea that the size of the internet was such a mysterious, unanswerable question. The Java sites, Usenet and the log-in sites were good points.


dolomite, your link does not work.

Side question: If we were to try and guess the size of the internet, how would we quantify our guess? In terms of web sites or pages or number of characters, etc.?

I'll search for the deep web when I get a chance.

Thanks.
  #9  
Old 04-24-2003, 01:44 PM
Derleth Derleth is offline
Guest
 
Join Date: Apr 2000
Fuel, I'd try to count the total number of bytes accessible to the average person. This would include all data available on all publicly accessible machines, including the data transferred on chat networks and Usenet. Remember, again, that the Internet is much, much larger than just the World Wide Web, so counting web pages is only counting a fraction of the Internet's total size.

Not that we'd have a chance in hell of coming up with a good figure, but that's how I'd attempt to quantify my results.
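For what it's worth, here is a tiny Python sketch of how the choice of unit changes the answer. The two-document "corpus" and the `corpus_size` helper are invented for illustration, and UTF-8 is an assumed encoding; the only point is that pages, characters, and bytes give three different numbers for the same data.

```python
def corpus_size(documents):
    """Measure the same collection of texts in three different units."""
    pages = len(documents)
    characters = sum(len(text) for text in documents)
    # Bytes depend on the encoding; UTF-8 is assumed here.
    size_bytes = sum(len(text.encode("utf-8")) for text in documents)
    return pages, characters, size_bytes


# Hypothetical corpus: the second document has two accented characters,
# each of which takes two bytes in UTF-8.
SAMPLE_DOCUMENTS = ["hello", "na\u00efve caf\u00e9"]

pages, characters, size_bytes = corpus_size(SAMPLE_DOCUMENTS)
print(pages, characters, size_bytes)
```

Bytes are the only one of the three that also works for things that aren't pages at all, like chat traffic or Usenet posts.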
  #10  
Old 04-24-2003, 02:18 PM
panamajack panamajack is offline
Guest
 
Join Date: Apr 2000
Wow; if the 'facts and figures' article linked to (type it in, or use mudcrutch's correction) is correct, and Google's web page statistics are correct, Google searches less than 0.00000008% of even web pages!


That, or they don't know how to label a chart properly. . .
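The arithmetic behind that figure can be sketched in a few lines of Python. The page count is the one mudcrutch quoted from google.com and the percentage is the one above; the function names are just illustrative helpers.

```python
GOOGLE_PAGES = 3_083_324_652  # page count quoted earlier in the thread


def coverage_percent(indexed, total):
    """Share of the total that is indexed, as a percentage."""
    return 100.0 * indexed / total


def implied_total(indexed, percent):
    """Total size implied if `indexed` items make up `percent` percent of it."""
    return indexed / (percent / 100.0)


# If about 3.08 billion pages were only 0.00000008% of the web, the web
# would have to contain this many pages:
total_pages = implied_total(GOOGLE_PAGES, 0.00000008)
print(f"{total_pages:.3e}")
```

That works out to something on the order of 10^18 pages, which is why a mislabeled chart looks like the likelier explanation.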
  #11  
Old 04-24-2003, 03:44 PM
pulykamell pulykamell is online now
Charter Member
 
Join Date: May 2000
Location: SW Side, Chicago
Posts: 30,967
Ah... took me a while to figure out just what the heck you were talking about, panama. Nice catch.
  #12  
Old 04-24-2003, 03:56 PM
neutron star neutron star is offline
Guest
 
Join Date: Feb 2000
Quote:
Originally posted by Derleth

By the way, Google does not archive all of Usenet. It completely ignores the newsgroups dedicated to promulgating binary files, such as images and programs. The text-only Usenet is huge, but it isn't everything anymore.

Not only that, but many users prevent their posts from being archived by setting the X-No-Archive header to Yes.
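As a rough illustration, an archiver could check for that header along these lines. This is a minimal Python sketch using the standard library's `email` parser; the sample article and the `archiving_opted_out` helper are made up, and it says nothing about how Google's archive actually handles the header.

```python
from email import message_from_string


def archiving_opted_out(raw_article):
    """Return True if the article carries an 'X-No-Archive: Yes' header."""
    message = message_from_string(raw_article)
    # Header names are matched case-insensitively by the email package.
    value = message.get("X-No-Archive", "")
    return value.strip().lower() == "yes"


# Hypothetical Usenet article with the opt-out header set.
SAMPLE_ARTICLE = (
    "From: someone@example.invalid\n"
    "Newsgroups: misc.test\n"
    "Subject: please do not archive\n"
    "X-No-Archive: Yes\n"
    "\n"
    "Body text.\n"
)

print(archiving_opted_out(SAMPLE_ARTICLE))
```

An archive that honors the header would simply skip any article for which this returns True.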

Copyright 2013 Sun-Times Media, LLC.