Another "Why does Google do this?" question

How do I see the last hit of the search results?
We’ve all seen the huge number listed in the upper right on each page of results for pretty much any search. Just before I posted this, I typed in lost to see how many hits it found: “Results 1 - 30 of about 650,000,000 for lost [definition]. (0.23 seconds)”.
When I tried to see the results past 1000, I got this message: "Sorry, Google does not serve more than 1000 results for any query. (You asked for results starting from 1000.)"
I’m having another WTF? moment.

Educated WAG, if such a thing is possible: Google doesn’t really pay much attention to the actual number of results for a search if that number is way more than any normal person would actually look at.

No real human would want 1000 results for any given query. But a human might be interested in knowing how many total hits there are, without reading all of them. So the Goog gives the useful information, but doesn’t bother with the useless information.

Thanks for calling me abnormal and unhuman, I appreciate it.
I was just curious about what those hits waaaaaaaaaay out in left field might be.
Thanks again.

[Moderating]

Thin Ice, no one called you either abnormal or unhuman. There’s no reason for you to be taking offense where none was meant.

Colibri
General Questions Moderator

Somewhat informed speculation (I’m a database professional): It would require inordinate storage and CPU overhead to actually compute and store the full ordered list of URLs for every single search term. They don’t feel it necessary to satisfy what amounts to idle curiosity about the nitty-gritty fine details of Google’s search algorithm.
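To illustrate the overhead point above: picking only the best few results is much cheaper than producing a full ordered list. What follows is a minimal sketch, not a description of anything Google actually runs. The scores, the URLs, and the `top_k` helper are all made up for illustration. Selecting the top k of n items with a heap costs roughly O(n log k), whereas fully sorting them costs O(n log n) and needs the whole ordered list kept around.

```python
import heapq

def top_k(scored_docs, k):
    """Return the k highest-scoring (score, url) pairs, best first.

    Hypothetical helper: heapq.nlargest keeps only k candidates in a
    heap while scanning, instead of sorting every document.
    """
    return heapq.nlargest(k, scored_docs)

# Made-up relevance scores, for illustration only.
docs = [
    (0.12, "a.example"),
    (0.98, "b.example"),
    (0.45, "c.example"),
    (0.77, "d.example"),
]

best = top_k(docs, 2)
print(best)
```

With 650,000,000 matches and a cap of 1000 served results, the gap between "keep the best 1000" and "order everything" is large.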

*** Ponder

They also don’t feel the need to give someone* with their own software (including automated search bots) enough information to reverse-engineer their search algorithm. PageRank is proprietary and is one of the most important secret algorithms in the world. Its likely commercial value makes the value of the Coca-Cola formula pale in comparison: Being able to make a somewhat more faithful Coke knockoff is nothing, but being able to scam Google for a good while is a guaranteed license to print money. ‘SEO’ scammers make good money with far less insight into how PageRank works.

*(Who likely would be abnormal, and might be considered inhuman by certain radical Luddite sects.)

Well, you spoiled my brilliant post with that last… *Grumble…* :stuck_out_tongue:

I’m inhumanly fast at making abnormal jokes, I suppose. :smiley:

Sorry I bothered you, Colibri. I guess I should have put a smiley in there to indicate I wasn’t offended; it was a joke. (My own family doesn’t get most of my jokes, either, but I keep trying.)
It was just idle curiosity on my part, but it sounds like Google is bragging to me.
Thanks everyone.

Agreed.

I’ve tried driving deep into Google results, sometimes with small hit numbers that make it possible to get to the end.

What I’ve found is that long before the number originally given is reached, Google stops individual listings and gives the “we’ve eliminated duplicate hits for your convenience” notice.

The original number comes from some counter in their database but probably doesn’t have a one-to-one correspondence with the real world. Just as the "found in 0.28 seconds" doesn’t mean much while you sit there and wait for the first page to show. Both may be technically true in some sense, but both are also puffery for the wonderfulness of Google.

We actually know some details about how PageRank works. Google’s big secrets are the specifics of the algorithm, and how exactly they manage to search the entire web in fractions of a second.
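For the curious, the part that is publicly known is the basic formulation from the original PageRank paper: a page's rank is the chance that a "random surfer" ends up on it, following links most of the time and jumping to a random page otherwise. Here is a rough sketch of that published idea using simple repeated iteration. The tiny link graph, the function name, and the iteration count are assumptions for illustration, and whatever Google runs in production is not public and is surely far more involved.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Basic PageRank by repeated iteration.

    links maps each page to the list of pages it links to.
    Assumption: every link target also appears as a key in links.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        # Every page gets a baseline share from random jumps.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Made-up three-page web: a links to b and c, b links to c, c links to a.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(toy_web)
print(ranks)
```

In this toy graph, page c collects links from both other pages, so it ends up ranked highest, while b, linked only from a, ends up lowest.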

If it makes you feel any better, I understood you.

Thanks, Cagey, I’m glad someone did. Apparently Sapo doesn’t believe me.

Naw, I think Sapo is having an unrelated dig. For what it’s worth I thought you were kidding too.

Regarding the question: if you’re looking for ‘lost’ (and I know it was an example), then you probably want the definition, the translation of the word, the TV show, or some other work of fiction/art related to it. You don’t want every page on the web that uses the word lost in passing (like this thread, once Google’s infinite monkeys get around to noticing it).

So while there are many many thousands of matching hits, only the first lots are likely to be of any use to anyone.

Adding more terms will generally narrow stuff down. If you’re looking for the odd stuff then start excluding the more popular hits (“lost -tv -island -pets -paradise”) and see what happens.
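If you want to play with exclusions without retyping them, the query above can be put together with a few lines of code. This is only a sketch: `build_query` and `search_url` are hypothetical helper names, and it assumes the ordinary `https://www.google.com/search?q=...` address form you see in the browser's address bar.

```python
from urllib.parse import urlencode

def build_query(term, excluded):
    """Join a search term with minus-prefixed words to exclude."""
    return " ".join([term] + ["-" + word for word in excluded])

def search_url(query):
    """Build a search address for the query (assumed URL form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = build_query("lost", ["tv", "island", "pets", "paradise"])
print(query)
print(search_url(query))
```

Adding or dropping words in the exclusion list is then a one-line change.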

SD

Alright, two down. I wasn’t looking for anything, I just used the word lost by itself because I knew it would return a massive number of hits; I don’t have any trouble finding what I’m after thru Google. Thanks, SpaceDog.

That’s always a good idea.

If he was, he would be well advised to keep it out of GQ.

Whenever I’ve seen that, it always gives a link to repeat the search with all the duplicates included.