Why doesn't Gmail know how many emails I have?

If I run a search in Gmail for emails from 2006, on the first page of results it will say “1-20 of about 80.”

First, “about” 80? Why not an exact number?

If I delete that first page of emails, it will say “1-20 of about 80” still. Why?

If I click on the “older” link, I can scroll through multiple pages of emails until it is finally revealed that I have 282 emails from 2006. Why couldn’t it tell me that up front?

If I click back to go to the first page, the count reverts to “1-20 of about 80.”

All of my searches work like this. I don’t get it.

The Google web search results are the same, and the estimates are sometimes way off (but you can only really see that for searches that return a handful of results*).

AFAICT, the reason the results are estimated is that the mechanism Google uses for searches is based on MapReduce. MapReduce is interesting since any task built on it can be automatically distributed over many machines, but it also has some limitations, one being that it’s much quicker to return a subset of the results than it is to count all of them.
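
To make the "subset is cheaper than a full count" point concrete, here is a toy sketch in Python. It is not how Gmail or Google actually work (I don't know their internals). It just assumes a made-up function, first_page_with_estimate, that stops scanning as soon as the first page is full and guesses the total from how far it got:

```python
from typing import Callable, List, Tuple


def first_page_with_estimate(
    docs: List[str],
    matches: Callable[[str], bool],
    page_size: int = 20,
) -> Tuple[List[str], int, bool]:
    """Return (page, total, is_exact) for a hypothetical search.

    Scanning stops as soon as the page is full, so the total is
    only exact when every document happened to be scanned.
    """
    page: List[str] = []
    scanned = 0
    for doc in docs:
        scanned += 1
        if matches(doc):
            page.append(doc)
            if len(page) == page_size:
                break  # page is full: stop early instead of counting everything

    if scanned == len(docs):
        # Everything was looked at, so the count is exact.
        return page, len(page), True

    # Assumption: matches are spread evenly, so extrapolate from the part scanned.
    estimate = round(len(page) * len(docs) / scanned)
    return page, estimate, False


# 100 mixed messages followed by 100 messages that all match.
mailbox = ["2006" if i % 2 == 0 else "2007" for i in range(100)] + ["2006"] * 100
page, total, exact = first_page_with_estimate(mailbox, lambda d: d == "2006")
print(f"1-{len(page)} of {'' if exact else 'about '}{total}")
```

In this made-up mailbox there are really 150 matching messages, but the function only looked at the first 39 before the page filled up, so its guess comes out low. The only way it could know the real number is to scan everything, which is exactly the slow part.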

Related: as you may have noted, you only get your web search results in pages, and with only 10 links to further pages of results (for a total of 100 results). That is probably because the system uses the results it’s already found to quickly find the “nearby” ones.

I don’t know the ins and outs of Google, but that’s more or less how some other MapReduce-type systems work.

  • ETA: I did some checking just now, and it seems that overly-optimistic estimates are quickly scaled down if you work your way to the last page and then search the same term again.

I knew there had to be a techy reason for it. Thanks!