How does Google cache work?

(FYI: I did search for an answer on google before deciding to ask this question)

From time to time I am doing a Google search on some issue of extreme importance, and the site is down for the only search result that I am interested in. In these frightful moments my sanity would be in doubt, were it not for that delightful cached button Google has so generously provided. Now I understand the basic idea; I’m quite aware of what caching is. But the problem is that Google often spends a disconcerting amount of time “transferring data from [website X]”. This is annoying in and of itself, given that sometimes I hit the ‘cached’ button solely because I hope Google hasn’t cached all of the ad/pic/flash content I am trying to avoid. I assume it will load faster, and I just want the text. But what is even worse is that sometimes, especially when I am most in need of the information I am trying to get at, Google hangs while “transferring data from [website X]”. It is as though Google can’t load the cached version of the page because it can’t retrieve information from the very site that was supposed to be cached. If this is the case, what is the point? Why doesn’t Google just cache the data on its servers?

I do not know why getting stuff from Google’s cache can often be so slow (yes, I’ve experienced it), but there are several other caching sites that can also be used if Google does not have what you want (or does not work). I use the Firefox extension Resurrect Pages, which provides access to seven different cache sites:
[ul]
[li]CoralCDN[/li]
[li]Google Cache[/li]
[li]Yahoo! Cache[/li]
[li]The Internet Archive[/li]
[li]MSN Cache[/li]
[li]Gigablast[/li]
[li]WebCite[/li]
[/ul]

It can also be quite useful to use the Internet Archive Wayback Machine directly, just by typing in the URL of the site you want (I don’t know whether any of the other cache sites can be used this way).
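For what it's worth, you can also skip the search box and build the Wayback Machine address yourself. Here is a minimal sketch, assuming the usual `https://web.archive.org/web/<timestamp>/<url>` address layout; the function name `wayback_url` and the example addresses are made up for illustration, and nothing here checks whether a snapshot actually exists.

```python
def wayback_url(page_url, timestamp=""):
    """Build a Wayback Machine address for page_url.

    timestamp is an optional string of digits (YYYYMMDDhhmmss, or just a
    prefix of it such as a year). Assumption: the archive redirects to the
    capture closest to that time, or to the latest one if it is omitted.
    """
    prefix = "https://web.archive.org/web/"
    if timestamp:
        return f"{prefix}{timestamp}/{page_url}"
    return prefix + page_url


# Hypothetical page used only as an example.
print(wayback_url("http://example.com/"))
print(wayback_url("http://example.com/", "2008"))
```

Paste the printed address into the browser and the archive takes it from there.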

Diagnosing backwards from your description, my guess is that they do save a copy on their own servers, but they only keep one version: whatever was there the last time they checked the page. That means there’s some chance you’ll browse to the cached version at the same moment it’s being updated. They might put a lock on it that prevents access until the page has been fully downloaded, so that you don’t see only half a webpage or whatever.

Google caches the HTML only - not the images, flash and so on. They cache that because it’s valuable for search, and because it’s relatively small. They don’t need the other stuff.

It loads slowly because your browser is trying to fetch images and other content from the original server, which is likely to be down or slow (since that’s typically why you’re using the cache).
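To make that concrete, here is a rough sketch of what a browser has to go and fetch after it receives cached HTML. It only uses the Python standard library. The HTML snippet, the host names, and the names `ResourceCollector` and `external_hosts` are all invented for the example, and a real browser looks at more tags than the handful checked here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ResourceCollector(HTMLParser):
    """Collect URLs of sub-resources referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script", "embed", "iframe"):
            url = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower():
            url = attrs.get("href")
        if url:
            # Relative links resolve against the ORIGINAL page address,
            # so they point back at the original server, not at the cache.
            self.resources.append(urljoin(self.base_url, url))


def external_hosts(html, base_url):
    """Return the sorted host names the browser would contact for sub-resources."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return sorted({urlparse(u).netloc for u in collector.resources})


# Hypothetical cached snapshot: the text is in the HTML, everything else is a link.
CACHED_HTML = """
<html><head><link rel="stylesheet" href="/style.css"></head>
<body><p>The text you wanted.</p>
<img src="/images/banner.png">
<script src="http://ads.example.net/ad.js"></script>
</body></html>
"""

print(external_hosts(CACHED_HTML, "http://www.example.com/page.html"))
```

The text is already in the cached HTML, but the stylesheet, image, and script all resolve to the original site (or its ad server). If that site is down, those requests are the ones that hang.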

And that’s why there’s a link at the top that says “Text only Version”, which will do what the OP actually wants.

Interesting. Yes, I see the “text only” option when the cached page loads, but sometimes the cached page never loads, nor does the banner with that option. Maybe one of the other responders is right in guessing that it is a result of Google’s periodic updating of the cached page. You’d think they would have a better solution though…