Do ISPs cache pages—and is there anything I can do about it?

We just published a Web site for a client on a reputable Internet host (Network Solutions). Timed to coincide with the client’s press conference, we took down/deleted the pre-existing “under construction” pages and uploaded index.htm. The rest of the site had been built over the past few weeks, and lots of testing had taken place.

We have verification from people in New York (us), Scotland and Japan that they are seeing the new page and hence the entire site. However, one group in Geneva is getting mixed results—some are seeing the site, some are still seeing the generic “under construction” notice.

We’ve asked them to clear their local cache. They claim to have done so. This has happened before (different clients, different locales), and it’s cleared itself up within a day, but what a day. Adding a monkey wrench to the works, the people with the issue say they are not always getting the construction notice; sometimes they make it to the page itself. I don’t know if they’re bumbling idiots who never cleared their cache, whether they can get to it from one computer but not another, or whether one computer sometimes does and sometimes doesn’t. I’ll feel your pain in reading this if you’ll feel mine in writing it.

So … per Network Solutions, the problem rests with the visitor’s ISP. Trust in them aside (meh), what conditions would have had to occur for this to take place? Would it be triggered by the visitor viewing any page on the site, or must they have gone to the root (certain sub-pages were made available pre-launch)? If it’s not their cache, is there anything we can tell them that might help? Saying “tough cookies” isn’t good customer service, but I don’t know what else to say.

Or am I completely wrong with the possible problem source?

I’m going to guess that most of the clients having issues are not actually clearing their cache correctly.

BUT, and that’s a big but, it may be a DNS issue. You can ask the users to flush their DNS (How to Flush DNS - Tech-FAQ) but that’s pretty technical.
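For what it’s worth, the flush commands differ by platform, and the exact command names vary by OS version; a rough cheat sheet (the helper function is just an illustration, not a real utility):

```shell
# Hypothetical helper that prints the usual DNS-flush command per platform.
# Exact commands vary by OS version; treat these as starting points.
flush_cmd() {
  case "$1" in
    windows) echo "ipconfig /flushdns" ;;
    macos)   echo "sudo dscacheutil -flushcache" ;;
    linux)   echo "sudo /etc/init.d/nscd restart" ;;  # only if nscd is running
  esac
}

flush_cmd windows   # prints: ipconfig /flushdns
```

You could paste the relevant one-liner into an email rather than walking users through menus.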

Also, ISPs keep a DNS cache of their own. Instead of constantly re-resolving every domain, they refresh their cache periodically (every day or two). Sometimes this produces exactly what you’re seeing, but I don’t think it’s the cause here, because no DNS change was involved, just some file changes.

You could also rename the index file - whatever it was when under construction, say index.html - to index.html and set .htaccess to look for .html files first.
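On Apache, the lookup order for index files is set with DirectoryIndex. A minimal sketch, assuming the host honors .htaccess overrides (most shared hosts do, but check yours):

```shell
# Write a minimal .htaccess that makes Apache serve index.html before index.htm.
# Assumes the host allows DirectoryIndex overrides in .htaccess.
cat > .htaccess <<'EOF'
DirectoryIndex index.html index.htm
EOF

cat .htaccess
```

Because the renamed file has a URL the proxy has never seen, any cached copy of the old one becomes irrelevant.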

There shouldn’t be any DNS cache issues because no DNS information was changed. If these people are not seeing the new page after deleting their local browser cache, then it’s probably because they’re behind a caching HTTP proxy.
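One quick way to spot an intermediary proxy is to look at the response headers: Via, X-Cache, and Age are common fingerprints. A sketch using sample data (in practice you’d capture live headers from the real site; www.example.com is a placeholder):

```shell
# Capture live headers in practice with:
#   curl -sI http://www.example.com/ > headers.txt
# Sample data stands in for a real capture here:
printf 'HTTP/1.1 200 OK\r\nVia: 1.1 some-proxy:3128\r\nAge: 5400\r\n' > headers.txt

# Any hit on these headers suggests a cache/proxy sits between client and server.
grep -i -E '^(via|x-cache|age):' headers.txt
```

A large Age value in particular means some cache is serving a stale copy rather than fetching from your server.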

It could be in the browser settings. Not only do you have to clear the cache from the browser’s memory, you have to clear it from the hard disk as well. Various browsers store things differently.

To get to the bottom of it, you are going to have to try it in several browsers, on several machines.

Thanks. Did you mean to say “whatever it was when under construction, say index.html - to index.htm and set .htaccess to look for .htm files first.”?

ETA: Didn’t see the other replies.

We’ve been told (and we can only believe so much) that their IT department showed up and made sure their cache was emptied.

How much faith should we put in their IT guys? Either lots or none; I have no idea. The first incident was at a very large (and very ominous) test administration company; the second is at a very large international aid agency. Not that that means they’re proficient, but they do have IT people on staff and are not like us (e.g., I’m the IT guy in the house).

Yes, that’s what I meant. Lunch hour means I get none too smart for half an hour.

Is Network Solutions operating a cloud environment?

One other thing you can try doing is ask your remote office to try viewing the page over SSL, as this will bypass any ISP-level caching (even if you don’t have a real SSL certificate, you could set up a self-signed certificate just for testing purposes, assuming you’re using a webserver which makes this easy).
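For a throwaway test certificate, something like this works on most hosts with OpenSSL installed (the CN is a placeholder; substitute the real hostname):

```shell
# Generate a self-signed certificate good for a week of testing.
# /CN=www.example.com is a placeholder; use your actual hostname.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout test.key -out test.crt -days 7 \
  -subj "/CN=www.example.com"

# Inspect what we made
openssl x509 -in test.crt -noout -subject
```

Visitors will get a browser warning about the untrusted certificate, but for a “can you see the new page over https?” test that’s fine.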

Do you have any new pages? Can you send them links to those pages to see if they actually exist?

If not, can you just toss up a new page like testTODAY.html and see if they can access it?

What about adding a query to the end of index.html and see if they can see the new index page that way? Like index.html?t=12345
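The query string only has to be unique, so a timestamp is an easy source. A sketch (the URL is a placeholder for the real site):

```shell
# Build a cache-busting URL: caches treat a new query string as a new resource.
url="http://www.example.com/index.html"    # placeholder for the real site
bust="${url}?t=$(date +%s)"
echo "$bust"
# Fetch headers for it with: curl -sI "$bust"
```

If the bare URL shows the old page but the cache-busted one shows the new page, you’ve proven an intermediary cache is the culprit.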

Make sure that you’re not doing anything strange with the HTTP headers that relate to caching. When you tell the browser that some part of the page can be cached, you have to change the name of that resource if you want to update it; otherwise it will be served from cache even when there is a newer version on the server. I believe that doesn’t apply to the main HTML content, but I’m not sure.
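You can see which cache directives your server actually sends by pulling the headers and filtering for the cache-related ones. A sketch using sample data (the URL is a placeholder; in practice you’d use a live capture):

```shell
# Capture live headers in practice with:
#   curl -sI http://www.example.com/ > resp.txt
# Sample data stands in here:
printf 'HTTP/1.1 200 OK\r\nCache-Control: max-age=86400\r\nExpires: Thu, 01 Jan 2026 00:00:00 GMT\r\nLast-Modified: Wed, 01 Jan 2025 00:00:00 GMT\r\n' > resp.txt

# These headers decide how long any downstream cache may keep the old copy.
grep -i -E '^(cache-control|expires|etag|last-modified|pragma):' resp.txt
```

A long max-age or far-future Expires on the index page would explain a proxy happily serving the “under construction” version for a day.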

And I would guess that it is not the ISP that is caching, although a company might have a local caching proxy.

Years ago it was common for ISPs to cache pages. Now it’s quite rare except in special cases (some mobile networks compress and cache content, for example).

You mention an IT department, which suggests the users are on an office network. It’s more likely that their IT department runs some kind of proxy that is caching the content.

It’s possible that the configuration on your server is triggering the problem. Use http://redbot.org/ to check your site and see if it shows any warnings about caching headers, timestamps and so on.

As evidence of how far out of our reach they are: no one is picking up the phone or responding to emails :rolleyes:

No, wait, I take that back. It’s 12:30 AM there.

The relationship with this client is a bit distant. Think aid-agency bureaucracy with donor agencies, implementation agencies, freelance researchers, etc. Someone, somewhere sent emails up the chain saying they were getting “under construction” notices after the general launch. After checking, messages went back down telling them to be sure they cleared their cache. Messages came back up saying it didn’t help. We prudently discarded a few choice replies before asking them to double-check. That’s where the IT folks come in. No direct contact, just another upward-bound email saying it is/isn’t working.

The page is basically brochure-ware, nothing fancy nor shmancy on it, so odd headers are unlikely. This office is part of a bureaucratic headquarters, so it’s possible that they have some archaic internal proxy.

Redbot seems like a great tool; it doesn’t look like it’s reporting any problems.

If we run into this again (assuming the morning comes and everyone can see the page fine), it sounds like there is no straightforward, easy answer. If we can empty our cache and see it, if we can throw up a new test page and see it, and if we can get to an SSL page, it sounds as if we can comfortably say it’s nothing on our end and there is no quick solution to their problem. Not that we want to punt if it can be avoided, but I want to be sure I’ve covered the due-diligence bases, which stops short of teaching a random worker how to clear their own cache, etc.

Thanks!

I second the suggestion that it is probably a proxy at the client’s site causing the issues. The user may not even know they are using one (a proxy set via policy or port redirection). All you can do is set your pages to no-cache and hope their proxy respects the tags.


<meta http-equiv="Cache-Control" content="no-cache, no-store, must-revalidate">
<meta http-equiv="Pragma" content="no-cache">
<meta http-equiv="expires" content="0">

Si