Are Google's archives a copyright violation?

The search engine Google archives virtually all of the Web pages it indexes, meaning that even if the Web page is taken down by its owner, a cached copy of the Web page (a “snapshot”, as they put it) will remain available to the public on Google’s servers.

How is this not a violation of copyright?

Technically, it is. But no copyright holder has taken them to court over it, so they continue to do it. If the copyright holder contacts Google, Google probably just takes the page down to avoid the hassle.

http://www.archive.org is in the same boat, and has removed pages when asked.

http://boards.straightdope.com/sdmb/showthread.php?threadid=191895

I visit a web page. The information downloaded goes to my browser’s cache. I can view that page offline later. Am I violating copyright? Of course not. If you put up something for people to download, you can’t complain when they download it.

Likewise, Google could make a case that they are just downloading and caching content like every other visitor. A reasonable position would be that there is an implied consent for a use like Google’s, especially since I believe a web page owner can prevent Google from indexing a page if they want by using HTML tags.
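
For what it's worth, the tag in question is presumably the robots meta tag, where a `noarchive` directive asks search engines not to offer a cached copy. Here is a rough sketch in Python of how a crawler might check a page for that opt-out. The names `RobotsMetaParser` and `allows_cached_copy` are made up for illustration, and this is only an assumption about how such a check could look, not how Google actually does it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() not in ("robots", "googlebot"):
            return
        # The content attribute is a comma-separated list of directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def allows_cached_copy(html):
    """Return False if the page opts out of cached copies via 'noarchive'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(allows_cached_copy(page))
```

A page owner who adds `<meta name="robots" content="noarchive">` to the page head would fail this check, which is the kind of opt-out that supports the implied-consent argument: anyone who objects to the cached copy has a simple way to say so.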

The big difference is that Google is redistributing their cache while mine sits on my hard drive, accessible only to people who use my computer. Of course my ISP’s router, or my employer’s proxy server, is also “redistributing” the web page in some sense.

And no one is really sure what that means. Wait for more legislation followed by 20 years of case law to sort it all out.

If they were, then wouldn’t any archive be a copyright violation? Such as libraries?

A library holds authorized copies of materials.

I can’t let this one go by. Copyright is exactly what it sounds like: the right to copy. It is not a violation of copyright merely to archive materials that one has paid for, which is what libraries do. It would be a violation if a library kept materials that had been copied illegally, but libraries generally don’t do this.

Copyright law is complicated, and depends on precedent as well as on written law. The question of caching has not been tested in court. We can only speculate as to how the courts would rule.

An important difference between a browser cache and the Google cache is that Google makes their cache available to others. It is possible that the courts could rule that local caching is a form of fair use, while making cached copies available to others is not.

Google’s archive consists of authorized copies as well. Google’s spider creates a copy in exactly the same way your browser creates a copy in its cache when you view a page. Putting content on a webserver would seem to be tacit approval for creation of those working copies. If Google’s copies constitute infringement, then so does every hit by a browser and the entire web is just an instrument for infringement (except for those rare sites that post PD content).

As Jeff Lichtman points out, the difference between Google’s archive and your browser cache is that Google is redistributing. This is analogous to a library that makes photocopies of their books so nothing is ever unavailable.

IANAL and I’m not going to speculate on how a court would rule on this, but I wanted to point out that there is nothing illegitimate about Google making a copy in the first place. It’s what they do with it afterward which would bear on whether there is copyright infringement or not.

If you have a trademark, you have to write a letter to people who are using it without permission to protect your right. But if it’s stored in millions of computers & you don’t write all those people, doesn’t that mean you no longer have the copyright?

Trademarks and copyrights are completely different. Trademarks must be defended. Copyrights need not be. That means that every instance of trademark infringement can contribute to a dilution of the mark, which might eventually mean the owner of the trademark is unable to protect it because it has entered common use (e.g. the owners of Band-Aid and Xerox constantly battle against their generic use to avoid losing their trademark rights). With copyrights, infringement has no bearing on continued ownership of the rights.