With newspapers, magazines, and printed books being supplanted by the Internet, I find myself wondering whether the maxim “the Internet is forever” is really true (or true often enough, for the right things).
We already have problems at the personal computing level: a lack of backups destroys personal data that may be historically interesting (if probably not that important) and that in the past would have been preserved, even if only a small percentage of it, in the form of diaries, letters and photos. (I especially think about all the threads about old photos that really give an idea of daily life in past eras.) We already have problems on the larger scale, too, with government agencies finding old file formats unreadable.
What happens when we give up most or all printed information to the Internet? What happens when big media companies decide they don’t want to spend the space and money to archive news websites? What happens when publishers exercise their power to alter or delete e-books at their complete whim (how often will their customers have the wherewithal or desire to defeat the DRM and make backups — not sure if the lack of used book market is relevant here, but I’ll toss it in anyway)? And, of course, there are the wikis, which have their unique challenges in this regard.
As someone who is thoroughly grounded in old media, my first reaction is “How is that different from what has always been?”
Despite the fact that it was the only Super Bowl to be broadcast on two networks, there is no complete recording of Super Bowl I. There are documented works from Shakespeare, Thomas Hardy, and Ernest Hemingway, among many others, that are lost. A fire completely or partially destroyed 22 million Army and Air Force veterans’ records. A fire destroyed nearly all the records of the Census of 1890. 40,000 negatives of the Kennedy family by photographer Jacques Lowe were destroyed in the World Trade Center attack.
UNESCO published a list of major libraries and large collections that had been destroyed in the 20th Century, either by nature or man. The list runs more than 10 pages. Frankly, we may have a better chance of preserving knowledge with the Internet.
Well, for one, printed material gives more people and institutions a chance to archive.
For another, there’s little chance a publishing house could immediately change the wording of hundreds of thousands of books, or go into anyone’s home and take away their physical copies…
Digital storage makes it far easier to provide redundancy to your archives. Even without compression, how many novels could you fit on a modern hard drive? A couple of million, on something that can be carried in one hand?
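A quick back-of-the-envelope check of that guess, as a Python sketch. The figures are assumptions, not measurements: a novel of about 100,000 words, about 6 bytes per word of plain uncompressed text, and a 1 TB drive (10^12 bytes). The function name is made up for illustration.

```python
# All three figures below are assumptions chosen for a rough estimate.
WORDS_PER_NOVEL = 100_000   # assumed length of a typical novel
BYTES_PER_WORD = 6          # assumed: about 5 letters plus a space, plain text
DRIVE_BYTES = 10**12        # assumed: a 1 TB drive


def novels_per_drive(drive_bytes=DRIVE_BYTES,
                     words=WORDS_PER_NOVEL,
                     bytes_per_word=BYTES_PER_WORD):
    """Return how many whole uncompressed novels fit on the drive."""
    novel_bytes = words * bytes_per_word
    return drive_bytes // novel_bytes


print(novels_per_drive())  # prints 1666666
```

Under these assumptions the answer is roughly 1.7 million novels on a single 1 TB drive, so "a couple of million" is in the right ballpark, and larger drives or compression would only push the number up.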
In at least a few specific instances, information on the WWW doesn’t even last as long as a recently-printed book.
There is a trend, nowadays, for printed books to have URL links in the references section. Instead of just saying, “Read XYZ’s ‘Basics of IJK’” it will have an internet link.
And there is a parallel trend for some of those links to be inoperable, even as little as three years after the publication date of the book!
So, in my opinion, using internet links as a form of footnoting in printed books isn’t really dependable. Will it get better?
(Also, there’s caching, but I’m not really sure how to use it. If a web site has gone dead, what’s the process for trying to find an older cached version?)
One way is to use the Internet Archive Wayback Machine. I actually found use for this during the federal blackout last year when I needed information from the USGS Mineral Resource Program.
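For anyone wondering what that looks like in practice, here is a rough Python sketch of how Wayback Machine addresses are commonly put together. It assumes the `https://web.archive.org/web/<timestamp>/<original URL>` address pattern and the `https://archive.org/wayback/available` lookup endpoint. The function names and the example address are made up for illustration, and nothing here contacts the network: it only builds the addresses you would paste into a browser.

```python
from urllib.parse import urlencode

# Assumed address patterns for the Internet Archive Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def snapshot_url(original_url, timestamp):
    """Build the address of an archived copy of original_url.

    timestamp is a string of digits such as YYYYMMDD; the archive is
    expected to pick the capture closest to that date.
    """
    if not timestamp.isdigit():
        raise ValueError("timestamp must contain digits only, e.g. 20130101")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def availability_query(original_url, timestamp=None):
    """Build a lookup address asking whether any capture of the page exists."""
    params = {"url": original_url}
    if timestamp is not None:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


# Hypothetical dead page, looked up as it stood around 1 January 2013.
print(snapshot_url("http://example.com/page", "20130101"))
print(availability_query("example.com/page", "20130101"))
```

The first address should take you to the nearest saved copy if one exists. The second is meant for checking whether the archive has anything at all before you go digging.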
The internet cannot have a monopoly on information. I would assume libraries would still exist. As long as there are enough different places where the stuff is stored then ‘big corporations’ (excluding perhaps the Church of Scientology :p) will never have the power the OP alludes to. Archiving can be, and often is, non-profit or government-run anyway.
I think you’re conflating 3 different problems, only one of which is unique to computers.
1. Problems with ancient formats/encoding. This isn’t unique to computers (e.g. languages are forgotten), but computers present more opportunities for it to happen. However, adhering to open standards should make recovery of old data stored in obsolete formats much easier.
2. Entropy. All media decay. The most durable one seems to be stone, but it’s a bitch to back up.
Computers actually provide easy methods to combat both of these, methods that are generally superior to those available to old media. But you actually have to use them, and none of them is perfect. Yesterday I saw a server’s hard drive with an installed life span of less than a day. Even if it had backups scheduled, it wouldn’t have lived long enough to reach its scheduled backup time.
3. DRM. This is particularly pernicious, and entirely limited to the world of computers. In particular, the worst devices are ones that are able to have their contents changed at the whim of the issuer. Now, any device that can be made can be hacked into, but it really is a large barrier to archiving what you’ve got a licensed right to. It’s because of this sort of thing that I don’t normally use e-readers, and buy my records on album.
Nothing is forever. That said, it’s hard to imagine humanity reverting technologically so much that something like the internet becomes completely defunct. Oh, it will happen someday, because someday there won’t even be any more humans, but as long as we have a technological society I think we’ll have something like the internet.
Memory and storage have gotten cheaper, and companies have gotten more aware of things like mirrored sites, disaster recovery, continuity of business, etc. When I started in my current job they didn’t have a backup system at all (the sys admin at the time felt that snap mirror on the SAN was sufficient as a ‘backup system’ :rolleyes:), nor did we have a disaster recovery site or anything like fault tolerance or redundancy in our system. There were a host of single points of failure. Today, that’s all changed and we are constantly expanding our systems to provide continuity of operations and backups of not only our data but our systems. And we are storing and archiving data longer and more securely.
Does this mean that all data everywhere will be kept forever? Nope. But I think that, barring a complete collapse of our technological society, more stuff will be kept longer and more securely than at any time in the past. The fact that data can and does move basically worldwide, and that companies actually use sites in different countries or geographically dispersed locations in the same country to store their data and spin up their disaster recovery sites, means there is a good chance that a lot will be saved even in the event of a disaster.
One thing you might not be aware of is that just because a URL no longer works doesn’t mean that a site is actually gone forever. Same goes for an eBook, article or whatever. A lot of times the links are broken because people didn’t continue to support the DNS links or maintain the domains, but the data is still there. I’ve run across some old stuff crawling through the web that you need to use fairly esoteric ways to get to. Also, just like with regular books, because a company might not have a copy anymore doesn’t mean they are all gone…with potential worldwide distribution, odds are that someone has it somewhere on earth, and probably a few folks do. Doesn’t mean it will survive through the ages, but it might if it’s got even a small following. I actually got an audio book that wasn’t available from the publisher anymore through a BT site because there were some guys who had read it and pirated the audio, and it was available on one of their archive mirror sites.
I think the idea that “the Internet is forever” is really about stuff you put out there - as in, whatever you post online will remain there for pretty much your whole life. I got a tough lesson in this when I posted a copy of my (post-college) resume on my GeoCities site so I could access it from job interviews and such if I needed to. One day I Googled myself and realized that even though GeoCities had shut down, OoCities had archived my 10-year-old resume and it came up on the first page of Google hits - including my (old) address and current telephone number.
The recent Snapchat affair is instructive here. Apparently, they tried to offer a service to provide delivery of messages and files, with a guarantee that the material would disappear again once delivered.
(Obviously useful in delivering legal material which is very sensitive…and for illegal material.)
But the technology didn’t work as advertised; there were simply too many ways for people to retrieve the “deleted” material.
(This is just from a radio news headline story I heard today; if I’m getting it all wrong, I apologize.)
I’m sure that someone in the past said something similar about blimps or the printing press. Don’t expect the internet as we know it today (with its protocols, addressing conventions, etc.) to be around forever. A better mousetrap will be created by someone.
Would it all survive worldwide EMP attacks? I’ve also wondered about that. Almost everything is digital these days; it would be an inconceivable blow if everything got wiped.