You, being a friend of mankind, want to make sure the literary works of the past and present are still available to humans in 10,000 years, so you decide to assemble a huge digital library and put it in a time capsule to be opened in 10,000 years.
You obviously need to use some sort of digital media to make the project feasible with respect to containment size. In putting together this enormous digital library, what kind of digital media do you use that will still be readable and accessible in 10,000 years? What kind of media and its readers could work reliably after an eon in storage?
The Long Now Foundation has produced prototypes of their Rosetta Disk, a 2.8 inch nickel disk engraved with 200,000 pages of text at nano-scale. “the result is immune to water damage, able to withstand high temperatures, and unaffected by electromagnetic radiation. This makes it an ideal backup for a long-term text image archive. Also, since the encoding is a physical image (no 1’s or 0’s), there is no platform or format dependency, guaranteeing readability despite changes in digital operating systems, applications, and compression algorithms.”
Mind you, if you insisted on using a binary representation of your data, simply carving it as pits into a smooth surface would do for storage. After all, that’s what pre-pressed optical discs such as CDs do. Reading the data might be interesting in the future though. You need a substrate that is stable over a long time, and is fine-grained enough to support your chosen information density. (What if you carved each bit to be the size of your thumb, but you had a mountainside to do it on? You could carve it on a much coarser-grained material than if you wanted microscopic pits…)
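To put rough numbers on the mountainside idea, here is a back-of-the-envelope sketch. Every size in it is an assumption made up for illustration (a 1 km by 1 km rock face, a 2 cm square cell per bit, 2,000 bytes per page of plain text); none of these figures come from the posts above.

```python
# All sizes below are illustrative assumptions, not figures from the thread.
MM_PER_KM = 1_000_000


def carved_bits(face_side_mm: int, bit_side_mm: int) -> int:
    """Bits that fit on a square face when each bit occupies one square cell."""
    cells_per_side = face_side_mm // bit_side_mm
    return cells_per_side * cells_per_side


def pages_of_text(bits: int, bytes_per_page: int = 2_000) -> int:
    """Whole pages of uncompressed text that fit in the given number of bits."""
    return (bits // 8) // bytes_per_page


# Assumed: a 1 km x 1 km face and thumb-sized (20 mm) bit cells.
mountain_bits = carved_bits(1 * MM_PER_KM, 20)
mountain_pages = pages_of_text(mountain_bits)

print(mountain_bits, "bits, about", mountain_pages, "pages")
```

Under these assumptions a square kilometre holds 2.5 billion bits, or roughly 156,000 pages of plain text. Halving the side of each pit quadruples the capacity, which is the trade-off between grain size and density in a nutshell.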
This is meaningless. The format it’s dependent on is knowledge of the language it’s written in. That doesn’t change simply because they chose an analog format instead of a digital one. What they have done is, at best, no better than storing a lot more information in a compressed digital format on that disk alongside complete instructions about how to decompress it on the same disk.
So that’s your answer: Don’t try to preserve a mechanism, just preserve as much knowledge as you can in a durable physical form. But even that is a dead end, in the long term. The only true path to long-term stability is continuous translation into newer formats, and the only way to do that losslessly is to do it digitally, where you don’t lose fidelity with every copy made.
No, that’s wrong. The Rosetta disk is readable using a 200x pocket magnifier, and the instructions for this are readable without one (they include large text on the outside that spirals down to the nano scale). That’s a lot simpler than having to construct any kind of digital machine; that type of magnifier could have been made with 1600s technology.
The Rosetta disk has details of 1500 languages and is designed to be a resource to reconstruct the others. If knowledge of ANY of the languages on it survives, then that one language can be used to work out the others.
Store one of these alongside your collection of books engraved in English at 200,000 pages per disk, and your descendants in 10,000 years can, with enough time, read it with a hand magnifier and no other tech needed.
Derleth touches here upon a major problem, but doesn’t elaborate. We can’t assume that civilization on Earth 10000 years in the future will bear any recognizable resemblance to our civilizations today. There can be no assumption that any trace of any of today’s languages, or ANY of today’s literature, will be existing, readable, and comprehensible 10000 years from now.
I read an article somewhere, many years ago, that discussed a big problem with the long-term safe storage of long-lasting nuclear waste. We may encapsulate it inside massive steel-and-or-concrete barrels, and bury those miles down inside of Yucca Mountain or wherever. But we can be sure that someday, in the distant future, people (or whatever) will discover the stuff.
It’s essential to have warning signs posted, warning the future beings of the danger of poking into those vaults and barrels. The gist of the article was that we have no known reliable way of recording information about the dangers of long-lasting radioactive waste and posting it on warning signs on-site. That is, we have no known way to record all that information in any way that we can reliably expect someone to be able to understand 10000 years from now.
Unless we’re talking about a new species arriving on earth from somewhere else, then any language spoken in 10,000 years will be a descendant of one of our currently spoken languages, and there’s a good chance it will be a descendant of one of the eight major languages spoken today. Anyway this is a different question to what the OP asked, which is just how to store an archive that will last 10,000 years.
As for nuclear waste needing to be posted with warnings lasting that long, it’s nonsense. Nuclear waste is valuable; newer reactor designs can get additional energy from “waste”. Everything that’s stored in Yucca will get taken out and used again within 100 years as the potential energy stored in it becomes more valuable.
We aren’t assuming their civilization is anything like ours. It doesn’t have to be. With Rosetta Stone-like multiple languages and a big enough library, no prior knowledge of, say, 2012’s English is needed, given enough materials and context to figure things out.
edit: I don’t have enough knowledge to comment about readability in 10,000 years, but assuming we give them tons of materials and context it’s reasonable to surmise that they will be able to figure things out fairly well. After all, they have all the time in the world to study it after 10,000 years.
There’s somewhat of an assumption that finders of said archive will be inclined to regard it as a source of information, rather than, say, a collection of trinkets, useful metals for making spearheads, or just random crap cluttering up a very useful-looking container.
That’s a problem for any format though, of course.
Sure, simply engrave your stone tablet with zeros and ones. I’m sure a more elegant engineering solution than that could be found, but as others have pointed out, the real trick is to ensure the artefact is recognised as a time capsule and can be translated. Please bear in mind, these time capsules will only be useful if there is some sort of catastrophe, so they need to be designed with that in mind. Current digital media has a short life, but is easily copied.
A point you see made in various science fiction stories is that there are all sorts of mathematical & scientific facts that are not culturally dependent. You can include in your records labeled images & diagrams & so forth of those facts to give them a starting point. A simple example would be a single dot with a “1” next to it, two dots with a “2” and so on to give them our numbers. An image of a man and a woman, labeled as such and various body parts also labeled. A diagram of the Solar system, with each planet labeled and a stick figure human on Earth - we’ve been using stick figures since the Stone Age, so our descendants should recognize that easily. That sort of thing won’t cover our whole language of course, but it’ll give anyone trying to figure it out enough context to have a huge head start.
An archive written in an incomprehensible language is useless as an archive, and might not even be recognized as such, and 10,000 years is plenty of time for all current languages to have changed out of all recognition. Old English, the ancestor of modern English, was still being spoken less than one thousand years ago, and is effectively incomprehensible to a modern English speaker, although there is sufficient cultural continuity between then and now to allow experts to translate it. On the other hand there are written languages such as Etruscan that have long since become quite unreadable, even without any very sharp breaks in cultural tradition. Etruscan would have still been in use well under three thousand years ago, and the last person able to read it is said to have been the Roman Emperor Claudius, who died in 54 AD. There are lots of other examples of ancient writing systems that are now unreadable. Some were probably in use much more recently even than Etruscan. (Inca quipu may be an example; or they may not be because, as we can’t read them, except to a very limited degree, we can’t really be sure whether they constituted a complete writing system, or were just for accounting.)
In short, I think there is no way of preserving meaningful, readable information of the sort of complexity that the OP envisages, and for the length of time the OP hopes for, just through the preservation of a physical artifact. Some degree of cultural continuity (ideally, transcribing and translating it into more modern language at least every few centuries) is essential.
I will bet you we have much better and more extensive clues than that with respect to Etruscan, Quipu, Indus valley script, etc., and we still can’t read them.
Something like this was done for broadcasts from Earth - and you’re right - it’s quite possible to create self-unpacking documents, where you only need fundamental math knowledge of things like integers, prime numbers, addition, etc. to get started.
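One well-known trick of this kind is to make the message length a product of two primes, so that a reader who knows only basic arithmetic has just two ways to arrange the bits into a rectangle. The 1974 Arecibo broadcast did this with 1,679 bits (23 x 73). The sketch below illustrates the idea with a made-up 35-bit message; the function names and the sample bitmap are assumptions for illustration, not any real broadcast format.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for small message lengths."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def semiprime_factors(n: int):
    """Return (p, q) with p <= q if n is a product of exactly two primes, else None."""
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            q = n // p
            # p is the smallest divisor above 1, so it is prime; q must be checked.
            return (p, q) if is_prime(q) else None
    return None


def to_grid(bits: str, width: int):
    """Cut the flat bit string into rows of the given width."""
    return [bits[i:i + width] for i in range(0, len(bits), width)]


# Hypothetical 35-bit message: a 5-wide, 7-tall picture of the letter E.
message = "".join(["11111", "10000", "10000", "11110", "10000", "10000", "11111"])

p, q = semiprime_factors(len(message))
for width in (p, q):  # only two rectangular layouts to try
    print(f"width {width}:")
    for row in to_grid(message, width):
        print("  " + row.replace("1", "#").replace("0", "."))
```

Only one of the two layouts produces a recognizable picture, which is how the recipient would know they had unpacked it correctly.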
I started a thread a long while back where I tried to interpret one from cold myself (obviously with the advantage that a) I knew it contained information of some sort and b) I’m human, although I don’t think that helped a lot)
I disagree, digital media is exactly the way to go; I just think your definition of “digital media” is too narrow.
“Digital” implies information encoded in such a way that when it is read, the resulting signal is interpreted as being in one of two or more possible states. The alternative is analog media, in which the information is read and reported as its exact value (subject to the precision of the original recording and the precision of the reading device).
“Two or more” categories is important. We’re used to using “digital” to refer almost exclusively to binary systems (most modern computers), but it can refer to anything that meets the definition in the previous paragraph. Shakespeare recorded his plays in digital media: written text meets the definition, in that the author writes letters onto a page, and the reader - instead of measuring each character and reporting an analog value - says to himself unconsciously, “that letter is an ‘A’, or it’s a ‘B’, or it’s a ‘C’, or…”
Whereas analog media is vulnerable to noise and measurement error during the recording and reading phase, and degradation during storage, digital media is much more robust. If a renaissance painting gets a little dirty, it affects what you see and interpret; if Shakespeare’s written plays get a little dirty, you can still read exactly what he wrote. There is of course a limit: too much noise (or too little signal), and you fall off of the digital cliff. But a robustly designed digital medium will tolerate a lot of noise - and deliver flawless clarity all the while - before that happens.
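The difference can be shown with a toy simulation. This is only a sketch under assumed numbers: each copy adds uniform noise of up to 0.2, a digital reader snaps every value back to 0 or 1 with a 0.5 threshold, and an analog reader keeps whatever value it measured. The function names are made up for the example.

```python
import random


def copy_analog(signal, noise, rng):
    """An analog copy keeps the measured value, noise included."""
    return [x + rng.uniform(-noise, noise) for x in signal]


def copy_digital(bits, noise, rng):
    """A digital copy measures a noisy value, then decides: is it a 0 or a 1?"""
    noisy = [b + rng.uniform(-noise, noise) for b in bits]
    return [1 if x >= 0.5 else 0 for x in noisy]


def generations(original, copier, n, noise, seed=0):
    """Copy the copy n times, as a long chain of scribes or duplicators would."""
    rng = random.Random(seed)
    current = list(original)
    for _ in range(n):
        current = copier(current, noise, rng)
    return current


bits = [1, 0, 1, 1, 0, 0, 1, 0]
analog_result = generations(bits, copy_analog, 100, noise=0.2)
digital_result = generations(bits, copy_digital, 100, noise=0.2)

# Largest deviation from the original after 100 generations of analog copying.
analog_drift = max(abs(a - b) for a, b in zip(analog_result, bits))

print("digital copy intact:", digital_result == bits)
print("analog drift:", round(analog_drift, 3))
```

With noise below the 0.5 decision threshold, the digital chain reproduces the original exactly however many generations it runs, while the analog chain drifts a little further with every copy. Raising `noise` to 0.5 or more is the "digital cliff": bits start to flip and the errors become permanent.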
Like Shakespeare’s plays - and every other written word in the history of the human race - the Rosetta disk most certainly is digital media. They’ve etched letters onto the disk, intending that those letters be read and interpreted in a digital manner by the end user. Smear it with dirt, put small scratches across it, oxidize it a bit, and as long as you can still identify what the letters are - even if they are no longer perfect renderings of those letters - you’ll be able to recover all of the recorded information perfectly.
For how many of those do we have thousands of pages of content with markers that give clues, though? I would assume most lost languages would most often have three features: 1) not a huge amount of materials available for study, 2) lack of context for the materials we have recovered, and 3) as far as I know, none of them were intentionally meant to be recovered and deciphered for future generations (humanity is only now comfortable enough to take the luxury to bother with practicality of helping humans thousands of years in the future).
Honestly, what language can we not read at all that we have thousands and thousands of pages of material of? I can’t think of any and I’d love to learn something. (I know there are a few probable hoaxes…)
But the OP is talking about preserving literature, with all the nuances and levels of meaning contained therein. It’s difficult enough to preserve that across generations of the same culture, and translations to other languages or to more modern versions of the same language inevitably lose something. Even without translation, we lose the context. We can still read Dickens or Thackeray in the original, but how many of us really get it like people did at the time? Hardcore Dickens scholars, maybe. Go further back through Shakespeare, Chaucer, Beowulf etc. and it becomes progressively harder to claim that we have truly preserved the full meaning of the work.
The amount of text in itself makes no difference: 200,000 pages of meaningless marks are just as indecipherable as one page of meaningless marks unless you have some clue, from some other source, about what some of it might mean. In the case of most dead languages that we can’t read there are often quite a lot of clues. A lot of what survives is inscriptions on objects, gravestones and the like. You can make a pretty good guess at the sorts of things that might be written on someone’s gravestone, yet it is still not enough. We still can’t decipher the language.
We may not have the equivalent of “thousands of pages” of any of these dead scripts, but I think we have a fair amount of Etruscan, and of Minoan Linear A, and quite a lot of quipu. We may not have that much Indus Valley script. But anyway, as I say, amount is not the issue; context and cultural continuity is the issue, yet the OP is envisaging a very decontextualized text, something like a metal disc buried in a “time-capsule”.
Note that 10,000 years is a lot further away from us in time than we are from the very earliest civilizations and the earliest writing systems, which arose only about 6,500 years ago.
I really don’t think this is true. I’m fascinated both by languages and cryptography, and if there IS a meaningful message that is encoded in your crypt, the larger the amount of material you have to work with, the better you can study it to attempt to decipher it. This works the same way in languages: if you are given one symbol it can mean absolutely anything, but possibilities for understanding open up given more material.
For example, if we have enough material that we come across a list, it is quite possible to learn the language’s number system. If we are confident we are looking at numbers, we might learn if the language (or at least the numbers) are phonetic, or pictographic, or maybe even if they count in base-ten (some cultures don’t).
The more material you have to work with, the more clues you will find even if you are starting out with zero understanding. Check out the Voynich Manuscript for a bad example of this (the fact that we know so little about what the text says is good evidence it is a cipher and not natural writing).
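As a rough illustration of why quantity helps, here is a sketch of the first statistics a decipherer could pull out of an unknown text without understanding any of it: how many distinct symbols there are, which are most frequent, and which symbol groups repeat. The sample strings are invented English stand-ins for an undeciphered script, and the function name is made up for the example.

```python
from collections import Counter


def corpus_profile(text: str) -> dict:
    """Summarise an unknown text using nothing but counting."""
    words = text.lower().split()
    letters = [c for c in text.lower() if c.isalpha()]
    word_counts = Counter(words)
    return {
        "distinct_symbols": len(set(letters)),
        "top_symbols": [s for s, _ in Counter(letters).most_common(3)],
        # Groups that recur are candidates for common words or affixes.
        "repeated_words": sorted(w for w, n in word_counts.items() if n > 1),
    }


short_sample = "the cat"
long_sample = "the cat saw the dog and the dog saw the cat"

short_profile = corpus_profile(short_sample)
long_profile = corpus_profile(long_sample)

print(short_profile)
print(long_profile)
```

The two-word sample shows no repetition at all, so there is nothing to anchor a guess on. The longer one already exposes a very frequent group and several recurring ones whose positions can be compared. This counts patterns only; it says nothing about meaning, which is where the context argued over in this thread comes in.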
I can’t seem to find any documentation with the actual text from the Rosetta disk online, but it seems to me it would be pretty simple to include a couple hundred pages of a language primer using stick figures, diagrams, and grammar rules. With that, a dictionary, and all the other texts, future linguists could reconstruct English from scratch given enough time and bodies thrown at it.
As has been said, we never found an elementary language textbook for Etruscan, only texts. Somehow I think that would make a difference.