I think I still have some 5.25 floppies with Wordstar files from a Kaypro running CP/M.
I have an early spreadsheet app on 8 inch floppies still in shrink wrap. I should use it as a paperweight since it is in a nice plastic box.
Which was pretty close to extinct right from the start, since absolutely nothing but Microsoft Works (including Word) could open those files.
Yep. We get people all the time trying to bring in their Works files to open in Word and print, many of whom don’t even call Word “Word,” they call it “Microsoft.”
“It’s the same thing, right? It’s Microsoft!”
How about .com (no, not the internet one)? It was some sort of executable file used by MS-DOS. I never quite understood why it was different from .exe.
Bob
Being a programmer myself, I interpreted the question to mean whether there were any file types for which the file type specification has been lost. That is, even if you had a list of the 1’s and 0’s making up the file, you would have to approach the file as a cryptographer approaches a ciphertext rather than develop a method to transform the bitstream based on a requirements specification. A similar issue can apply (and it seems some have treated it as in scope) to lower-level media formats, such as a type of floppy disk format or magnetic tape format where the relationship between the magnetic patterns and the low-level 1’s and 0’s that make up discrete files cannot be determined. In that case nobody knows how to take the magnetic patterns on the disk, identify how many files are stored there and what sizes they are, and get their bitstream, because the specification for making a reader/writer for the medium has literally been lost and could not be rebuilt without rediscovering the spec, even with an arbitrarily large budget. This is different from being able to take that bitstream and extract usable information such as an audio clip, novel, game, picture, movie, or corporate memo from it.
Seems to me that it ought to be possible for a few talented programmers to build software/hardware that can read old files and convert them, and which could also in themselves be future-proofed. Is there not a project out there to do this?
EDIT: Just realised there was a second page to this thread, and I may just have reiterated what someone already said.
Well, as long as you sufficiently document the technology, equipment to read it can be rebuilt at any future time assuming no catastrophic losses of natural resources or something. The problem is sufficiently documenting the technology. When was the last time a specification was actually documented thoroughly?
I wrote a program in FORTRAN, but I seem to have lost the cards. I have no memory of what the program was supposed to do.
‘Missing’ or someone noticed the camera guy’s shirt sleeve in the corner of the shot? :rolleyes:
<stupid>
Probably a really dumb question; isn’t there a way to send unreadable files, online or physically, to a company with all the software? Then pay them to convert it to be read on a modern system? Would this be so difficult to do?
</stupid>
How much are you willing to pay?
The problem we have today is trivial. We have some data that can’t be read or interpreted that was collected 40 to 50 years ago. It can’t be that much data. 1 megabyte used to be an enormous amount, and there weren’t all that many computers, disk drives, tapes, and what-have-you in the world. And among those there was a limited amount of data or software that would be worth anything.
But today we collect information at astronomical rates, with an ever-expanding variety of applications, operating systems and storage conventions. It doesn’t matter if we store everything on super memory devices that will project themselves into another universe to avoid the heat death in this one. In another 40 or 50 years there will be oceans of unreadable material because the software is gone or impractical to re-implement even if the data can be physically read. And there’s a much greater chance that we’ve collected something worth saving. Or maybe not, because the singularity thing will make the entire works of mankind worthless. Either way, the problem we have today in recovering data is trivial in comparison to the problems we face in the future.
COM files had a single combined code/data segment and were non-relocatable, so they had to be small (under 64K, minus a bit). They also had no header. The shell ran COM files in preference to EXE files of the same name.
EXE files had multiple, separate data and code segments and had a header with relocation information. They could be bigger. Eventually the DOS EXE format was extended to the Portable Executable format used in Windows. This has environment detection (i.e. the “This program requires Microsoft Windows” message when run in DOS) and dynamic linking features, as well as relocation.
COM files are no longer used in Windows, but DOS systems (DOSBox, FreeDOS) still have them. And we understand the format, even if it is not particularly relevant to modern OSs. I’m guessing COM is still a valid target for a number of compilers.
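To make the header difference concrete, here is a rough sketch of how you might tell the two apart from the bytes alone. The function name is made up for illustration, and the COM check is only a guess by elimination: a COM file has no signature, so all you can really say is "no MZ header, and small enough to load at offset 0x100 of one 64K segment."

```python
import struct

def classify_dos_executable(data: bytes) -> str:
    """Guess whether raw bytes look like a DOS EXE or a flat COM image.

    Hypothetical helper for illustration only; a COM file carries no
    signature, so that branch is a guess by elimination.
    """
    # DOS EXE files start with the signature "MZ" (occasionally "ZM"),
    # followed by a header of at least 28 bytes.
    if len(data) >= 28 and data[:2] in (b"MZ", b"ZM"):
        # Offset 6 holds the number of relocation entries (16-bit little-endian).
        (reloc_count,) = struct.unpack_from("<H", data, 6)
        return f"EXE (MZ header, {reloc_count} relocation entries)"
    # A COM image is loaded at offset 0x100 of a single 64 KiB segment,
    # so it cannot be larger than 0x10000 - 0x100 bytes.
    if 0 < len(data) <= 0x10000 - 0x100:
        return "possibly COM (no header, fits in one segment)"
    return "unknown"
```

Anything that starts with MZ gets treated as an EXE regardless of its file extension, which, as far as I know, is also how DOS itself decided.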
I’ve done data file format recovery in the past. It involves lots of playing around, checking storage formats (not every Real is stored in IEEE format), looking for repeating markers, figuring out strings, fixed length vs nonfixed, hoping that you understand flags and check bits and the rest.
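For a flavour of the first pass, here is a minimal sketch of the "figuring out strings" and "looking for repeating markers" steps. The function names, the minimum string length and the sample record layout are all invented for the example; real work involves a lot more staring at hex dumps.

```python
import re
from collections import Counter

def printable_strings(data: bytes, min_len: int = 4):
    """Return (offset, text) for each run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]

def common_ngrams(data: bytes, n: int = 2, top: int = 5):
    """Count every n-byte sequence; frequent ones are candidate record markers."""
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    return counts.most_common(top)

# Made-up sample: two records, each introduced by the bytes FF FE.
sample = b"\x00\x01NAME\xff\xfeJohn Smith\x00\xff\xfeJane\x00"
for offset, text in printable_strings(sample):
    print(offset, text)
print(common_ngrams(sample))
```

The offsets of the strings relative to the repeating markers are usually the first hint as to whether fields are fixed length or delimited.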
I did some data records (including serialized objects from Borland TurboPascal) and simple bitmaps (oh, and some password recovery, but that was only an XOR :smack:). I could not begin to attack a compressed object like a JPG.
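To show why "only an XOR" is embarrassing, here is a sketch of the general idea under the assumption of a short repeating key and a known prefix in the plaintext. The key, the plaintext and the function names are all made up for the example and are not the scheme from the job described above.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; the same call encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def recover_key(ciphertext: bytes, known_plaintext: bytes, key_len: int) -> bytes:
    """Recover a repeating XOR key from plaintext known to start the message."""
    if len(known_plaintext) < key_len or len(ciphertext) < key_len:
        raise ValueError("need at least key_len bytes of known plaintext and ciphertext")
    # ciphertext = plaintext XOR key, so key = ciphertext XOR plaintext.
    return bytes(c ^ p for c, p in zip(ciphertext[:key_len], known_plaintext[:key_len]))

secret = xor_bytes(b"PASSWORD=hunter2", b"K3y")   # hypothetical stored record
key = recover_key(secret, b"PASSWORD=", key_len=3)
print(xor_bytes(secret, key))
```

If the key length is unknown, you just try each plausible length and keep the one that decodes to something readable.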
Si
I’ve heard that COM files are popular in some spam/malware emails because you click on hottgirl.com expecting to go to a webpage and instead start running a command file.
This is similar to Microsoft Write, which was abandoned for WordPad in Windows 95. WordPad could originally open Write files and then save them as RTF files, but it no longer can.
Fortunately, the solution is the same as for yours: run an emulator, load Windows 95 on it, open and resave. That’s what I’ve done to all my WRI files.
2 Polish Zlotys, 1 euro and 20 Lithuanian litas - saving you the price of exchanging those currencies - and a small cloud which comes in a handy, hand-size, clear plastic package
A large part of my work for the past year or so has been to reconstruct such a lost file type specification. It’s for an ancient Fortran code for modeling neutron stars. The files are all ASCII text, and contain a whole bunch of decimal-formatted numbers, but nobody knew what any of the numbers meant.
Which leads into JustinC’s question. You can’t just have one big company that has “all the software”, since anyone at all can create software. I doubt there are more than a couple dozen or so people in the world who have a copy of this code I was working with. There’s no way any central entity could get a copy of everything on this scale.
That’s fascinating, what do they want to reconstruct the code for? Who is doing it? I would think that they could model them better with more modern software so I’m pretty curious who wants to use this and for what.