Archive the DOS world

In those distant days when MS-DOS/PC-DOS ruled, most PCs were restricted to 640KB of RAM and a few MB of disk storage. I remember writing Pascal programs that took up less than 64KB of space. I also once worked on a Teradata mainframe system which at that time had a theoretical database size of, surprise surprise, 1TB, but we all scoffed at that - never going to happen, who could afford that? Now I can buy a 1TB disk for less than a 30MB HD cost back in the mid 80s - and that’s before factoring in inflation.

This got me wondering: how much storage would be needed to archive all the software produced for MS-DOS systems? Let’s define software here as all the executable code plus any ancillary stuff required to run it, e.g. graphics files for games, help files and the like. Excluded are the data files produced, such as WordStar documents and Lotus 1-2-3 worksheets.

I think it should be restricted to stuff that did not require Windows 3.0 or later, and also exclude Windows 3.0 itself even though it needed DOS to run. Arbitrary, I know, but you must draw the line somewhere.

Anyone want to have a stab at this?

How could you possibly know how much DOS software the entire world produced?

Estimate

Everybody that was interested in computers at that time wrote code. There is no estimating all that.

2 TB, compressed.

Should that include each patch and minor upgrade as a unique complete set of software?

Sure there is. You get numbers for how many people there were interested in computers at that time, estimate how many hours per week they would spend writing code, estimate how much code they wrote per hour, and multiply them all together. Then multiply that by the number of weeks in the DOS era.
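
The multiplication described here can be sketched in a few lines of Python. Every input value below is a hypothetical placeholder chosen only to show the arithmetic, not a sourced figure; the 1981 to 1995 span is the DOS era as suggested elsewhere in this thread.

```python
def fermi_bytes(people, hours_per_week, bytes_per_hour, weeks):
    """Multiply the four estimated factors to get total bytes of code written."""
    return people * hours_per_week * bytes_per_hour * weeks

# DOS era taken as 1981 to 1995 (an assumption borrowed from other posts).
WEEKS_IN_DOS_ERA = (1995 - 1981) * 52

# Placeholder inputs: swap in your own guesses for each factor.
example_total = fermi_bytes(
    people=1_000_000,      # hypothetical number of people writing code
    hours_per_week=2,      # hypothetical average hours spent coding
    bytes_per_hour=1_000,  # hypothetical output per hour
    weeks=WEEKS_IN_DOS_ERA,
)

print(f"{example_total / 10**12:.3f} TB")
```

The point is only that each factor can be guessed separately and revised independently; the result is as good as the weakest guess.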

Let’s try to set an upper limit:
From http://www.laynetworks.com/history4.htm the number of PCs in the world was around 100 Million in 1989. It’s probably reasonable to assume that the average PC didn’t have more than 100 MB of data, even with all of the unused floppies sitting around in boxes on shelves, so the entire amount of data in this case was 1x10^8 PCs x 1x10^8 bytes each = 1x10^16 bytes, or 1x10^10 megabytes, which is 10,000 TB.
It was probably much less.
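
As a quick check of that upper limit, here is a minimal Python sketch using the two figures from this post (100 million PCs, 100 MB each). It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes); the variable names are just for illustration.

```python
# Figures taken from the post; decimal units are an assumption.
PCS_IN_1989 = 100_000_000        # about 100 million PCs
BYTES_PER_PC = 100 * 10**6       # 100 MB of data per PC

total_bytes = PCS_IN_1989 * BYTES_PER_PC
total_megabytes = total_bytes // 10**6
total_terabytes = total_bytes // 10**12

print(total_bytes)       # bytes across all PCs
print(total_megabytes)   # same figure in MB
print(total_terabytes)   # same figure in TB
```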

Why not? Estimate how many people were interested in computers and estimate how much, on average, they wrote. Not to mention I would question your basic premise. The IBM PC marked the real beginning of the micro-computer as primarily a tool rather than a play-thing.

Not so.

Between ca. 1983, trying to teach my VIC20 how to solve quadratic equations in BASIC (which I never could do, don’t know why), and ca. 1995, programming extensive macros in Excel’s old pre-VBA macro language, I never bothered to write a line of anything anyone could call code even by those ridiculously minor-league standards.

Make reasonable assumptions and estimate.

Have you folks never heard of the classic Fermi questions, such as “How many piano tuners are there in New York City?”

Of course you can estimate.

We’re still in the DOS era, as you put it. It won’t end for quite a while yet. FreeDOS went 1.0 not that long ago and it has an active developer and user community, and it only accounts for a fraction of the total DOS universe at this point.

I’m sure there are still a few hobbyists around, but surely it’s much smaller now than it was at its peak? For purposes of a Fermi estimate, we can probably get away with saying that the DOS era ended in 1995.

I’m so glad you people on this board are capable of that. I await your numbers. Don’t forget you’ll be subject to government recount verification. Estimates are not the actual number so only real data please, and no God told me in a dream results.

I bought one of those first IBM PCs and I recall that there were only a few dozen programs available to buy with the PC. From there the number of programs grew extremely fast. So I’m going to assume that the number of programs started at 100 in 1981, when the IBM PC was first introduced to the market, and that the number doubled every year for the next 15 years. That takes us to around 3.3 million programs (100 x 2^15). How large were the programs? One thing to keep in mind is that the programs had to be delivered on floppy discs for the most part until CDs came in. The biggest programs were language compilers, which may have had 30 discs or so, each disc storing 1.44MB. So using this upper-end figure (about 43MB) as the average program size, we arrive at about 140 terabytes. I would go with an estimate of between 10 and 100 terabytes because most programs could easily fit on a single disc. Well, that’s my SWAG.
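
The doubling estimate in this post can be reproduced with a short Python sketch. It reads the post as "100 programs, doubling 15 times" and treats 1.44MB as 1,440,000 bytes; both readings are assumptions, and the names are illustrative only.

```python
# Inputs as stated in the post; the decimal disc size is an assumption.
INITIAL_PROGRAMS = 100
DOUBLINGS = 15
DISCS_PER_BIG_PROGRAM = 30
BYTES_PER_DISC = 1_440_000

programs = INITIAL_PROGRAMS * 2**DOUBLINGS

def archive_terabytes(discs_per_program):
    """Total archive size in TB if every program used this many discs."""
    return programs * discs_per_program * BYTES_PER_DISC / 10**12

upper_tb = archive_terabytes(DISCS_PER_BIG_PROGRAM)   # every program compiler-sized
single_disc_tb = archive_terabytes(1)                 # every program on one disc

print(programs)
print(round(upper_tb, 1), round(single_disc_tb, 1))
```

Under these assumptions the 30-disc case lands near the 140 terabytes quoted, while the one-disc-per-program case comes out at only a few terabytes, so the answer is quite sensitive to the average program size.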

Huh?

I think an order-of-magnitude estimate is more than close enough to satisfy my curiosity.

I think you are missing something.
Are you saying that there were only 100 programs in the entire world in 1981, or 100 different programs available to purchase?
I think you need to figure out how many computers were in use at the time, like I did above.

>the number of PCs in the world was around 100 Million in 1989

Beowulf gives us a perfectly useful statistic. By '89 Windows still had not displaced DOS much, so your typical '89 computer would have a good sampling of DOS software. Around that time I bought a PC with a 25 MB hard drive and so will use that as a reference point too. Suppose the average hard drive then was 10 MB (I don’t mean the average of the new drives selling that year, I mean taking all the computers that were up to a few years old). So there were 1e8 * 1e7 = 1e15 bytes of hard disk space deployed in PCs by the end of the primarily DOS era. There would have been room for a thousand terabytes of stuff.

But typical computers weren’t filled with software the owner had written. If you think the typical PC then was 10% full of software that was written on that PC, then there’d be a hundred terabytes of software if you counted every program once (what I understand the OP asks).

I do a fair amount of programming but I don’t think I had more than 250,000 bytes of my own software on that machine. If that is typical, then each PC was 1% full of unique software and there would be 10 terabytes when you add it all up. I think I like this guess best, because of course there were plenty of PCs then that were just running Microsoft Office (or, I think, Word and Excel and maybe other things that weren’t called “Office” yet).

I don’t see how the total could be less than a terabyte or more than a hundred terabytes, really.
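
The reasoning in this post can be sketched as follows, assuming 100 million PCs, a 10 MB average drive, and decimal units; the 10% and 1% "unique software" fractions are the two guesses made above, and the function name is illustrative.

```python
# Inputs from the post; decimal units (1 TB = 10**12 bytes) are an assumption.
PCS = 100_000_000
AVG_DRIVE_BYTES = 10 * 10**6
TB = 10**12

deployed_bytes = PCS * AVG_DRIVE_BYTES   # total hard disk space in use

def unique_software_tb(percent_unique):
    """TB of unique software if this percent of each drive held it."""
    return deployed_bytes * percent_unique / 100 / TB

print(deployed_bytes / TB)       # total deployed space in TB
print(unique_software_tb(10))    # 10% unique per PC
print(unique_software_tb(1))     # 1% unique per PC
```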

Not just hobbyists: DOS is used in embedded systems such as process control computers and point-of-sale terminals and other computers that may or may not ever get turned off.

This is true enough. The mesa for new DOS development probably began its downward trend not long after Windows 95 was released, but the long tail means any estimate we make now is going to drift away from reality as time goes by. Comparisons to radioactive half-lives would not be entirely inapt, especially given how many people regard DOS as toxic sludge.