How many bits of info are stored in all the books at the Library of Congress? What about not just the books but all forms of info (microfilm, newspapers, magazines, videos, CDs, etc.)?
This says about 6 terabytes, but that rests on an assumption about book length and doesn’t include other forms of media. However, I’m guessing the real number is in the 6-12 terabyte range.
I can buy that if you are just talking about raw text. However, if you are talking about images within those books, movies, film strips, art, etc., the number goes way, way higher. You also have to decide if just the words are enough. A newspaper from 1820 is probably better represented by digital photos of the pages than by just the optical character recognition output of the text. The raw text is good for computers and searching, but it doesn’t tell the whole story.
You also have to decide how much loss is allowed in compression. A standard-length movie can be about 4 gigabytes, so you only get about 250 of those before you reach a terabyte. You can compress them further, lose some quality, and fit them in about 1/4 the space. That holds true for most images and sound recordings.
Lossless compression of everything in the library is going to run many petabytes (a petabyte is 1024 terabytes, a terabyte is 1024 gigabytes, and a gigabyte is 1024 megabytes).
For reference, 1 petabyte is about 785 million of the old-style 3.5 inch floppy disks (at roughly 1.4 MB each). To answer the question, however, you have to know how much compression (and information loss) is allowed. The full storage would be quite large, but just the text will be able to fit on an average hard drive in 10 years.
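If anyone wants to check the floppy arithmetic, here is a rough sketch. It assumes binary units (1 PB = 1024^5 bytes) and tries two floppy capacities: a nominal 1.4 MB (1400 x 1024 bytes) and the usual formatted capacity of a "1.44 MB" disk (1440 x 1024 bytes). The constant and function names are just mine for illustration.

```python
KIB = 1024
PETABYTE = KIB ** 5  # 2**50 bytes, binary convention (assumption)

# Two assumed capacities for a 3.5 inch high-density floppy.
FLOPPY_NOMINAL = 1400 * KIB    # "1.4 MB" taken as 1400 KiB
FLOPPY_FORMATTED = 1440 * KIB  # formatted "1.44 MB" disk, 1,474,560 bytes


def floppies_needed(total_bytes, floppy_bytes):
    """Number of whole floppies that fit into total_bytes."""
    return total_bytes // floppy_bytes


print(f"{floppies_needed(PETABYTE, FLOPPY_NOMINAL):,}")    # 785,365,448
print(f"{floppies_needed(PETABYTE, FLOPPY_FORMATTED):,}")  # 763,549,741
```

Either way you land in the high hundreds of millions of disks per petabyte, not hundreds of billions.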
That depends on your standard of “lossless”. Suppose the L of C has in its collection a photograph, and they want to make a lossless digital copy of it. OK, so they use BMP; that’s a lossless format. How much space does it take up? Well, that depends on the resolution. A 30 x 20 BMP won’t take up much space at all, but that’s not a very good image. And you can’t say that you want the BMP to be the same resolution as the original photograph, because a photograph doesn’t have resolution in the same sense that a computer image does. Photographs do have grain, and you could choose a resolution high enough that the digital image has as many pixels as the original had grains. But suppose you want to know something about the shape of the grains? Ask for enough detail like this, and you could fill up your several petabytes on a single image.
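To put rough numbers on the resolution point, here is a minimal sketch of how an uncompressed 24-bit BMP grows with pixel dimensions. It assumes the common 54-byte header (14-byte file header plus 40-byte info header) and rows padded to a multiple of 4 bytes. The 8 x 10 inch print and the scan densities are made-up examples, not anything the L of C actually uses.

```python
def bmp_size_bytes(width, height, bits_per_pixel=24):
    """Approximate size of an uncompressed BMP with no palette.

    Assumes a 54-byte header and rows padded to 4-byte boundaries.
    """
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    return 54 + row_bytes * height


# The tiny 30 x 20 image from the post above.
print(bmp_size_bytes(30, 20))  # 1894

# Hypothetical 8 x 10 inch print scanned at increasing densities.
for dpi in (300, 1200, 4800):
    size = bmp_size_bytes(8 * dpi, 10 * dpi)
    print(f"{dpi} dpi: {size / 1024**2:,.1f} MiB")
```

Every doubling of the scan density quadruples the file, which is why "lossless" has no natural stopping point for a photograph.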
The 4 GB movie is already compressed, as DVD MPEG-2. A 2-hour, 720x568, 30 fps movie would run about 247 GB uncompressed (at 24-bit color).
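The uncompressed figure can be reproduced with a quick calculation. This sketch assumes 3 bytes per pixel (24-bit color), no audio, and binary gigabytes (1024^3 bytes); the function name is just for illustration.

```python
def uncompressed_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Raw size of a video stored as one full frame per tick, no audio."""
    return width * height * bytes_per_pixel * fps * seconds


GIB = 1024 ** 3

total = uncompressed_video_bytes(720, 568, 30, 2 * 60 * 60)
print(f"{total:,} bytes")            # 265,006,080,000 bytes
print(f"{total / GIB:.1f} GiB")      # 246.8 GiB

# Rough ratio against a 4 GB compressed DVD copy of the same movie.
print(f"about {total / (4 * GIB):.0f}:1")  # about 62:1
```

So the DVD version is already squeezed by a factor of roughly 60 before any further lossy compression is applied.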