Gigabytes, gigabits, gibibytes, etc.

After looking at hard drives to add to my computer (running out of space), it occurred to me that while I seem to have been told that a gigabyte is 1024 (2[sup]10[/sup]) megabytes, someone has told hard drive manufacturers differently. A friend of mine pointed out that he believes they’re defining one gigabyte as 1000 megabytes or perhaps just 1,000,000,000 bytes. What’s the correct definition? Why do hard drive manufacturers define it differently than I do? Which media use the same definition that I do, and which use the mutant numbering scheme of hard drives? Which do flash media use? DVDs? HD-DVD? Blu-Ray?

And most importantly, why can’t everyone just agree on a standard?

I happened to come across this the other day: In storage media, capacities are expressed in powers of ten. In the rest of the computing world, it’s in powers of 2[sup]10[/sup]. So what you and I as tech folk call a gigabyte (2[sup]30[/sup] bytes) is apparently properly called a gibibyte (GiB).

However, I have only read this on Wikipedia. It cites IEEE 1541 as a standard defining the engineering prefixes for binary values; I have yet to read the IEEE standard.

Marketing. It’s all marketing. Like other terms such as “low-fat cookies” or “long-life lightbulbs”, these terms can mean anything that the advertisers want them to mean. Eventually, some gov’t agency or industry group will set some sort of standard meaning, but until then it is every man for himself.

http://en.wikipedia.org/wiki/Binary_prefix

Basically, the 1024 = kilo convention is non-standard and hard drive capacities follow the official SI (International System) standard. CD media and RAM are two formats that use the 1024 measurement, but most other hardware uses decimal (power-of-ten) units.

Given the popular understanding of 1024 = kilo, one could argue that hard drives should adopt that convention, but the SI standard gives them a decent excuse for providing you a little less.
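
For anyone who wants to put a number on that “little less”, here is a quick Python sketch. The function name and the 500 GB drive are made up for illustration, and it assumes the label means exactly 10[sup]9[/sup] bytes per gigabyte while the operating system reports in units of 2[sup]30[/sup] bytes.

```python
def advertised_gb_to_gib(gb):
    """Convert a decimal-gigabyte label (10**9 bytes each)
    to binary gibibytes (2**30 bytes each)."""
    total_bytes = gb * 10**9
    return total_bytes / 2**30

# Hypothetical drive sold as "500 GB"
print(round(advertised_gb_to_gib(500), 1))  # 465.7
```

So under those assumptions a “500 GB” drive would show up as roughly 465.7 when counted in binary units.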

The standard prefixes for multiples of 1024 (defined by the IEC rather than by SI) are kibi, mebi, gibi, etc. No one seems to want to use them, and I don’t blame them.

Actual SI nomenclature is decimal: kilo = 1000, mega = 1 million, giga = 1 billion.

As Scuba Ben notes, in many but not all aspects of electronic information storage, the related forms in binary are the default:
Kilo = 2[sup]10[/sup] = 1,024
Mega = 2[sup]20[/sup] = 1,048,576
Giga = 2[sup]30[/sup] = 1,073,741,824
These are used because they are powers of 2, hence logical increments in binary usage, which approximate the related powers of 10, for which there are common SI prefixes.
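
How close is that approximation? A minimal Python sketch, with names of my own choosing, that compares each binary value to its decimal namesake:

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta"]

def discrepancy_percent(power):
    """Percent by which 2**(10*power) exceeds 10**(3*power)."""
    binary = 2 ** (10 * power)
    decimal = 10 ** (3 * power)
    return (binary / decimal - 1) * 100

# power 1 = kilo, 2 = mega, and so on
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name}: binary value is {discrepancy_percent(power):.1f}% larger")
```

The gap grows with each step, from about 2.4% at kilo to about 12.6% at peta.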

It’s very short, just a few lines more than two pages. It does distinguish between the gibi- prefix as gigabinary or 2[sup]1000[/sup] versus the giga- which is 10[sup]9[/sup]. The following text is how the standard closes:

Something’s gonna have to give when we get up into tera- and peta-, with their 10% and 13% discrepancies.

It’s mentioned several times in the thread, but just so this typo doesn’t confuse anyone:
gibi- is 2[sup]30[/sup] = 1 073 741 824
2[sup]1000[/sup] = 1.07150861 × 10[sup]301[/sup]
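
Both figures are easy to check, since Python integers have arbitrary precision. This is just a sanity check; the variable names are mine.

```python
gibi = 2 ** 30
typo = 2 ** 1000

print(gibi)            # 1073741824
print(len(str(typo)))  # 302 decimal digits, i.e. on the order of 10**301
```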

I can categorically say we’ll never have that much storage available, and I need not fear being proven wrong.

Thanks for the correction. The document said (2[sup]10[/sup])[sup]3[/sup], which I misread as 2[sup]10[sup]3[/sup][/sup].

Why? They aren’t really wrong and a changeover would cost big bucks and aggravation to explain what they are doing and why. It just is the way they measure things and it is extremely unlikely that a drive manufacturer will decide to change things on its own. There is no reason they have to express everything in the same measure as others.

Nope. I work on a next generation storage technology and, from what I hear, these discrepancies will be used to our advantage in marketing. I have been led to understand that this is pretty much the thinking of the entire industry.

Though I don’t know exactly when the nomenclature diverged, there is a good reason for it. Memory is inherently binary, in the sense that memory addressing is done through an n-bit address register, which can address 2**n locations. A kilobit of memory is 1024 bits, and it makes no sense for it to be 1,000. Kilo was used for convenience.
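
The addressing point can be sketched in a couple of lines of Python. The function name is hypothetical, and this only counts addresses; it says nothing about how any particular chip is organized.

```python
def addressable_locations(address_bits):
    """Number of distinct locations an n-bit address register can select."""
    return 2 ** address_bits

for bits in (10, 20, 30):
    print(bits, addressable_locations(bits))
```

With 10, 20, and 30 address bits you get 1,024, 1,048,576, and 1,073,741,824 locations, which is where the binary “kilo”, “mega”, and “giga” come from.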

Rotating memories like disk or CD are addressed in a totally different way, using tracks and sectors. (At least they were when I learned this during the dark ages.) I’d suspect each disk actually has a slightly different amount of storage, based on how many bad blocks get mapped out, but the stated capacity is guaranteed.

A megabyte memory actually contains more than a megabyte of memory cells, by the way. To improve yields, there are usually redundant rows and columns, which get swapped in if (I should say when) manufacturing test finds bad cells. Typically a fuse is burned to show this. However, because of the addressing architecture, you never get access to more cells, even if more are available. It is possible to test and repair in the field, when the product is powered on, but this requires an extensive memory test to be run at power up, and is usually not done. (The test that does get run is, I suspect, rather feeble.)

@ Etyer8 and Shagnasty

But surely some canny memory seller will try to steal a march on its competitors by marketing its 128 Petabyte memory sticks, quite correctly, as “144 Petabytes”? It’s not really worth doing that at the moment.
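
The 144 figure checks out if the stick really holds 128 × 2[sup]50[/sup] bytes and the label counts petabytes of 10[sup]15[/sup] bytes. A quick Python sketch, with a function name invented for the purpose:

```python
def pib_to_pb(pib):
    """Convert binary pebibytes (2**50 bytes each)
    to decimal petabytes (10**15 bytes each)."""
    return pib * 2**50 / 10**15

print(round(pib_to_pb(128), 1))  # 144.1
```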

What is the deal with memory sticks? Since they are implemented with flash, they should use the RAM, not the disk, convention, but if they are considered as part of the external storage market I can see them saving a few bucks by not providing the full load.