The Hard Drive Size Conundrum

Recently, at the OEM Tech Support call center I work at, I had to put together an article for the reps who work for me explaining why Windows seems to misrepresent the size of a hard drive within the properties screen of a local hard drive. Below is the document (minus screen shots of hard drive properties) I put together for them. There is very little information available online about the issue, so I felt impelled to share it with everyone here at the place where all information originates…

WHY IS MY HARD DRIVE NOT AS LARGE AS ADVERTISED?

I’m sure you have all gotten this call. It usually goes something like this……
“I ordered a 20 GB hard drive, and you sent me an 18 GB hard drive!!!”
First, I can tell you with absolute certainty that no one has ever manufactured an 18 GB hard drive.

Although it is not an actual metric measurement, hard drive space uses regular metric conversions when discussing size. Hard drive space, memory speed, processor speed and various other measurements involved with a computer all use these same measurement conventions. Below is a quick refresher in metric prefixes for those who may need it…

KILO = 1,000
MEGA = 1,000,000
GIGA = 1,000,000,000

These prefixes are used exactly as they are with any metric conversion……
1 Kilometer = 1,000 meters
1 Kilobyte = 1,000 Bytes

NOW, THE FUN BEGINS!!!

Below is a screenshot of the hard drive properties of a local hard drive in Windows XP……This is normally where a customer comes to believe that they have been sent a slightly smaller hard drive than they expected.
(to get here, click on [Start], [My Computer], right-click on the hard drive in question, and choose [properties] from the drop down menu that appears) For Windows 98 or 2K users, double clicking My Computer on the desktop accomplishes the very same goal as clicking on [Start], [My Computer].

Notice the line immediately above the pie chart that reads……
capacity: 20,012,072,960 bytes 18.6 GB
This is the line that, regardless of how full the hard drive is, gives the exact size of the customer’s hard drive.

Many of you have probably already noticed a discrepancy. If the system has 20,012,072,960 bytes, why does the computer say that the capacity is only 18.6 GBs? We just reviewed our metric conversions and a hard drive with 20,012,072,960 bytes, if 1 billion bytes = 1 GB, should be equal to 20.01 GBs.

The answer is that both “20,012,072,960 bytes” and “18.6 GB” accurately depict the size of the hard drive.

How is that possible?

Because hard drive size is in the eye of the beholder……
To us human-types, using standard Base10 math, 20,012,072,960 bytes is equal to approximately 20 GB, the size of the hard drive the customer purchased.

To Windows, which does not use Base10, the size “appears” differently. Windows uses what is called hexadecimal math, or Base16 math. In Base16, a Kilobyte is not 1000 bytes, but rather 1024 bytes. Why is not important, but basically 1024 is as close to 1000 as you can get using powers of 2. In other words, 2^10 is equal to 1024. The same principle is involved with conversion from KB to Megabytes and from MB to Gigabytes.

To better illustrate this, let’s break down the size from actual bytes to GBs using both Base10 and Base16 math.

                 Base10                    Base16

Total size       20,012,072,960 bytes      20,012,072,960 bytes
                 ÷ 1000 bytes/KB           ÷ 1024 bytes/KB
To Kbytes        = 20,012,072.96 KB        = 19,543,040 KB
                 ÷ 1000 kilobytes/MB       ÷ 1024 kilobytes/MB
To Mbytes        = 20,012.07296 MB         = 19,085 MB
                 ÷ 1000 megabytes/GB       ÷ 1024 megabytes/GB
To Gbytes        = 20.01207296 GB          = 18.6 GB

When a customer calls in with a 20 GB hard drive, you can use this example to walk them through converting 20 billion bytes to GB in Base16. Unfortunately, not all customers will have a 20 GB hard drive, so walk the customer through the same conversions from the Base16 column using whatever size hard drive they have. Have them read you the number of bytes their computer has from the CAPACITY line in My Computer (see screenshot above), then have them open Calculator and perform the operations themselves.
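The same walk-through can be sketched in a few lines of Python for anyone who would rather check the numbers than key them into Calculator. This is only an illustration: the function name capacity_breakdown and its step parameter are made up for this example, and the byte count is the one from the 20 GB drive above. Dividing by 1000 three times gives the advertised figure; dividing by 1024 three times gives the figure Windows displays.

```python
def capacity_breakdown(total_bytes: int, step: int) -> dict:
    """Divide a byte count by `step` three times: bytes -> KB -> MB -> GB."""
    kb = total_bytes / step
    mb = kb / step
    gb = mb / step
    return {"KB": kb, "MB": mb, "GB": gb}


# Byte count read from the Capacity line of the drive's Properties window.
capacity = 20_012_072_960

advertised = capacity_breakdown(capacity, 1000)  # convention used on the box
reported = capacity_breakdown(capacity, 1024)    # convention Windows displays

for unit in ("KB", "MB", "GB"):
    print(f"{unit}: {advertised[unit]:>20,.2f}   {reported[unit]:>20,.2f}")
```

Swapping in the byte count from any other drive's Capacity line gives that drive's two figures.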

Piece of cake right?
Now, how do you communicate this to the customer? Here is some verbiage that you might want to try.

“Your system has 20 Billion bytes (or whatever size HD they have), we see this as 20 gigabytes, because we know that Giga means 1 Billion. Windows, however, uses Hexadecimal, or Base16 math, and reports a smaller number of Gigabytes because to Windows a Gigabyte is actually a little bigger than we see it as.”

Example: To us: (Base10), 1 Gigabyte equals 1,000,000,000 (1 Billion) bytes

To Windows: (Base16), 1 Gigabyte actually equals 1,073,741,824 (1 point Zero-Seven Billion) bytes

TO US: A 20 Gigabyte hard drive = 20GB * 1,000,000,000 Bytes/GB = 20,000,000,000 bytes

TO Windows: The Same “20 GB hard drive” = 18.6GB * 1,073,741,824 Bytes/GB ~ 20,000,000,000 bytes

Let’s try this again with a different size hard drive……

Here is a screenshot of the properties of an 80GB hard drive……
OH NO!!! MY 80 GB HARD DRIVE IS ACTUALLY ONLY 74.5 GB!!! (MAJOR OEM MANUFACTURER) SENT ME THE WRONG SIZE HARD DRIVE!!!

First have the customer open Calculator……
Now, have them read to you the number of bytes listed in the CAPACITY line (right above the pie graph)
Have them enter this number into Calculator…

In this case: 80,023,715,840 bytes
Now, the first thing we want to do is convert all of those bytes to kilobytes…using Base16 math……

In Base16 math a kilobyte is equal to 1024 bytes, so to derive the number of kilobytes, we simply need to divide the number of bytes on the system by 1024……. 80,023,715,840 ÷ 1024 = 78,148,160 kilobytes
NOTE: Make sure the customer uses the exact number given to them by Calculator to perform the next step. This will prevent rounding issues and make you look like a genius when the final result comes out to exactly what their hard drive size is according to their computer.

Great, now we have 78,148,160 kilobytes. So, let’s convert that to Megabytes. To do this, we simply divide the number of kilobytes (which should still be in the calculator window) by 1024, because in Base16 math, there are 1024 kilobytes in a Megabyte……… 78,148,160 ÷ 1024 = 76,316.5625 Megabytes

So now we have 76,316.5625 Megabytes. Great, one last step. To convert those Megabytes to Gigabytes, all we have to do is divide by 1024 again, because in Base16 math, there are 1024 Megabytes in 1 Gigabyte. Using the number still on the calculator, have the customer divide by 1024……76,316.5625 ÷ 1024 = 74.5

So we started with 80,023,715,840 bytes and converted that in Base16 math to 74.5 GBs
Compared to the Graph above……
Capacity: 80,023,715,840 bytes 74.5 GB

The following information is for you only, I don’t recommend trying to convey any of this to the customer as it will probably only confuse them even further….

The issue of Windows displaying hard drive size in a language uncommon to most users has been around since the inception of computing. It wasn’t, however, a big issue until just recently, because 12-15 years ago, when home PCs first started becoming a trend, the average hard drive size was about 20 Megabytes. In Windows, a 20 Megabyte hard drive would show up as about 19.1 Megabytes, a difference of only 900 kilobytes. Barely noticeable and certainly not an issue worth stressing about. With the advent of much larger hard drives, the difference between the Base10 size of the hard drive and the Base16 size is much more apparent. For our 80GB hard drive earlier, the difference is 5.5 Gigabytes, roughly 6000 times larger than the 900 kilobyte difference from our 20 Megabyte hard drive example.
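A rough sketch of why the gap grows: every step up in prefix multiplies in another factor of 1000 on the box versus 1024 on the screen, so the relative shortfall compounds. The names PREFIXES and shortfall_percent below are made up for this illustration.

```python
# Each prefix adds one more factor of 1000 (advertised) versus 1024 (displayed),
# so the relative gap compounds with every step up.
PREFIXES = ["kilo", "mega", "giga", "tera"]


def shortfall_percent(power: int) -> float:
    """Percent by which 1000**power falls short of 1024**power."""
    return (1 - 1000**power / 1024**power) * 100


for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:>4}: {shortfall_percent(power):.1f}% smaller")
```

The gap is a little over 2% at the kilobyte level and close to 7% at the gigabyte level, which is consistent with an 80 GB drive showing up as 74.5 GB.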

What is Base16 math?
Base16 math, which for the remainder of this document will be referred to as hexadecimal (from hexa, meaning 6, and deca, meaning 10…6 + 10 = 16), is simply a counting system that was initially used by programmers instead of binary, or Base2 math. Most of you are familiar with the binary system computers use: “1”s and “0”s. To illustrate why a programmer would prefer hexadecimal over binary, take a look at how to write the number 1 million in the three different methods….

Base10 = 1,000,000
Base2 = 11110100001001000000
Base16 = f4240

Big difference huh? Even bigger for programmers. One more example……

1 Billion
Base10 = 1,000,000,000
Base2 = 111011100110101100101000000000
Base16 = 3b9aca00

Much easier, provided you take the time to learn how to read 3b9aca00 as a number.
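For anyone who wants to check these conversions rather than do them by hand, here is a minimal Python sketch. The helper name in_bases is made up for this example; format() with the "b" and "x" codes is the standard way to spell an integer in binary and hexadecimal.

```python
def in_bases(n: int) -> dict:
    """Return the decimal, binary, and hexadecimal spellings of n."""
    return {
        "Base10": format(n, ","),
        "Base2": format(n, "b"),
        "Base16": format(n, "x"),
    }


for value in (1_000_000, 1_000_000_000):
    for base, digits in in_bases(value).items():
        print(f"{base:>6} = {digits}")
    print()
```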

Sorry for the confusion: none of the superscripts copied over as superscripts, and the columns detailing the steps to convert a 20 GB HDD in both Base 10 and Base 16 became skewed upon submitting!!!

Well, I wouldn’t exactly say there’s very little information online. A quick google shows people have pretty much been arguing this point since the birth of hard drives.

Petter looks over at his 111 gigabyte HD. 120 gigabytes, my ass. :stuck_out_tongue:

Looks pretty good to me. Might I use it or an adapted version when yet another one of my users complains I ordered the wrong size drive for them?

I would not ratify the notion, embedded in the article, which is that 20 GB really does equal 20,000 MB (and 20,000,000 KB and 20,000,000,000 bytes) and that “Windows reports it wrong”.

Admittedly, that would bring computer size terms into proper alignment with the rest of the world of weights and measures that use those prefixes; but it isn’t that “Windows has a problem”. I happen to think Windows has many bugs and glitches, but this ain’t one of them. Plain old flesh-and-blood computer geeks and other operating systems (e.g., MacOS) are in agreement that a gigabyte is 1024 megabytes, not 1000 MB, and so on. Sizes of individual files are reported by the OS using this system. The allegedly 20 GB hard drive described in the article will not hold 20 1-gigabyte files because those 1-gig files each contain 1024 MB of data. (And each of those megs consists of 1024 KB, and so on down to bytes).

The other day, I recorded myself playing the piano, and subsequently converted the AIFF file to MP3 format to save space. I happened to glance at the open Finder window where the file was being written to disk as it was compressed, and saw the size change…from “989 KB” to “994 KB” to “1003 KB” to “1011 KB” to “1018 KB” and only then to “1.0 MB”.

If it were desirable for operating systems to calculate and display sizes based on the decimalized notion that 1000 bytes is a kilobyte and so on, they would do so. The mere fact that computers are binary and programmers use hexadecimal doesn’t keep computers from being able to figure in base 10! (If this were not so, the markets for Microsoft Excel and QuickBooks would not be very large).

well, in a convention I’ve seen some places, “kb”=1000 bytes, “KB”=1024 bytes, etc.

So, a 20gb hard drive would have 18GB on it…dig? :slight_smile:

Geek here. I agree with AHunter3.

1K = 1024 bytes (this is pretty well established - I’ve never heard anyone claim 1K = 1000 bytes before).
1MB = 1KK = 1024^2 bytes.
1GB = 1KMB = 1024^3 bytes.

I never knew a kilobyte to be anything other than 1024 bytes. It’s strange to me that someone would think otherwise.

Also, I’ve always heard another explanation for the difference between what a HD says on its box and what’s reported by the OS. My understanding was that the HD manufacturer sells you a blank, unformatted drive. That capacity is X gig (meg, whatever size). Being manufacturers, and wanting to sell as many drives as possible, they state the largest size possible, which is the unformatted capacity of the drive. Which is, after all, what they are selling you. That would be the number of gig/meg/k of data the drive would hold if you counted every bit, every 1 and 0, on the whole platter(s).

Of course, in order to actually write and store data on a drive, you have to format it, and that formatting structure takes up space, thus reducing the capacity of the drive to hold data.

That always made sense to me. Kinda like the difference in capacity of a concert hall between just an empty room, and a room with chairs and aisles and stuff.

Am I delusional, or is this what other people thought?

Metalhead, you are delusional to the extent that the partition might consume some of the capacity of the hard drive. The capacity of the drive has nothing to do with what is or is not on the drive. The difference in the measurements is due to the use of different number systems.

AHUNTER3
I would not ratify the notion, embedded in the article, which is that 20 GB really does equal 20,000 MB (and 20,000,000 KB and 20,000,000,000 bytes) and that “Windows reports it wrong”.

I don’t recall implying anywhere that “Windows reports it wrong”. Obviously everyone who has been associated with computers for any length of time knows a MB to be 1024 kilobytes. The problem is that not everyone has been so associated with them, and over the last 5 years an unprecedented number of first-time computer users have ordered computers. The manufacturers obviously don’t go out of their way to explain that the “20 GB” hard drive they are advertising is actually only 18.6 gigabytes. The entire point of this document is to explain to someone who doesn’t know what a sector or cluster is why there is a deviation between the size of the hard drive and what is written on the outside of the box.

Meros
This document is 100% original and free for anyone who wants to use it or any deviation of it. If you email me, I will send you the version with the screenshots included.

A lower-case “b” is the symbol for a bit. An upper-case “B” is the symbol for a byte. In SI, only “k” stands for “kilo-” in any sense of the prefix; although “K” is accepted in technology circles as signifying 1,024, don’t expect that to ease the confusion any.

Anyway, a much shorter way of saying everything the OP did would be: a kilobyte is either 1,000 or 1,024 bytes depending on whom you ask.

Hard drive manufacturers are basically the only people who think it’s 1,000, because it makes them look slightly better. The IEC came up with “kibibytes” in 1998 (cite) (e.g., “24 KiB”) as an attempt to end the ambiguity. Good idea, but I’ve never seen anyone actually use this designation. Maybe because saying one has a “55.8 gibibyte” hard drive is goofy.

Of course not. But in fact the “20 GB” hard drive they advertise really is not a 20 GB hard drive at all, and the resentment and sense of being cheated expressed by the customers is entirely legitimate. It isn’t a 20 GB hard drive and it really is an 18.6 GB hard drive, insofar as GBs are composed of 1024 x MBs, and the MBs (all 1024 of them) are composed of 1024 KB apiece, and the kilobytes each consist of 1024 bytes.

The hard drive manufacturers and retailers are indeed guilty of bad faith advertising and should clean up their act.

All so far is true except that derivation of these numbers is from Base2 rather than Base16. Windows, like every other computer, uses Base2 for its math while early programmers, unlike any other humans, wrote code in Base 2 and used Base16 to represent those codes.

The reason for this is (my WAG) two-fold. Firstly, if you were a computer programmer way back you had to be familiar with binary, but binary is difficult to read at any length. (Is this 7-bit or 8-bit code? Damn!!) Hexadecimal condenses 4 digits of binary into 1 digit hex using 16 different characters, instead of just a string of 1’s and 0’s. Secondly, binary and hexadecimal notations align on every 4th power of two, where decimal aligns with binary on very few–if any–powers of two since it is based on powers of 10 (which is not a power of two).
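The four-bits-per-hex-digit point can be seen in a short Python sketch. The helper name nibbles is made up for this illustration; it pairs each hexadecimal digit of a number with the four binary digits it stands for, and the second loop shows how powers of two with an exponent divisible by 4 come out as a 1 followed by zeros in hex.

```python
def nibbles(n: int) -> list:
    """Pair each hex digit of n with the four binary digits it stands for."""
    hex_digits = format(n, "x")
    return [(h, format(int(h, 16), "04b")) for h in hex_digits]


for hex_digit, bits in nibbles(1_000_000):
    print(hex_digit, "=", bits)

# Powers of two whose exponent is a multiple of 4 are a 1 followed by zeros in hex.
for exponent in (4, 8, 12, 16):
    print(f"2^{exponent} = {2**exponent:>6} decimal = {2**exponent:x} hex")
```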

With that clumsy explanation out of the way…

One modern byte (an 'octet') = 8 bits
1 KB = 1 * 2^10 Bytes, or 1024 Bytes

It’s those extra 24 bytes that are causing all this confusion. They begin to add up when multiplied and more so when exponentiated (?). But way back, 1024 was close enough to 1000 that nobody cared, especially since the discrepancy was positive. (Again, my WAG.)

<hijack>My first computer had 16K RAM, 4K ROM and a 3.3MHz processor.</hijack>

But continuing…

1 MB = 1 * 2^20 Bytes (1,048,576)
1 GB Should = 1 * 2^30 Bytes (1,073,741,824)

1 GB x 20 should be 21,474,836,480 Bytes… more even than the OP says the manufacturer claimed.

As has already been mentioned in this thread, at some point hard-drive manufacturers realized that if they switched to metric terms (Base 10) instead of the already well established terms, they could market their drives as being larger than they truly were. Thus 1MB became 48,576 bytes smaller, and 1GB (when they finally reached that point) became 73,741,824 bytes smaller than it truly is.

(I think it was Maxtor that started this, but i have absolutely no backup for that except that it was on a Maxtor drive that I first saw printed “1MB = 1,000,000 Bytes”)

MacDairmuid,
From the original document…
a Kilobyte is not 1000 bytes, but rather 1024 bytes. Why is not important, but basically 1024 is as close to 1000 as you can get using powers of 2.

*To illustrate why a programmer would prefer hexadecimal over binary, take a look at how to write the number 1 million in the three different methods….

Base10 = 1,000,000
Base2 = 11110100001001000000
Base16 = f4240

Big difference huh?*

As for Maxtor starting this, I have never seen any hard drive that did not have the qualifier about 1 MB being 1,000,000 bytes on the packaging.

Isn’t there a different nomenclature for each type of reckoning (MB and MiB or something)?

Well, shit.

I had a three page history of drive storage terminology and technology with approximately 40 cites all written out, and my browser crashed while doing final editing sigh.

Here’s what you need to know. The first hard drive stored 50,000 records of 100 digits; this was a component of the IBM 350 RAMAC (Random Access Method of Accounting and Control) computer, announced in September of 1956. It consisted of fifty 24-inch aluminum plates coated with iron-oxide paint (very similar in formula to the paint used for the Golden Gate Bridge), dripped onto the platters while they spun. The paint was measured in paper cups and dripped through a nylon stocking to smooth out the flow. No joke.

So, the 350 disk subsystem stored digits, 0-9… not bits. Why?

Punch cards. The 350 disk subsystem was developed to fit the needs stated in an IBM report from 1953 suggesting that business customers needed to magnetically store 50,000 punch cards, which held 100 digits (zero through nine) apiece.

This would stand well for the arguments of 1,000,000 bytes being a megabyte, except for one thing… IBM’s documentation for the 350 RAMAC didn’t call the 5,000,000 character storage a “megabyte”. Instead, historians have screwed that one up in the PR rewrites (including IBM’s own people, as seen here). Instead, I’ll take the word of one of the lead engineers who built the thing (http://www.mdhc.scu.edu/100th/reyjohnson.htm).

The concept of a “megabyte” related to a computer was a theoretical evolution upwards from the 1,024 byte kilobyte, which came into use during the early-1970’s microprocessor boom for the description and reference to memory storage banks.

I’ll skip a lot of history, and a lot of documentation about the NIST (a US government organization) tables and how they chose to impose values of 1,000,000 bytes to be a megabyte back in the early 90’s, to set the record in stone. Unfortunately, data storage measures is NOT one of the seven areas of measure guidelines that NIST has authority for. They attempted to rectify the confusion they caused in 1998 with a modified variant, the binary multiples, a concept so pedantic it’s only used in documents referring to the concept itself.

This used terms such as “megabinarybytes”, short-formed to “mebibytes”, and abbreviated as MiB. (oh, by the way Ludovic… Mb is an abbreviation for megabits, and MB is megabytes… that has nothing to do with the power-of-two issue; mB and mb are nonexistent, as they would refer to millibytes and millibits, neither of which exist).

So, the question is, how long have drive manufacturers used the 1,000,000 megabyte, and were they right to do so?

The answer to the first part of that is: since the advent of the personal computer. One of the first hard drives manufactured in a capacity over 1 megabyte was the “5 megabyte” Tandon TM-602 MFM interface hard drive, with a whopping 612 tracks storing 8,192 bytes per track. That’s 5,013,504 bytes, or 4,896 1024-byte kilobytes. In memory terms, 4.8 MB.
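The arithmetic for that drive is easy to check. This is just a sketch using the track count and bytes-per-track figures quoted above; the variable names are made up for the example.

```python
tracks = 612
bytes_per_track = 8_192

total_bytes = tracks * bytes_per_track
print(f"{total_bytes:,} bytes")
print(f"{total_bytes / 1_000_000:.2f} MB counting 1,000,000 bytes per megabyte")
print(f"{total_bytes / 1024:,.0f} KB counting 1,024 bytes per kilobyte")
print(f"{total_bytes / 1024**2:.2f} MB counting 1,048,576 bytes per megabyte")
```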

So, the question is why do drive manufacturers persist in using 1-million byte megs? For two reasons… one, they always have due to the ability to present larger numbers for marketing purposes without really lying, and two, because according to NIST, they can.

But I’ll tell you now… any programmer or engineer who referred to 1,000,000 bytes of RAM as a full megabyte would be smacked down in a heartbeat. Storage is a support function of the memory of a computer, not the other way around, and in the opinion of most non-storage computer engineers, should follow the same format as the rest of the computing world.

Storage engineers are schmucks, and suffer from the need to exaggerate disk-size.

Interesting fact most people don’t know #1: You also lose “space” on a drive to the system, for the purpose of recording what is stored where on the drive. This is referred to as a file-allocation table (FAT). Different drive formatting methods use different FAT layouts; modern Windows variants use FAT-16. Some operating systems (Unix variants in particular) let you tune the size of the FAT when formatting the drive, to better fit the data you’ll be storing there. Windows, alas, does not do this.

Interesting fact most people don’t know #2: You also lose space to the system reserve, which is held away from the user-accessible disk space to allow for the operating system to have a ‘scratch pad’ of sorts, for use in moving files, defragmenting, and the like. Some OSes also allow you to change this percentage. On one very large RAID I built last fall (840 GB on one partition), I was able to reduce the system reserve from 10 percent to 1 percent, returning about 76 GB of space into active use. This left a system reserve of 8.4 GB, which on an operating system with a 2GB filesize limit, and a mostly static (data storage) partition anyway, was plenty. Again, alas, Windows doesn’t let you tune this store size, either.

In any case… here it is, a pretty definitive answer… memory is measured in 1,024 byte kilobytes and 1,024 kilobyte megabytes. Storage is measured in 1,000 byte kilobytes, and 1,000 kilobyte megabytes. Drive manufacturers started doing this well before NIST claimed it to be the standard, since it made the drives appear ever so slightly larger. Once the factors of storage jumped to the megabyte and gigabyte (and now, terabyte) range, the discrepancy became severely noticeable.

End of story.

It’s ridiculous to use a base-10 measure to count multiples of a base-2 concept such as a byte; this is what the drive makers cannot reconcile with the rest of the technology world.

Also, if anyone wishes to refute what I’ve written, let me know, and I’ll supply my credentials to give you a better idea of my frame of reference.

Cheers,

Crap, sorry. Bad code. Please fix the above link horror by substituting:

here. Instead, I’ll take the word of one of the lead engineers who built the thing.

I meant to hit preview, honest I did…

OK Zenham, I’ll take the bait, but only as far as to disagree on “losing drive space” to partitioning. The amount of space consumed by the FAT is so insignificant that it would take incredible scrutiny even to realize it is gone.

On your second point, I assume you are talking about page filing? XP does a fairly decent job of managing the page file, and the space lost is nominal compared to the benefit of the speed gained by OS.

As for noticing the space being gone, this used to be less of an issue in Windows to begin with, due to the static FAT size. This has actually been a problem for me in the past due to large folders with tons of tiny files (like CVS transaction logs).

Other partition formats build out file allocation tables which grow based on the size of the drive (unless you tell it to use a fixed inode count); this is where you really note the differences.

In regards to the second point… No, page filing is an actual swap file written to a formatted partition (which can be viewed and deleted if you dare), while reserved space is actually marked as ‘filled space’, without an associated file. Windows relies more heavily on swap files, and yes, XP (and 2k) is much better than previous versions at managing swap file size; it’s still not ideal for my preferences, though.

The real difference here is that systems which make better use of the reserved space do not have to go through the filesystem API to move data around, and thus defragmenting can be done at a much lower system level, rather than relying upon file locks, and block reads and writes at the application level.

This is also why you never have to defragment hard drives under most Unix filesystems; they’re constantly defragmenting themselves in the background as the fragmentation is done, instead of waiting for a cathartic defragmentation event. This is also why, if there is available memory, you sometimes don’t see the hard drive lights blink immediately after issuing a command that writes to disk; the kernel caches the operation, waits until you’re done telling it what to do, waits a little longer to make sure there’s no more data on the way, then figures out how best to store the data.

WinXP, whether using an NTFS, FAT-16 or FAT-32 formatted drive, uses a fixed size partition table. This means you basically get to choose one of three particular filesystem information layouts, only one of which (NTFS) is recommended and ideal for WinXP.

Try writing a program that creates sequentially named 1-byte, 1k, 4k, 16k, 32k, and 64k files in one folder under WinXP, and see how far you get, and if the usable space on the drive decreases more than the size of the files. You will get decently far before running out of file system handles, but the usable disk space will not shrink faster, because of the nature of the static FAT size.
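A scaled-down sketch of that experiment in Python, for the curious. Everything here is an assumption made for illustration: the function name fill_directory, the file naming, and the deliberately small count of 50 files per size written into a temporary folder rather than "as far as you can get". The free-space figure comes from shutil.disk_usage and can be noisy, since write caching and other programs also affect it.

```python
import os
import shutil
import tempfile


def fill_directory(directory: str, sizes: list, files_per_size: int) -> int:
    """Write sequentially named files of each size; return the bytes of data written."""
    written = 0
    index = 0
    for size in sizes:
        for _ in range(files_per_size):
            path = os.path.join(directory, f"file_{index:06d}.bin")
            with open(path, "wb") as handle:
                handle.write(b"\0" * size)
            written += size
            index += 1
    return written


if __name__ == "__main__":
    # 1-byte, 1k, 4k, 16k, 32k, and 64k files, as suggested in the post.
    sizes = [1, 1024, 4 * 1024, 16 * 1024, 32 * 1024, 64 * 1024]
    with tempfile.TemporaryDirectory() as folder:
        free_before = shutil.disk_usage(folder).free
        data_bytes = fill_directory(folder, sizes, files_per_size=50)
        free_after = shutil.disk_usage(folder).free
        print(f"data written:      {data_bytes:,} bytes")
        print(f"free space change: {free_before - free_after:,} bytes")
```

Comparing the two printed numbers on different filesystems gives a rough feel for how much space goes to bookkeeping and allocation granularity rather than to the file contents themselves.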

All in all, I like XP Professional just fine. Best Windows OS yet; I was happy with 2k and still use it at home though. It’s very obviously designed for user machines, not servers; I hope Microsoft continues that trend and separates their product line more and more over time, resulting in better fits for both the desktop and server sides.

Cheers,

[QUOTE]
This has actually been a problem for me in the past due to large folders with tons of tiny files (like CVS transaction logs).

Understood, but please understand, whilst doing tech support for home and small business, we rarely run across anyone who can spell CVS, much less know what it is.

I agree with you about XP Pro though. I switched over from 2K to XP Pro mostly to familiarize myself with the subtle nuances of it, which tends to be helpful when the OEM manufacturer you work for offers only the choice of XP Home or XP Pro to all of its home users. There has been a noticeable difference in the number of calls we receive about OS issues, due primarily to the fact that XP is superior to any of the Win 9X kernel OSes, particularly Windows ME.

Yes, I saw that. My only–kinda nit-picky and long winded–points were that the number 1024 was derived from the binary and not hexadecimal system, and that Windows does math in Base2 instead of Base16.