Why hexadecimal instead of base-32 or higher?

In computing, what’s so special about base-16 that it’s used as the most common representation of binary numbers? Why didn’t we go with base-32, etc.?

The purpose of hexadecimal notation is to make binary numbers easy to read for humans. 16 is a good number of symbols for three reasons:

  1. It’s a power of two (as is 32) so each hex digit corresponds to an exact number of binary digits.
  2. Base-16 is also a small enough set of symbols to be comprehensible. A base-32 system would require nearly every letter of the alphabet. How much is 3L8P? It’s very hard to calculate in your head.
  3. Each base-16 digit corresponds to four binary digits, but a base-32 digit would correspond to five binary digits. If you have eight-bit bytes, then base-32 would not align on byte boundaries. You would have to go all the way to base-256!
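
To make point 3 concrete, here's a minimal sketch (Python, purely for illustration) that prints the same value in hex and then in base-32 using the 0-9/A-V digit set assumed here. Notice how the hex digits line up exactly with the bytes while the base-32 digits don't:

```
value = 0xDEADBEEF

# Hex: every digit is exactly 4 bits, so two digits == one 8-bit byte.
print(f"{value:08X}")            # DEADBEEF -> digit boundaries match byte boundaries

# Base-32: every digit is 5 bits, so digit boundaries drift across the bytes.
digits = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
n, out = value, ""
while n:
    out = digits[n % 32] + out
    n //= 32
print(out)                       # 3FARFNF -> no clean relationship to the four bytes
```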

Note that base 64 is sometimes used for compression.

You know, I don’t think that was ever covered in my Intro to Computers class. But I can make a guess…

I suspect the main issue is ease of use. In hexadecimal, the digits are typically 0-1-2-3-4-5-6-7-8-9-A-B-C-D-E-F; 16 single-character digits. If you moved up to base-32, you’d have to extend the letters all the way up to “V”. Now, if I ask you to convert the hexadecimal number BD, a little math sorts out that it’s 189. However, if I ask you to convert the base-32 number VQ, it takes a bit more work. In short, hexadecimal is just easier to use on a routine basis, particularly considering the early days when Assembler was about the most advanced computer language available.
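
(If you want to check those without the mental math: Python’s int() happens to accept any base up to 36 with exactly this digits-then-letters scheme, so a quick sanity check looks like this.)

```
print(int("BD", 16))   # 189  -- 11*16 + 13, easy enough in your head
print(int("VQ", 32))   # 1018 -- 31*32 + 26, rather more work mentally
```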

…or, what they said. :confused:

Not for compression - for encoding. Base64 actually makes stuff a lot bigger.

Er… right.

Bigger if we’re comparing to binary data. But it’s not bigger if we’re talking about text formats. In text formats, binary is 8:1, decimal is 3:1, hex is 2:1, and base-64 is about 1.33:1 (four characters for every three bytes).
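
A rough sketch (Python) of those ratios, using 240 arbitrary bytes so base-64 padding doesn’t skew the numbers:

```
import base64, binascii

data = bytes(range(240))  # 240 arbitrary byte values

encodings = {
    "binary":  "".join(f"{b:08b}" for b in data),   # 8 characters per byte
    "decimal": "".join(f"{b:03d}" for b in data),   # 3 characters per byte
    "hex":     binascii.hexlify(data).decode(),     # 2 characters per byte
    "base64":  base64.b64encode(data).decode(),     # 4 characters per 3 bytes
}
for name, text in encodings.items():
    print(f"{name:8s} {len(text) / len(data):.2f} characters per byte")
# binary 8.00, decimal 3.00, hex 2.00, base64 1.33
```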

I think this is the main reason. Octal used to be fairly popular, because 12-bit machines used to be common and 3-bit groupings nicely divide into that. If we had ended up with 10-bit bytes, base-32 would probably be prevalent.

It’s also relatively easy to convert between hex and binary.

Every hex digit is four binary digits :- if you call this a party trick, I have a party trick where I can convert any arbitrary length hex number to binary in my head.

It’s surprisingly easy :- all you have to remember is the conversions of each hex digit to binary, eg

07 -> 0111
0A -> 1010
0E -> 1110

So deadbeef in binary is :-

1101 1110 1010 1101 1011 1110 1110 1111

(I just typed that, hope it’s right).
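
For anyone who’d rather let the machine do the party trick, here’s the same digit-at-a-time idea in a few lines of Python:

```
# Each hex digit maps independently to its own 4-bit pattern.
HEX_TO_BITS = {d: f"{int(d, 16):04b}" for d in "0123456789abcdef"}

def hex_to_binary(s):
    return " ".join(HEX_TO_BITS[d] for d in s.lower())

print(hex_to_binary("deadbeef"))
# 1101 1110 1010 1101 1011 1110 1110 1111  (it was right)
```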

No, this particular party trick doesn’t impress girls.

Octal is still used in UNIX file permissions, because each file has three permission bits (read, write, execute) for each of three classes (owner, group, other). So “everyone can read and write, but it’s not executable” is mode 666, “everyone can read, write, and execute” is mode 777, “only I can read this” is mode 400, and so on.
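
A small sketch (Python) of how one of those octal modes decomposes; each octal digit is exactly one class’s three bits, which is the whole reason octal survives here (in practice you’d pass such a mode straight to `os.chmod`):

```
mode = 0o666   # rw-rw-rw- : everyone can read and write, nobody can execute

for who, shift in (("owner", 6), ("group", 3), ("other", 0)):
    bits = (mode >> shift) & 0o7          # one octal digit = one class
    perms = ("r" if bits & 4 else "-") + \
            ("w" if bits & 2 else "-") + \
            ("x" if bits & 1 else "-")
    print(f"{who}: {perms}")
# owner: rw-
# group: rw-
# other: rw-
```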

There should have been a [thread closed] following friedo’s comprehensive answer. Nailed it.

Octal was once much more common than hexadecimal, even on machines with, say, 16-bit words (not a multiple of 3). I wonder if IBM’s S/360, with lots of 4-, 8-, and 12-bit fields, was a major turning point.

I’d understood it to be largely the work of Bob Bemer, who popularized the concept that “powers of 2 are magic”, which, when you take the characters and punctuation in American English into account, pretty firmly narrows in on an 8-bit byte.

Maybe you just haven’t found the right girl yet. There was one girl in the comp sci class where I learned hex->bin. She works for Google now.

I can’t speak to times before the 360, but 100% of IBM’s documentation and system input & output for the 360 used hex, not octal.

I’m pretty sure it was. Before System/360, octal was very common and perhaps almost universal, even in IBM (e.g., the 7040, 7044, 7090, and 7094 documentation all used octal notation). It worked well because word length tended to be a multiple of 3: 12-, 18-, and 36-bit word lengths were very common, and computers typically used word addressing. The System/360 was the first large-scale introduction of byte addressing and word and data path widths that were byte-multiples, so octal would have been awkward and hex was a perfect fit. friedo explained the rest in post #2.

The nature of electrical switches leads to base 2 (ones and zeros). These are bits. The bits have no meaning by themselves until they are grouped into bytes.

It really comes down to two questions.
#1 How many bits will you use in a byte?
#2 How will you display the byte in a way humans can recognize?

Bytes serve three purposes. They represent raw numbers, text, or instructions.

When you’re representing text, you need at least 6 bits per byte (2^6=64) just to represent every letter of the Roman alphabet and the digits 0-9. But that doesn’t leave enough room for all the punctuation marks and special characters, like one meaning “end of line” or “end of file”. So they went up to 7 bits per byte (2^7=128), which offered plenty of room for all the letters (upper and lower case), the digits, punctuation marks, and special characters. They called this scheme ASCII (pronounced ASS-key).

This worked great, not only for getting text in and out of a computer, but also for sending text over great distances. And to verify that the data was being sent accurately, they added one more bit, the parity bit. So if you want to send 1100101, you have to stick another 0 in front of it, so the total number of 1s is always even. And if I received 11100101 on the other end, I’d know there had been a transmission error. So we end up with 8 bits per byte.
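
Here’s a minimal sketch (Python, my own framing rather than any standard API) of that even-parity scheme:

```
def add_even_parity(bits7):
    """Prepend a parity bit so the total count of 1s comes out even."""
    return str(bits7.count("1") % 2) + bits7

def parity_ok(bits8):
    return bits8.count("1") % 2 == 0

print(add_even_parity("1100101"))   # 01100101 -- four 1s already, so the parity bit is 0
print(parity_ok("11100101"))        # False    -- five 1s, so a transmission error
```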

But then there’s the second purpose, instructions. At the machine-code level, you need instructions for the CPU like “add together these two numbers” or “skip over the next five instructions and read the sixth one instead”. And it’s pretty darn clumsy trying to write a complete set of instructions with just 32 choices, or even 64. 128 is much better, and 256 is great. In the 1950s, 1960s, and 1970s, it seemed like 256 instructions were more than enough and you could do anything you could ever want with that much room. So, again, 8 bits per byte was the obvious choice. Modern CPUs use more than 256 instructions, so they use 2-byte “words” or sometimes more, but that’s another story.

8 bits per byte was big enough to get both jobs done, and also it just felt right, since 8 is a power of 2.

Then we turn to the second question, how do you represent those numbers 0-255 in a way humans can recognize?

You could do it with binary, like 01001010 et cetera, but that doesn’t work well for humans. You could use ASCII, but many of those characters are invisible. For a while, a popular option was to use octal (base 8), which had the advantage of using the easily recognized digits 0-7, but it was clumsy: each octal digit represents three bits, so a byte needed three octal digits, and the leading digit could never go above 3 even though the other two could go up to 7. Awkward. Then somebody hit on the idea of using base 16, Hexadecimal (Hex for short), with the digits 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F. Now you could use just two digits to represent a byte.
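
To see that octal awkwardness next to hex, here’s a quick Python loop over a few byte values; the leading octal digit never gets past 3, while both hex digits run the full 0-F:

```
for b in (0x00, 0x5A, 0xC3, 0xFF):
    print(f"binary {b:08b}   octal {b:03o}   hex {b:02X}")
# binary 00000000   octal 000   hex 00
# binary 01011010   octal 132   hex 5A
# binary 11000011   octal 303   hex C3
# binary 11111111   octal 377   hex FF
```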

So, if you’re programming a CPU with machine code and you want to say “take these two numbers and add them together” you just type in something like “C3”. It’s not too hard for you, as a human, to read that, and with practice you might even memorize the code without having to look it up in a table every time.

But if we moved up to base 32, what would that get us? The digits would be harder to understand, and you’d still need two of them to represent a byte. That’s no improvement. How about if we want to represent an instruction with just one symbol? Then you need base 256, which means you need 256 different symbols which humans can recognize. That’s a whole new problem unto itself. So that doesn’t work either.

Base 16, for the win.

Octal was very popular in the 70’s and early 80’s. The computers then were 8 bits (1 octet) wide. Hex got popular when 16 bit computers emerged.

But octal would suck for handling 8 bits, as 8 is not a multiple of 3. Octal is three-bit, after all (8=2^3, so it takes three bits to encode each digit).

Hexadecimal, on the other hand, works great for 8 bits, since it’s a 4-bit system.