Why hexadecimal instead of base-32 or higher?

Didn’t Russia experiment with 4-state bits (a switch with four distinct levels: 0, 1, 2, 3)? It didn’t work out, obviously, but I thought I read something about that.

Are you thinking of the Russian Setun computer? It was called ternary, but actually used two ordinary two-level signals or latches to represent a trit, with a fourth illegal state playing an error-checking role.

[historical anecdote]Another computer that used two bits to represent a trit, in one particular case, was a CII-Honeywell-Bull machine that used base-3 parity-checking on main memory. When they upgraded the design to use ECC, the ternary check-datum still had to be maintained: some programmers had taken advantage of the fact that they could squeeze an extra bit into a word in 1/3 of the cases. :rolleyes: [/anecdote]

I think it also had a lot to do with the cost of memory, which was astronomical all thru the golden age of mainframe computers (and thru the beginnings of home PCs as well). In terms of powers of 2:

8-bit = 2[sup]8[/sup] = 256 = Too small
32-bit = 2[sup]32[/sup] = 4,294,967,296 = Way, way, way too big!
16-bit = 2[sup]16[/sup] = 65,536 = Just right

IOW 64K (2[sup]16[/sup] addresses) was in the Goldilocks zone in terms of usefulness vs. cost. Home PCs followed a similar pattern. The original 8086 was 16-bit, but the later, cheaper (and therefore more popular) 8088 had only an 8-bit external data bus. The 80286 was 16-bit. The 80386 came in several versions to balance cost and performance (the 386SX was 32-bit internally but 16-bit externally). Finally, with the 80486 (and all the later models), 32-bit became the standard for nearly two decades. Only fairly recently have we finally moved to 64-bit computing.

Eh. Less convenient, but it didn’t suck. You just got used to the fact that a byte would only go up to 377[sub]8[/sub]; a 7-bit ASCII character would only go up to 177[sub]8[/sub]. The advantage of octal was that it was a little easier to remember the bit patterns for 0-7 than for 0-F. Remember, these representations are purely for the convenience of the programmer/user. The computer doesn’t really care.
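Just to make it concrete, here's a quick Python sketch (purely illustrative) showing that octal and hex are just two ways of spelling the same bit patterns:

[code]
# Octal vs. hex views of the same bytes: each octal digit covers 3 bits,
# each hex digit covers 4, but the underlying bit pattern is identical.
for value in (0o377, 0o177):           # max 8-bit byte, max 7-bit ASCII char
    print(f"bin {value:08b} = oct {value:03o} = hex {value:02X} = dec {value}")
[/code]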

I have an abacus with five beads in the lower section and two beads in the upper. In base 10 I only use four lower and one upper, but using all the beads it’s base 16, which I understand has been used forever in some transactions in China. I tracked my calls one day in hex, but it was a headache. /aside

Besides the functional advantages of base-16, IBM also wanted to use something different to sell the idea that the 360 was a completely new and different system.

But this is only true for CISC designs. The RISC (Reduced Instruction Set Computer) approach being pioneered at that time by Seymour Cray in the early supercomputers (and used today in modern graphics chips) could do fine with fewer instruction codes.

And speaking of Seymour Cray –

The entire line of Control Data CDC 6400/6600/7600 mainframes had 60-bit words that were only word-addressable, not byte-addressable. The various register banks came in groups of 8: X0-X7, A0-A7, and B0-B7 (B0 was hardwired to read as zero); so it took 3 bits to specify one particular register in each of these groups. Instruction codes were 6 bits. The address and index registers (the Ax and Bx registers) were 18 bits.

Everything about these computers fell neatly into multiples of 3 or 6 bits, so octal was the natural human-readable base for them.

ETA: And the character code used 6-bit characters too, not 8-bit.
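To illustrate the octal point, a toy Python sketch (assuming the short 15-bit instruction parcel was laid out just as described above, a 6-bit opcode plus three 3-bit register designators; the field names are mine):

[code]
# Toy decoder for a 15-bit instruction parcel: 6-bit opcode followed by
# three 3-bit register designators (i, j, k).  Because every field is a
# multiple of 3 bits, the octal form reads off field-by-field, with no
# octal digit straddling two fields.
def decode_parcel(word):
    op = (word >> 9) & 0o77     # 6-bit opcode = two octal digits
    i  = (word >> 6) & 0o7      # each 3-bit register field = one octal digit
    j  = (word >> 3) & 0o7
    k  =  word       & 0o7
    return op, i, j, k

parcel = 0o36123                # octal digits: 36 | 1 | 2 | 3
print([oct(f) for f in decode_parcel(parcel)])   # ['0o36', '0o1', '0o2', '0o3']
[/code]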

It’s nearly as easy in base 32: each digit then converts to 5 bits. Unfortunately, you then have to remember 22 different bit patterns above decimal 9, instead of only 6.

I would say that is significantly less easy, not nearly as easy. And the alignment problems make the idea of using base-32 moot on anything but super-weird architectures.

Bingo.

As I recall, the only 16-bit machine to use octal was the DEC PDP-11, and that was because machine instructions were organized into 3-bit groups. For example, address mode and register number were both 3-bit fields, and certain address offsets less than a full word were some multiple of 3 bits. It was actually a PITA, though, because you’d often want to see both the octal and the hex (because addresses in the instruction stream that were full 16-bit words were far easier to read in hex).
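For instance, something like this Python sketch of the double-operand format, as best I remember it (a 4-bit opcode, then mode/register pairs; treat the details as approximate):

[code]
# Sketch of a PDP-11-style double-operand instruction word: 4-bit opcode,
# then 3-bit mode and 3-bit register for the source, and again for the
# destination.  The mode/register fields each line up with one octal digit,
# which is why octal listings were the convention on that machine.
def decode_double_operand(word):
    opcode   = (word >> 12) & 0o17
    src_mode = (word >> 9)  & 0o7
    src_reg  = (word >> 6)  & 0o7
    dst_mode = (word >> 3)  & 0o7
    dst_reg  =  word        & 0o7
    return opcode, src_mode, src_reg, dst_mode, dst_reg

word = 0o010102        # the classic MOV R1, R2 as it appears in octal listings
print(decode_double_operand(word))   # (1, 0, 1, 0, 2)
[/code]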

Yeah, I’d hate to work in base 32! It took a while to learn to add and subtract hex in my head. I doubt I’d ever master 10 numerical digits plus 22 letters. Quick: what’s n + v?
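For the record, a quick Python check (just for fun; Python's int() happens to accept bases up to 36):

[code]
# What's n + v in base 32?  With digits 0-9 then a-v, n = 23 and v = 31.
DIGITS = "0123456789abcdefghijklmnopqrstuv"

def to_base32(n):
    out = ""
    while True:
        n, r = divmod(n, 32)
        out = DIGITS[r] + out
        if n == 0:
            return out

total = int("n", 32) + int("v", 32)     # 23 + 31 = 54
print(to_base32(total))                 # "1m"
[/code]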

So, I’m very glad that we don’t have 10-bit bytes. That would be even worse than the damn nuisance of little-endian byte order.

Good point. If you went much above base 32, you would have to start picking symbols. What are you going to use? Sure, maybe you can start with Greek and Hebrew letters, but then what? Cyrillic? Arabic? Linear B? Aramaic? Viking Runes? Egyptian Hieroglyphs? Squiggles and Smiley Faces? Can anyone really master so many symbols at the same time?

Quick, how much is 24AלֶΣع9كZ3הWΓש plus ΔД4Ψ4QΠՃקЯ? Don’t forget to carry the ⴴ. Can you divide ႥИ3R0Ж by تاΩ2Θג?

I’m not sure, but I do know it contains %$&%^*!!# at least once in the calculation process. At least it does when I do it :wink:

That’s why proposals to go to higher bases rarely earn grades above Ռ. A few years ago, a master’s thesis proposing a base 128 system earned a GPA-equivalent grade of Վ.መᛗ, but that triggered a review of the school’s academic standards and resulted in the District Խ (section ב) Accreditation Agency issuing a Level Щ revocation warning with a ΞևЛ-month probationary period.

The 16-bit Data General Nova was also generally documented with octal, even though its instruction set did not map nicely to the octal digits.

There is also a cause-and-effect relationship that operates the other way. I know of machines with 16-bit or 32-bit control words in which the controls were organized into nibbles and bytes, e.g. 4+4+4+4, even though there was no hardware reason for it, i.e. when a 4+5+2+5 decoding might have been more efficient.
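Purely as an illustration (nothing machine-specific here, just how the carving-up works), in Python:

[code]
# The same 16-bit control word carved up two ways.  Nothing in the hardware
# forces the 4+4+4+4 split; a 4+5+2+5 decoding is just as mechanical to
# extract, it just doesn't line up with hex (or octal) digits.
def split_fields(word, widths):
    """Split a word into fields of the given widths, most significant first."""
    fields = []
    shift = sum(widths)
    for w in widths:
        shift -= w
        fields.append((word >> shift) & ((1 << w) - 1))
    return fields

word = 0xA5C3
print(split_fields(word, [4, 4, 4, 4]))   # [10, 5, 12, 3]: one hex digit per field
print(split_fields(word, [4, 5, 2, 5]))   # same bits, different field boundaries
[/code]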

~ ~ ~ ~ ~ ~

On the other topic, I find it very unlikely that base-32 arithmetic with 32 symbols would be as easy to learn as hexadecimal. Note that human counting systems with bases above 10 were always hybrid systems. The Mayan base-20 in effect alternated base 4 and base 5; Babylonian base-60 alternated 6 and 10.

Oops. I should have remembered that, since I coded DG Nova and Eclipse for a few years. That’s an architecture that did not have direct byte addressability in the hardware.

Mostly what I remember from Nova/Eclipse was that they seemed to have chosen the hard way to do just about everything. Ick.

Well now you’re just being silly.

The PDP-10 and its predecessor the PDP-6 used standard 7-bit ASCII for text, but there was also a subset used for system functions like filenames that used 6-bit character codes, denoted in assembly code with the pseudo-op “SIXBIT”. This scheme allowed 6-character filenames to be encoded in a single 36-bit word (and the 3-character extension in exactly half of another one). That, and the preponderance of architectures whose word sizes were multiples of 3 bits (like the 12-bit PDP-8), probably explain the legacy of 6-character filenames that were inherited by DOS and then early versions of Windows and persisted until Windows 95. It was also just about the right length to have meaningful names without wasting precious storage resources.
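Here's roughly how the packing works, as I understand it (a Python sketch; the SIXBIT code is just the ASCII code minus 32, so only the characters from space through underscore exist):

[code]
# Rough sketch of DEC SIXBIT packing: ASCII code minus 32 (octal 040) gives
# a 6-bit code for the printing characters space through underscore, which
# covers upper-case filenames.  Six such codes fill one 36-bit word exactly.
def sixbit_word(name):
    assert len(name) <= 6
    word = 0
    for ch in name.upper().ljust(6):          # pad short names with spaces
        code = ord(ch) - 0o40
        assert 0 <= code <= 0o77, f"{ch!r} has no SIXBIT code"
        word = (word << 6) | code
    return word

print(oct(sixbit_word("MYFILE")))    # the whole 6-character name in one word
[/code]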

MS-DOS actually had eight-character filenames (plus a three-char extension.)

Oh, and recently I was working on something where quaternary really would have been handy. There’s an instruction called SHUFPS on some processors (x86 with SSE). One register can hold 128 bits’ worth of stuff, and there are instructions that let you treat all 128 bits at once as four 32-bit floating point numbers.

The SHUFPS instruction, when used on a single register, allows you to “shuffle” the four numbers: if I have the vector [1, 2, 3, 4], I can shuffle it to [4, 2, 1, 3] or even duplicate elements, like [4, 4, 4, 3].

SHUFPS accomplishes this with a single 8-bit number, with each two bits specifying which index the new number will be from (the field for element 0 sits in the low two bits). So to go from [1,2,3,4] to [4,3,2,1] the fields are 3 2 1 0 (since we start counting with the first element at zero), which packs into [00][01][10][11]. The assembler I was using supports decimal, octal, or hex literals, so I had to convert this to 0x1B. It’s admittedly not onerous at all since I’m used to hex, but I really would have preferred to be able to just pass 4b_3210 or 4b_3332.
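In case it helps, a little Python sketch of how the two-bit fields pack (the helper is made up, and it lists the selectors element 0 first, like the 4b_ notation above):

[code]
# Build a SHUFPS-style immediate from four element selectors.  With the same
# register as both source and destination, imm bits [1:0] pick result element
# 0, bits [3:2] element 1, and so on.
def shufps_imm(sel0, sel1, sel2, sel3):
    for s in (sel0, sel1, sel2, sel3):
        assert 0 <= s <= 3, "each selector must fit in two bits"
    return sel0 | (sel1 << 2) | (sel2 << 4) | (sel3 << 6)

print(hex(shufps_imm(3, 2, 1, 0)))   # 0x1b : [1,2,3,4] -> [4,3,2,1]
print(hex(shufps_imm(3, 3, 3, 2)))   # 0xbf : the "4b_3332" case, [4,4,4,3]
print(hex(shufps_imm(0, 1, 2, 3)))   # 0xe4 : the identity shuffle, for reference
[/code]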

D’oh! :o I should limit myself to discussions of ancient history like the PDP-8, PDP-11, and PDP-10! I was confused by the recollection that the OS/8 system on the PDP-8 was modeled on the PDP-10 and had the same 6+3 filename structure (even though 8+3 would have been just as easy since it could only store 2 characters per word anyway). Thanks for fighting ignorance – or, in this case, a defective memory!