I’ve been told that the main reason we’re all switching to 64 bit operating systems is that a 32 bit processor can only address about 4 gigabytes of memory (because 2^32 is about 4 billion). I get that.
AND YET…the original Nintendo NES was an 8 bit console, and in theory it shouldn’t have more than 255 bits of memory (2^8-1) but, according to Wikipedia, it had like 2kb of memory. Also, the SNES, which was a 16-bit console, had 128kb of memory, even though 2^16 is only 64kb.
The Wikipedia article talks about hacks and stuff to get around this problem. But I don’t get it. Can someone explain this to me? Why couldn’t a normal 32 bit processor take advantage of the same hacks and work-arounds to address more than 4gigs of memory?
Be warned that I’m not an expert on old computers, but I think I understand the process. They used “Bus Switching.”
It essentially works like a CD changer. You can only have one disk playing in a disk player at a time, right? But with a CD changer you can have way more than one disk's worth of songs: you have two or three CDs in there, and with the press of a button you can switch which CD you're currently using.
It’s pretty much the same with Bus switching – you have multiple sets of RAM (or ROM)*, and some way to switch between them. That way even though you can only access 256 locations in memory quickly, if you switch buses you effectively double that.
Now, in reality there are considerations that make this a bit more complicated. For instance, your currently executing code is probably in memory. How do you solve this? Well, the CD changer analogy breaks down here. What you do is you don’t really have 256*2 slots of memory, you have some slightly smaller amount that share a part, so that you can save state between bus switches.
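The scheme can be sketched as a toy model (hypothetical Python, nothing like real NES circuitry): an 8-bit address reaches only 256 slots, but a separate select latch chooses which physical bank those slots map to, with a small shared region that survives switches.

```python
# Toy model of bus/bank switching: an 8-bit CPU address (0-255) reaches
# only 256 slots, but a select latch swaps which physical bank the upper
# region maps to. Addresses 0-63 are "shared" and always map to bank 0,
# so state survives a switch. (The 64-slot split is made up for the demo.)
class BankedMemory:
    SHARED = 64  # low addresses common to every bank

    def __init__(self, num_banks=2):
        self.banks = [[0] * 256 for _ in range(num_banks)]
        self.current = 0  # the select latch

    def select_bank(self, n):
        self.current = n

    def _bank_for(self, addr):
        return 0 if addr < self.SHARED else self.current

    def read(self, addr):
        return self.banks[self._bank_for(addr)][addr]

    def write(self, addr, value):
        self.banks[self._bank_for(addr)][addr] = value

mem = BankedMemory()
mem.write(200, 111)   # lands in bank 0's upper region
mem.select_bank(1)
mem.write(200, 222)   # same address, different physical storage
mem.write(10, 99)     # shared region: visible from any bank
mem.select_bank(0)
print(mem.read(200))  # 111 again after switching back
print(mem.read(10))   # 99 -- the shared region survived the switch
```

Two banks of 256 slots give you 448 usable slots here rather than 512, which is the "share a part, so that you can save state" trade-off described above.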
So if we can just bus switch, why bother moving to more bits? Well, like I said, I'm not an expert. For one, I assume it's simply slower to have to deal with a bus switch. The system is also more complex, meaning things could go wrong more readily. There's probably a host of other problems too. One thing to note (something that's not really that big of an issue moving from 32-bit to 64-bit, but certainly could have been a problem moving from 8-bit to 16-bit) is that you have to be careful with memory organization.
You have to be careful that you don't end up with an asset spread across two buses. Say you've filled 248 of your 256 slots and have an asset 16 slots long to store. In 8-bit, even with bus switching, that's no good: you have to find something 8 slots long to fill that gap and put your asset on the other bus. In 16-bit, you can just put it there.
I’m handwaving here a bit. For ROM it’s not that you have two slots, so much as you have two separate “CD drives” (pins that read the ROM) that you switch between, I think.
Jragon: Wonderful explanation, except it’s called bank switching.
what do I type here: The CPU used in the NES had 16-bit addresses, which was common for eight-bit CPUs. Sixteen-bit CPUs generally had 16-bit addresses, though.
There’s other benefits to 64-bit beyond the ability to address a large amount of memory. A simple one is the ability to hold the number of seconds since January 1st, 1970. There’s a bunch of legacy code which assumes that a time value can be held in a single integer variable. Changing the code to use a larger integer variable that takes 64 bits is a pretty easy affair. A lot of cryptographic and other math-focused applications can run faster with 64-bits since there’s a lot more room before you have to start using BigInt libraries that split a number over an array of variables.
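For instance, a signed 32-bit time value runs out in January 2038, while 64 bits pushes the limit out by billions of years. A quick check, assuming Unix time counted in seconds:

```python
from datetime import datetime, timezone

# Largest second count a signed 32-bit integer can hold.
max_32bit = 2**31 - 1
rollover = datetime.fromtimestamp(max_32bit, tz=timezone.utc)
print(rollover)  # 2038-01-19 03:14:07+00:00 -- the "Year 2038" limit

# A signed 64-bit counter of seconds lasts roughly 292 billion years.
years_64bit = (2**63 - 1) / (365.25 * 24 * 3600)
print(f"{years_64bit:.2e} years")
```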
In several ways, it was just time for the x86 platform to be expanded to a 64-bit processor. They could have added extensions to the standard for a more complex bus system or other ways to stretch the shelf life of 32-bit, but there really wasn’t any value in it. People were already feeling like Intel and AMD were dragging their feet at the time the first 64-bit standard was released. Running out of memory was just the thing that forced the issue to be decided.
Limited payoff compared to the complexity, at a guess. 32 bits was nearly sufficient for all purposes as it was, for decades. With 64 bits, for example, you can reference any single byte on an 18.5-million-terabyte hard drive. At the current rate of progression of about 2x the maximum purchasable hard drive space every 5 years, a drive of that size shouldn't become available for another 120 years or so.
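The arithmetic behind that figure, assuming decimal terabytes:

```python
# 64-bit addresses can name 2**64 distinct bytes.
total_bytes = 2**64
terabytes = total_bytes / 10**12  # decimal TB
print(total_bytes)                          # 18446744073709551616
print(f"{terabytes / 1e6:.2f} million TB")  # 18.45 million TB
```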
A 2048-bit processor might be useful for cryptography, but if you wanted a processor for cryptography, you’d do better to make it a processor which can perform actual cryptographic functions in hardware, as an add-on for your system. I’m pretty sure such a thing already exists.
Also, another way you can emulate a higher bit level is with SIMD instructions. That stands for “Single Instruction, Multiple Data.” It’s effectively a way of saying “yeah… technically you can only use 64 bits in a register, but we have a fancy function that will let you use two registers for this at once.” But this can’t be used for memory access as far as I know.
I believe modern processors use SIMD instructions already for math, so you effectively have 128-bit calculation (if you want it at least). This isn’t including GPUs, which are essentially giant blocks of pure optimized SIMD (which is why people say they’re “optimized for matrix operations”, because they can do all the crazy stuff in parallel).
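The flavor of the trick can be shown in the other direction with "SIMD within a register" (SWAR): pack two narrow values into one wide register and operate on both lanes with a single add. A pure-Python sketch of the idea (real SIMD uses dedicated instructions like SSE/AVX, not this):

```python
LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1

def pack(lo, hi):
    """Place two 32-bit lane values into one 64-bit word."""
    return (hi << LANE_BITS) | lo

def unpack(word):
    return word & LANE_MASK, (word >> LANE_BITS) & LANE_MASK

# One 64-bit addition performs two 32-bit additions at once,
# valid as long as neither lane overflows into its neighbor.
a = pack(5, 7)
b = pack(10, 20)
s = a + b
print(unpack(s))  # (15, 27)
```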
Reply – I think the reason is just a cost reason: it's more expensive to make a 128-bit processor, so there's no point unless it's needed. I'm a computer scientist, not a computer engineer, so this is where my knowledge really starts to break down, however. I only know enough about the physical level to get by. But usually when stuff like this happens, it's just a cost/benefit reason.
I understand AMD is developing a 128-bit processor, but that it has no real practical benefit in performance, so it’s mostly just engineer thumb twiddling unless you’re doing some hardcore military research or something that will actually use over 64 bits of addressable memory.
I, for one, am incredibly terrified of the doomsday scenario that is the 64-bit POSIX wraparound date. Won’t somebody think of the children!? Or at least the cosmic dust that will still exist by that point?
There are various ways to increase address space. A most famous example is the Intel 8086 circa 1978 which used “segment registers” to allow 2^20 bytes (1 Megabyte) on a 16-bit machine. The concept of segment registers could have provided more than just 4 extra address bits from the get-go; that it didn’t is an example of how designers in that era hugely underestimated the coming rapid growth in memory sizes.
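The 8086 scheme formed a 20-bit physical address by shifting the 16-bit segment register left four bits and adding the 16-bit offset:

```python
def physical_address(segment, offset):
    """8086 real-mode address: segment * 16 + offset, 20 bits total."""
    return ((segment << 4) + offset) & 0xFFFFF  # wraps at 1 MB

# Many different segment:offset pairs alias the same byte.
print(hex(physical_address(0x1234, 0x0005)))  # 0x12345
print(hex(physical_address(0xFFFF, 0x000F)))  # 0xfffff -- top of 1 MB
```

Shifting by four is what yields exactly 4 extra address bits; a design that shifted further, or used wider segment registers, could have bought far more headroom.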
About that time I did initial paper designs, both circuitry and modifications to IBM’s MVS, that would allow top-of-the-line IBM 370’s to address (via MVS paging algorithm) memories above 16 Mbytes, but of course IBM soon obviated the need for that.
During the late 1970’s I witnessed a variety of bumpings into hard and soft addressing limitations. IBM needed to modify DOS/VS to run on 8 megabytes – initialization code thought that size was a negative number! The 370/138 needed its microcode modified to display more than 1 megabyte from the diagnostic console – that machine is so tight for work storage, the high nybl of the address was used as a line count in that routine! I’d give more examples, but a smart Googler might then be able to discover my secret identity.
Actually, a couple of things should be clarified. First off, the chip in the NES (a 6502 variant) may have an 8-bit data bus, but it has 16-bit addressing, so it can refer to 2^16 locations, or 65536. Also, computers don't work on individual bits; they work on groups of bits called a byte (usually 8 bits, which is what the 6502 uses). So, put together, the memory space that the chip in the NES natively supports is 64 kilobytes (65536 bytes). Of course, I don't know if it was crippled in the NES and could actually support that much memory. (If I remember correctly, the Genesis in theory could do 24-bit addressing, which is 16 megabytes, but was wired up for only 5 megabytes. Actually, I just looked it up, and the SNES' address bus is 24-bit as well, so it supports 16 megabytes natively, but who knows how the cartridge port is wired up.)
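The arithmetic for the bus widths mentioned, assuming byte-addressable memory:

```python
# Address-space size for each address width in the thread.
for bits in (16, 24, 32):
    print(f"{bits}-bit addresses -> {2**bits:,} bytes")
# 16-bit -> 65,536 bytes        (64 KB, the 6502's reach)
# 24-bit -> 16,777,216 bytes    (16 MB, the SNES address bus)
# 32-bit -> 4,294,967,296 bytes (4 GB)
```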
Actually, x86 processors since the Pentium Pro have supported 36-bit addressing through PAE and could theoretically use up to 64 GB of memory. Unfortunately, in the consumer space some notorious hardware manufacturers (hi, nvidia!) assumed there would be no memory addresses in use above 4 GB when they wrote their drivers, so if a system had more than 4 GB of memory the driver could crash and take the whole system down. So 32-bit non-server versions of Windows are forever doomed to see no more than 4 GB of memory.
It isn't really about addressing, it's about bandwidth. The wider the memory bus, the more data that can be sent out in each memory cycle. From a CPU perspective, data isn't just the data stored in memory; it is also the memory address, which has to be transmitted before the data is read or written. A 64-bit memory bus allows an address up to 64 bits wide to be sent out in one cycle, and up to 64 bits of memory to be read or written in one cycle (I'm speaking of logical cycles, which may or may not correspond to the actual physical cycles that a CPU uses). It should be noted that a processor of any bit width may not use that full width on its memory bus. As an example, the Intel 8088 chip was a logical equivalent to the 16-bit 8086 chip. It had a 20-bit address bus, but only an 8-bit data bus. It took two cycles to load each 16-bit word from memory in two pieces.
It takes 64 wires to handle a 64-bit address. That means you need 64 traces (lines on the circuit board) from the CPU socket to the memory controller, bus controller, etc. And it takes 64 pins on the CPU socket to connect those 64 traces to the CPU. So it’s extremely wasteful and expensive to have a wider memory bus than necessary.
x86 CPUs have had a 64-bit memory bus since the Pentium Pro, up until the Athlon 64 and later the Core i-series. Data bus width does not have much if anything to do with the “bitness” of the CPU.
The 8088 was advertised as a 16-bit processor, but in terms of bandwidth it was only an 8-bit processor. My point is that there is a difference between the bit width a CPU uses internally and the bit width of its memory bus, which also may have differing widths for addresses and data.
To get more directly to the point of the OP, there isn’t enough memory in the world to be addressed by 64 bits, so the increase in bit width is not really about addressing more memory, it’s about increasing bandwidth for all memory operations.
The address width need not have much to do with the data width. As noted above, many 8 bit CPU designs had 16 bit addresses. If you want a very regular instruction set making the two the same width helps. Then your registers can become general purpose, and you can treat data values and addresses the same, and use many of the same instructions on them. But many many CPU designs don’t do this. And further, as noted, actual implementations can, and do, simply drop some of the top bits off. Since there is no computer on the planet large enough to hold enough memory to require a physical 64 bit address, CPUs simply don’t implement the top order bits on the memory control bus. Modern CPUs have the memory controller on chip anyway. Off chip the interface only needs enough bits to address the physical memory likely to be installed.
The number of machines with totally different data and address widths abounds. They need not even be powers of two. Take the CDC Cyber series: 60-bit data, 24-bit address, with different registers for each. DSP chips can be even worse. Even an x86 family chip is weird, with 80-bit floating point and 128-bit wide SIMD registers. And so it goes.
If you want a real headache have a look at a PIC. True Harvard architecture - two separate address spaces - one for data, one for program code. Different address widths for each. Utterly bizarre mechanisms to get additional bits into the addresses. You could be forgiven for wondering what sort of peculiarly bad hallucinogenic drugs the designers were imbibing. But your life is filled with them, everywhere.
Now, there are reasons for having very large virtual addresses. Single address space operating systems (SASOS) embed the entire world in the address space. There is no file system, no abstraction outside a flat address space. However apart from the IBM AS400 and its successors these have mostly remained research ideas.
One needs to distinguish between the virtual address width handled and the physical address width, and the data width. These are all different. No x86 chip does or ever has supported a 64 bit physical address width. 64 bit virtual addresses are not fully supported either, being subject to some interesting rules.
All x86-64 machines have instructions and registers that handle 64 bit quantities, and memory controller interfaces that will treat 64 bits as the atom of transfer (at least as far as the cache controller - once here transfers to and from memory will occur in cache line widths, which is a multiple of the data width.)
The actual transfer mechanism on the bus from the CPU to memory is a separate miracle again. You can see 8 bit wide memory buses on modern systems. This is simply because wider buses result in so much clock skew that they are slower than a narrower bus that can be clocked much faster. The architecture of the memory bus is quite independent of the CPU architecture. Most transfers occur as cache lines, not as base data values, and are thus even further removed from the CPU’s data width.
As a side note, one of the big benefits of the 64-bit systems is that programs run faster. They do this despite the programs being bigger (needing more bytes of memory to encode the instructions, since the instructions often include 64-bit addresses rather than 32-bit addresses).
The reason for this is “because they could”.
For quite a long time now, we’ve been restricted to a relatively old CPU instruction set, providing backwards compatibility. They made the CPUs faster, and added some new stuff, but no major instruction set improvements (other than floating point math), so that old programs would continue to run, and that programs compiled today could run on older CPUs.
Well, when they decided to pull the trigger on an architecture change, they took the opportunity to add a lot more internal “general purpose registers”. These babies are really fast and right there for the CPU to use with no latency (other than pipelining, let’s not go there). Having all these registers makes it easier for the compiler to generate far more efficient code; not storing stuff to memory and reading it back when needed a moment later.
It’s a shame they didn’t add even more registers. We won’t need to do the bit-width expansion again for quite a while.
I started out on 12-bit computers, then “advanced” to 8-bit and 16-bit (both of which had 16-bit addresses, from the programmer’s point of view … sort of anyway). Well I remember the hassles as we switched from 16 bits to 32 bits (independently for addresses and data, unfortunately complicating matters). And now the switch to 64 bits, handled considerably better than the earlier one.
According to Moore’s law, we essentially use one more bit every 18 months (or whatever). Was it 24 years between the last changeup and this one? Close enough, I believe. If the trend holds, it’ll be 32 * 1.5 = 48 years before the next one, so at 55 I’m off the hook!
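The back-of-the-envelope math there, assuming one address bit every 18 months:

```python
years_per_bit = 1.5  # "one more bit every 18 months"
print(16 * years_per_bit)  # 24.0 years for the 16 -> 32 transition
print(32 * years_per_bit)  # 48.0 years before 64 bits runs out
```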
Sigh. Data bus width does not determine what "bitness" the CPU is. The "bitness" of the CPU is generally held to be the width of the main registers and the width of the ALU(s). The 8088 was a 16-bit CPU because its GPRs were 16 bits wide and the ALU had a data word size of 16 bits. That Intel lopped off half of the bus pins to make it cheaper does not make it an 8-bit CPU in any way, shape, or form. Besides, even though it was a 16-bit CPU, it had (kludgy) 20-bit addressing to support up to 1 MB of memory.
Another example- the 64-bit PowerPC 970 (G5) had two 32-bit unidirectional buses (one read, one write) to the system/memory controller. Does that make it a 32-bit CPU? No.
You are still confusing bus width and memory addressing. Stop it. The width of the data bus has nothing to do with the amount of addressable memory. The width of the data bus in part defines bandwidth. The "bitness" of the processor's ALUs and AGUs defines how much memory it can address.