I am wondering a few things about processors.
I am aware that different programming languages are used, but how is this compiled into something the processor can deal with?
What about the idea that computing power doubles every eighteen months? And is there a limit to how fast a processor can potentially run?
Is there any difference in how Intel processors respond, and are Intel processors better for certain types of work?
Are the processors used in very fast computers, such as the ones used for large scientific experiments, battle simulations and weather forecasting, different from standard processors used at home?
A program reads the files that contain the code written in the programming language and converts it into a series of instructions that the processor understands. For example, if a programming language statement says “a = b + c” (meaning: add b and c, and store the result in a), then that might be translated into the following instructions:
[ol]
[li]Find out what value b has[/li]
[li]Find out what value c has[/li]
[li]Add the two values[/li]
[li]Store the result in a[/li]
[/ol]
Physically, a, b, and c are places inside the computer’s memory (either its “RAM” or special memory contained on the CPU chip itself) that hold numbers. The instructions are stored as a series of numbers that the CPU decodes to figure out what to do.
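To make that concrete, here is a toy sketch in Python of the same idea (a made-up little machine of my own, not any real processor; in a real chip the instructions would be encoded as numbers rather than strings):

[code]
# Toy model: "memory" holds a, b and c; "registers" are the CPU's scratch
# space; the program is the four numbered steps above written as instructions.
memory = {"a": 0, "b": 2, "c": 3}
registers = {}

program = [
    ("LOAD",  "r1", "b"),   # 1. find out what value b has
    ("LOAD",  "r2", "c"),   # 2. find out what value c has
    ("ADD",   "r1", "r2"),  # 3. add the two values
    ("STORE", "r1", "a"),   # 4. store the result in a
]

for op, x, y in program:
    if op == "LOAD":
        registers[x] = memory[y]
    elif op == "ADD":
        registers[x] = registers[x] + registers[y]
    elif op == "STORE":
        memory[y] = registers[x]

print(memory["a"])  # prints 5
[/code]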
This idea (that computing power will double every 18 months) is popularly called “Moore’s Law.” It’s not going to continue indefinitely, and, barring a breakthrough, it’s going to end very soon, if it hasn’t already. Manufacturers are starting to hit physical limitations inherent in the way they currently manufacture computer chips.
All processors are different. However, with some exceptions (e.g., some applications that need lots of RAM require 64-bit processors, while PCs only have 32-bit processors), what ultimately matters is how fast a given CPU can perform arithmetic. From the standpoint of someone writing an application, the CPU it will be running on usually doesn’t matter that much (except to the extent that it influences what operating system and languages will be available for it).
It depends. Currently, the fastest supercomputers are made out of thousands of processors. Sometimes these processors are Intel processors similar to what you’d find in a home PC; sometimes they’re the kind of processors used in traditional Unix workstations and servers from Sun, HP, IBM, etc.
By the time the CPU gets to see the program, the original source language is irrelevant. A compiler program translates the source code into machine code, which is what the processor chomps on. A machine code instruction will be something very simple, like “copy a word of data from register A to register B” or “increment register C”. It’s tough enough writing “hello world” in machine code; good luck writing a flight sim or word processor.
[slight hijack] Actually both Java and C# (and VB.Net etc) “compile” to an intermediate language which is converted to machine code at runtime[/hijack]
The limit on CPUs is gonna be the minimum size at which you can have viable circuitry. I’m a software guy; you need a hardware person to answer this.
CPUs get more powerful by running faster (obviously), by having more on-board (very fast) memory, by growing more powerful instruction sets (I think this is one thing Intel are big on) and by using tricks like predictive processing.
IIRC the real difference between super-computers and regular computers is the speed/bandwidth of communication between the processors rather than the architecture of the processors.
No short way to answer this. You’d have to understand how a microprocessor works at the register level. Basically, how a computer manipulates 1s and 0s and its machine instructions for doing so. Compilers break down the tasks in high level languages to lower levels the machine can actually do.
There are some, such as RF problems and the finite amount of time it takes signals to move around, even on a microprocessor. I think there are ultimate limits but I’m not placing any bets on exactly what they will be.
The differences are kind of blending away. There used to be a gulf between RISC and CISC processors. RISC stands for reduced instruction set computer. Some processors were designed that could do fewer different machine-level tasks but could do them much, much more efficiently than a complex instruction set CPU. Compilers would have to combine more of these simple instructions to accomplish the same high-level task, but the result was often higher overall performance. This isn’t such a big battle anymore since the massive developments in the Wintel world have kept CISC chips out front. HP uses RISC chips in its PA series Unix boxes.
As for which is better, that is not as clear as it used to be. A CPU doesn’t exist in isolation. A computer is a combination of the CPU, all the bits soldered to it, an operating system and application software, so benchmarks between CPUs often are not as meaningful as they appear.
One way to look at this evolution is math processing. Traditional CPUs are bad at math, at least the decimal math that humans are accustomed to. Translating it to binary is clumsy. If we used a base 8 or base 16 number system this would not be the case, but manipulation between decimal and binary is usually somewhat wasteful.

Because of this, Intel invented a dedicated decimal math co-processor, the 8087, to work alongside the 8088/8086 chips. It was a special purpose CPU as opposed to the general purpose 8088. If it was installed, and if one was running software that was written to use this coprocessor, math operations would be dramatically improved. The key was that it had to be written to use the 8087; it wasn’t automatic. When a CPU is doing its register math it doesn’t know that the ultimate goal is decimal math. Some programs such as AutoCAD required the 8087 to even run. Spreadsheets like Lotus 1-2-3 would check for the presence of the 8087 and take advantage if one was present.

In subsequent processors there were parallel math chips, the 80286/80287 and the 80386/80387, but when the 486 came along the math coprocessor was built in. Still, you could buy a less expensive SX version with the math chip disabled. Finally, with the Pentium, the math processor is an integral part of the CPU that can’t be separated, IIRC, so all software can be written to use it.
I’ll talk about the hardware (silicon) part of processor speed improvements, since I see several people are addressing the software.
There are many factors that improve the computing speed of successive generations of processors. I’m going to lump them into three categories:
- Improvements in chip technology. This is the physical technology – how close the metal wiring is together on the chip, the resistance and capacitance on the chip, and how small the transistors are. (If you are interested, there are two properties to look for in CMOS transistors: the gate width, and the gate oxide thickness. They affect speed and power consumption in slightly different ways, but I’m not going to get more technical than that unless someone is interested.) These result in faster transistors and lower power consumption.

- Putting more transistors on a chip. This is closely related to the previous point – you need to be able to pack everything tighter to fit more on a chip – but it’s worth understanding this as a separate point. By doing this, you can put more functional units on your chip – specialized circuitry for floating point operations, graphics, or even additional computing units. (Many of the highest performance processors have a lot of parallelism on the chip these days – meaning they have more than one of any critical block that gets a lot of use.) You can also “pipeline” more, which means dividing tasks and instructions into smaller sub-tasks, each of which now gets done faster. Putting more on a chip means that you get more things done, in a similar way that two people can get more work done than one.

- Improvements in architecture. This is when the designers think, “what could we do differently to make the chip perform better on the same silicon?” Adding an on-chip cache was an improvement to the architecture over previous processors that didn’t have it. (Long time ago.) The pipelining I mentioned in the previous point is an example of an architecture change. So is an improved instruction set. Or changing the memory access width because memory speeds aren’t keeping up with processor speeds. And so forth.
As you can see, each of these builds on the previous. Smaller transistors can speed up your processor some. It also allows more on your chip, so adding other blocks can speed up your chip more, in addition to the improved transistor speed. And putting more on the chip means you can add some architectural features that help more than just adding more circuits. I’ve made this description kind of broad, which makes it a little vague, so if you have questions about the details, ask away.
By the way, keep in mind that Moore’s Law (as Metacom said, that’s the statement about power doubling every 18 months) is an observation, not a prescriptive law. That is, there’s nothing inherent to computing that can be figured out and calculated that says, “yeah, it should be doubling this fast”. Rather, Moore noticed it was increasing, graphed it, and came up with that observation. If we hit a technical wall in producing chips, that could change (although we aren’t as close as people think, in my opinion). It could also change due to a big change in demand.
To some extent, it’s now self-perpetuating. It’s been known for so long now, and consistent enough, that the industry has grown to trust it. Now, corporations use that observation to predict where the market will be years from now, so they can set up their design cycle and roadmap for new products. Then they design to that cycle (try to beat it slightly, but pretty close to that cycle). Therefore, the products they make meet that cycle. I often wonder what, without Moore’s Law being well-known, the industry would be doing. Would the technology be more advanced, or less? I have no idea. It’s not really an answerable question.
People are doing a pretty good job here. One clarification, though: what Gordon Moore really said was that the number of transistors that could be produced for a given cost doubled every 18 months. This was accomplished by shrinking the feature size of the transistors. Coincidentally, the smaller transistors also could run faster, and this is what is popularly thought of as Moore’s law.
Processors designed for supercomputers often have better floating point support, deeper pipelines, and more memory bandwidth. But in the end, as many have pointed out, connecting a bunch of them together with a high speed interconnect is the way to get really high-end performance. Many sites connect older desktop computers together to create low-cost supercomputers.
Others have answered this pretty well, but I thought I’d add a little more detail.
Many modern compilers (Gcc is a good example) are designed as a two-stage assembly line. There is a front end, which translates a particular source language into a common intermediate language. Then there is a back end, which translates the intermediate language into machine language for a particular CPU. The various front ends and back ends operate independently of each other, and ideally are designed to be unaware of the others’ existence.
Thus, if you have M languages and N target architectures, you need only write M + N software components instead of M x N. This is a huge savings, at least when M and N grow beyond 2.
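For illustration, here is a rough Python sketch of that shape (the mini “language” and the two pretend back ends are invented for this post, not how GCC is actually written):

[code]
# Each front end translates its source language into one shared
# intermediate form; each back end turns that form into one CPU's output.

def c_like_frontend(source: str) -> list:
    """Pretend parser for statements like a = b + c."""
    dest, expr = source.split("=")
    lhs, rhs = expr.split("+")
    return [("load", "r1", lhs.strip()), ("load", "r2", rhs.strip()),
            ("add", "r1", "r2"), ("store", "r1", dest.strip())]

def x86ish_backend(ir: list) -> str:
    return "\n".join(" ".join(op) for op in ir)   # stand-in for real codegen

def riscish_backend(ir: list) -> str:
    return "\n".join(f"{op[0].upper()} {', '.join(op[1:])}" for op in ir)

# M front ends and N back ends give M + N pieces but M x N possible pairings.
ir = c_like_frontend("a = b + c")
print(x86ish_backend(ir))
print(riscish_backend(ir))
[/code]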
Texan pretty much covered it, but as for gate oxide thickness, last time I checked the latest Intel experimental process had this thickness down to under 5 atoms. That kind of gives you an idea how close we are to the point where Moore’s law with current technology will expire.
Pipelining: I think it needs better elaboration. What it means is starting to execute an instruction before the execution of the previous instruction has been completed. As an example, think of breaking an instruction into a first part and a second part. You would have separate hardware to perform the first part and the second part. As soon as the first-part hardware has finished its part, it can start on the next instruction even though the instruction as a whole has not been completed; only half of it has. In reality it is of course more complicated than that, but that’s the idea, and currently pipelines have grown to a length of about 30 stages, I think, not 2 stages as in our example.
First of all, the last paragraph of Padeye’s post is flat out wrong and then some. There are two common types of numbers inside computers: integers like 174 and floating point like 832.1098. The former takes far less hardware and time to perform operations on than the latter. Inside a CPU, all operations are done in binary. There are no decimal arithmetic units inside any CPU any of us are using. A program converts any decimal numbers typed in by a human to binary once, does all the math, and then converts them to decimal once if a human needs to read them. You don’t need a floating point unit at all to do either conversion quickly.
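A rough illustration of that (Python here, but the idea is the same in any language):

[code]
text = "174"            # what a human typed: decimal characters
n = int(text)           # converted ONCE to a binary integer
print(bin(n))           # 0b10101110 -- how the value actually lives in the CPU

result = n * 3 + 7      # all the math happens on the binary value
print(str(result))      # converted ONCE back to decimal text: "529"

# Floating point values like 832.1098 use a different binary format
# (sign, exponent, fraction) and need more hardware and time to work with.
x = 832.1098
print(x.hex())          # the underlying binary form of the float
[/code]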
I’m glad someone posted a correction to Moore’s law: it’s doubling of transistors only. Also, some say 18 months and others say 2 years. It’s varied generally in that range.
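Just to put rough numbers on that range (plain arithmetic, nothing more):

[code]
# How much an 18-month vs. a 24-month doubling period compounds over a decade.
years = 10
for months_per_doubling in (18, 24):
    doublings = years * 12 / months_per_doubling
    print(f"{months_per_doubling} months: about {2 ** doublings:.0f}x in {years} years")
# 18 months works out to roughly 100x per decade; 24 months to roughly 32x.
[/code]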
Note that improved CPU chip manufacturing is a win-win-win situation. Make the “features” smaller, e.g., smaller transistors and thinner connections. Then you can pack more stuff into a given unit of area. The transistors use less electricity (which keeps heat down per transistor, though not necessarily per unit area), so they can change state faster. That plus shorter connections means you can run the clock faster. And once you have your chip fab line up and running, you can start churning out chips that have a lot more bang per buck.
As mentioned, you can’t keep halving feature size much longer. When you get down to 5-10 atom width features, Old Man Thermodynamics will muck things up. Forget DNA, quantum and all that, that won’t help here. An atom is an atom. Ditto on clock speeds. You can shove electrons from A to B only so fast.
But there will be more powerful computers still. They will just have a lot of big chips or whatever. Maybe a jar of DNA molecules. I.e., there is still plenty of room for things with lots of components. Parallel computing is the way of the future; unfortunately, things have not gone as well as we’d hoped there over the decades. (And I’ve had a front row seat, if not actually giving the talk.) Most people don’t think parallel. Intel and MS have done a lousy job moving in this direction. But hopefully some other companies will take off in the right direction soon.
Fun note: As the number of transistors has increased in CPUs, the percentage of them in use (being switched) at any given time has shrunk to a very, very small amount. If a CPU chip maker just uses the already existing transistors more effectively, they will kill the competition. Pipelined multi-threading RISC is one obvious way of doing this. But Intel will just make the instruction look-ahead longer and not understand why their newest chip designs are slower than their old ones.
More explicitly, the coprocessor units that Padeye mentioned handle floating-point math, not decimal math.
However, there is a way that processors can handle decimal numbers directly, called BCD, for binary coded decimal. Instructions for handling them, i.e., doing math with them, exist in many CPU architectures including the Intel instruction set. They are almost never used except for systems that are running ancient business software, though.
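For the curious, a small sketch of what packed BCD looks like (my own helper functions, purely for illustration; real CPUs do this with dedicated instructions):

[code]
def to_packed_bcd(n: int) -> bytes:
    """Store each decimal digit in its own 4-bit nibble, two digits per byte."""
    digits = str(n)
    if len(digits) % 2:                     # pad to an even number of digits
        digits = "0" + digits
    return bytes(int(digits[i]) << 4 | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

def from_packed_bcd(b: bytes) -> int:
    return int("".join(f"{byte >> 4}{byte & 0x0F}" for byte in b))

encoded = to_packed_bcd(1234)
print(encoded.hex())              # '1234' -- each hex digit is one decimal digit
print(from_packed_bcd(encoded))   # 1234
[/code]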
I like to use the example of an automobile assembly plant. Think of each automobile as an instruction for the computer. You can completely build a single automobile before starting to build the next automobile. This is the way the auto industry started to do things in the early days. It is also how many simple computers operate. They fetch an instruction from memory, execute the instruction, fetch an instruction from memory, execute the instruction, etc.

Many modern computers operate more like an automobile assembly line. Instead of “execute the instruction” being a single step, they break it down into multiple sub-steps. With each tick of the processor’s clock, a new instruction is fetched from memory and placed in the processor’s “assembly line”. It then travels through a series of stations where one sub-step is completed at each station. Each clock tick advances the instruction to the next station. By the time it reaches the end of the “assembly line”, it has been completely executed. While it may not be able to execute a single instruction faster than a non-pipelined processor, it has the major advantage of executing more instructions per second, due to the parallelism introduced by the pipeline. This is the same principle that allows an automobile assembly line to produce hundreds of cars in a day, even if it takes 12 hours for a car to travel from the start to the end of the assembly line.
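Here is a toy sketch of that assembly line in Python (a made-up 4-stage pipeline with made-up instructions, just to show the throughput effect):

[code]
stages = ["fetch", "decode", "execute", "write-back"]
pending = [f"instr{i}" for i in range(1, 7)]   # six instructions to run
pipeline = [None] * len(stages)                # what each station holds
completed = []
cycle = 0

while pending or any(stage is not None for stage in pipeline):
    cycle += 1
    # Each clock tick, everything moves one station along and a new
    # instruction (if any remain) enters the first station.
    pipeline = [pending.pop(0) if pending else None] + pipeline[:-1]
    print(f"cycle {cycle}: {pipeline}")
    # Whatever occupies the last station finishes at the end of this cycle.
    if pipeline[-1] is not None:
        completed.append(pipeline[-1])
        pipeline[-1] = None

print(f"{len(completed)} instructions in {cycle} cycles")
# 9 cycles for 6 instructions (4 + 6 - 1), versus 6 x 4 = 24 cycles if each
# instruction had to clear all four stages before the next one could start.
[/code]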
To make a computer work faster it needs more transistors. These are little electronic switches, which make up the logic gates a computer needs to work with zeros and ones. A computer compiles the programming language into machine code, which is just 0s and 1s, the only thing it can work with. There is no “limit” on how fast PCs can theoretically go, but there are physical limits on the size of transistors. This doesn’t necessarily mean that computers will stop getting faster when this limit is reached; new ways of designing them will most likely be found. At present, Moore’s law is expected to hold for at least the next 10 years.
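To connect the “logic gates” part to actual arithmetic, here is a tiny sketch (in Python, with the gates written as bit operations) of a 1-bit full adder and how chaining a few of them adds binary numbers; the function names are mine, just for illustration:

[code]
def full_adder(a: int, b: int, carry_in: int) -> tuple:
    """Add three bits; return (sum_bit, carry_out) using XOR, AND and OR gates."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def add_4bit(x: int, y: int) -> int:
    """Chain four full adders to add two 4-bit numbers."""
    result, carry = 0, 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result | (carry << 4)

print(add_4bit(0b0110, 0b0111))  # 6 + 7 = 13
[/code]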
Actually, although a lot of the speed of chips these days is in the transistors, it is also quite common now for the speed of certain sections to be limited by the interconnect. That is, the tight part of timing on some circuits may be due to the signal travelling from one transistor to the next. The limit is not (yet) the speed of light (or even the speed of E-M propagation in that line), but is determined by the RC constant of the interconnect. This becomes more of a problem the more transistors you put on a chip.
Our maximum chip size is remaining fairly constant, which means those lines aren’t getting any shorter. Just the opposite; we’re stacking more and more layers of interconnect (think: wiring between transistors) on each chip.
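For a very rough feel for that RC limit, treating a wire as one lumped resistor and capacitor (real interconnect modeling is considerably more involved than this):

$\tau \approx RC, \qquad R = \rho \frac{L}{W t}, \qquad C \approx \varepsilon \frac{L W}{d} \;\Rightarrow\; \tau \approx \rho \varepsilon \frac{L^2}{t d}$

where $L$ is the wire length, $W$ its width, $t$ its thickness, $d$ the spacing to the neighboring conductor, $\rho$ the resistivity and $\varepsilon$ the dielectric permittivity. The width cancels out, so simply making wires narrower doesn’t make them faster, and long wires across a big chip stay slow. That is one reason interconnect shows up as a limit.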
Different semiconductor materials are naturally able to process data at higher speeds. You are probably familiar with elemental silicon (Si) as a substrate. Your Intel Pentium (tm and all that) chip is made, basically, out of silicon. Silicon has many benefits: it is relatively cheap and forms a natural oxide (SiO[sub]2[/sub]–glass, essentially), which is quite useful in making circuits.
But there are many other materials available for different purposes. For example, gallium arsenide (GaAs) is naturally faster than Si and is used in communications devices where high speed is important. Indium phosphide (InP) is another–very fast but very expensive. A square meter of Si costs about $2k, GaAs costs about $5k, and InP costs about $30k! Wafer size is also important, since it is much more efficient to process larger sizes. Si is available in 300 mm, 8" is the max for GaAs, and 2" is typical for InP.
There are many other substrates available for the bewildering array of devices out there. For example, lithium tantalate (LiTaO[sub]3[/sub]) for optical devices, etc. etc. It ain’t just silicon anymore!
And systems that use Oracle, which is a very widely used RDBMS. Oracle stores and manipulates all numbers in a variable-length BCD format:
SQL> select dump(0) from dual;
DUMP(0)
----------------
Typ=2 Len=1: 128
SQL> select dump(31) from dual;
DUMP(31)
-------------------
Typ=2 Len=2: 193,32
SQL> select dump(3141) from dual;
DUMP(3141)
----------------------
Typ=2 Len=3: 194,32,42
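-- Roughly how to read those dumps (my interpretation of the output above, not
-- a quote from Oracle's docs): the first byte is a base-100 exponent (193
-- means 100^0 for positive numbers; a lone 128 means zero) and the remaining
-- bytes are base-100 digits plus one, so 194,32,42 decodes as 31*100 + 41 = 3141.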
Like Punoqllads said
Si