What's the Biggest Integrated Circuit that Could Be Made?

You hear a lot about Moore’s Law and how we may be reaching its limits, i.e. that the day is not far off when conventional processes will be unable to shrink circuit elements any smaller and the density of CPUs and memory chips will be “maxed out”.

But why can’t we just build physically bigger chips that could hold more circuits? Why couldn’t you make a chip with say, a hundred times the area of current chips? That’d still be small enough to fit on a desktop, and BAM! Two orders of magnitude of improvement in computing power. You’d have the equivalent of a hundred PCs, hard wired on the same chip.

What are the limiting factors at play here? Heat dissipation? Power consumption? Availability of large silicon wafers free of flaws? Or is it still simply cheaper to shrink the circuits than to make the chips bigger?

I think one issue would be signal propagation times. With a gigahertz clock, a signal travels a surprisingly short distance in each clock cycle: on the order of a foot, IIRC. So you might run into problems building a foot-wide IC…
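
For the curious, a quick back-of-the-envelope check of that figure as a little Python sketch (the clock speeds are just examples):

```python
# Rough distance an electrical signal could cover in one clock cycle,
# assuming it travelled at the speed of light in vacuum (an optimistic upper bound).
C = 299_792_458  # speed of light, m/s

for clock_hz in (1e9, 3e9):  # 1 GHz and 3 GHz, just as example clocks
    cm_per_cycle = 100 * C / clock_hz
    print(f"{clock_hz / 1e9:.0f} GHz: ~{cm_per_cycle:.0f} cm per cycle")

# 1 GHz -> ~30 cm (about a foot); 3 GHz -> ~10 cm.
# Real on-chip signals move much slower than c, so the budget is tighter still.
```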

Yeah, at extremely high frequencies, even a signal travelling at a significant fraction of the speed of light is too slow once your circuits become large. If we didn’t have that limitation, there would be no need to make CPUs so small, and the heat dissipation problems would be much easier to solve as well.

But we are limited by the fundamental maximum speed at which a signal can move, so in order to continue improving speed, designs have to focus on reducing the distance any given signal has to propagate in one cycle. This means reducing size or increasing parallelism. Reducing size leads to greater heat problems, and the diminishing returns of that approach are one reason parallelism is becoming so important.

IIRC*, Clive Sinclair was prepared to offer us solid state hard drives back in the ’80s, based on wafer scale integration of nonvolatile memory.
Nothing seems to have come of it.


*and I may well not, it could easily have been one of the other movers and shakers of the era.

The modular parallelism referenced by Secret Spud has already effectively solved this problem. Making a “big” non-modular IC yields no benefits and introduces lots of problems.

That’s not really the case.
The drive to make ICs smaller really comes down to two things: expense and speed. Propagation delay is an issue, but it isn’t the limiting factor in CPU speed - transistor switching speed is. But even if there were no need to make an IC any faster, there would always be a driver to making it smaller, and that is yield per wafer. If you can shrink your IC to 70% of its linear dimensions, you get nearly twice as many potential candidates per wafer. And you actually yield more good dice, because there are fewer possible killer defects in a smaller die. So you see CPUs shrinking with very little increase in clock speed, because they are cheaper to make smaller.
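
As a rough illustration of that shrink-to-70% arithmetic (the wafer and die sizes here are made-up round numbers, not any real part):

```python
import math

# Made-up round numbers purely for illustration - not any real process or die.
wafer_diameter_mm = 300.0
die_side_mm = 14.0        # original die edge
shrink = 0.7              # linear shrink factor

def candidates_per_wafer(wafer_d, die_side):
    """Crude estimate: wafer area / die area (ignores edge loss and scribe lines)."""
    return math.pi * (wafer_d / 2) ** 2 / die_side ** 2

before = candidates_per_wafer(wafer_diameter_mm, die_side_mm)
after = candidates_per_wafer(wafer_diameter_mm, die_side_mm * shrink)
print(f"before: {before:.0f} dice, after 0.7x shrink: {after:.0f} dice, ratio {after / before:.2f}")
# Area goes as the square of the linear shrink, so 0.7**2 ~= 0.49 -> roughly twice the candidates.
```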

To answer the OP, there was a company - Wafer Scale Integration - that tried to make complete systems using an entire wafer (3" as I recall). The company went belly up, because they were never able to do it economically, even though they had some early successes.

Ah, that’s a good point. I wasn’t thinking about manufacturing costs and yield sizes. Makes much more sense that way actually. :stuck_out_tongue:

There are a lot of inter-playing issues here.

Propagation of signals is a very difficult one. It is true that the velocity of light is about one foot per nanosecond in a vacuum. However, this isn’t really the issue with chip design. When sending a signal you are concerned with the time it takes the signal to rise (or fall) to the point where it switches the receiver’s logic. At chip scales a conductor behaves not as a simple wire but as a delay line. Resistance is significant in the conductor, and depends upon the material the conductor is made from (metal is good, polysilicon is dreadful), and the conductor is capacitively coupled to its surroundings. The effective propagation speed of signals, especially in very small feature size chips, is astonishingly slow.
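
A very rough sketch of the “wire as a delay line” point, using the standard distributed-RC (Elmore) approximation; the per-micrometre resistance and capacitance below are placeholder guesses, not real process figures:

```python
# A wire treated as a distributed RC line: the usual Elmore estimate is
#   t ~ 0.5 * (r * L) * (c * L), i.e. delay grows with the SQUARE of the length.
# The r and c values here are illustrative placeholders, not real process data.

r_per_um = 1.0        # ohms per micrometre of wire (assumed)
c_per_um = 0.2e-15    # farads per micrometre of wire (assumed)

for length_um in (100, 1_000, 10_000):   # 0.1 mm, 1 mm and 10 mm wires
    R = r_per_um * length_um
    C = c_per_um * length_um
    delay_ps = 0.5 * R * C * 1e12
    print(f"{length_um:>6} um wire: ~{delay_ps:8.1f} ps")

# Because delay goes as L**2, a wire ten times longer is ~100x slower -
# it is the RC of the wire, not the speed of light, that dominates on-chip delay.
```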

However, the same problems exist when coupling chips together with wires. Once you drop off the die, the signalling speeds are a couple of orders of magnitude slower. The latency of a transaction that goes off chip is a significant limitation on speed, and is one of the reasons the current generation of processors is moving the memory controller onto the die. Moving caches onto the die was one of the key enablers of current processor performance.

As the feature size shrinks, the clock speeds increase, which is one of the reasons we saw ever-increasing clock speeds. However, as the feature size falls, the signal propagation speed falls even faster. There comes a time when the potential increase in clock speed can’t be met because of the falling signal speed. The signal speed (very roughly) is inversely proportional to the resistance, and the resistance increases as the square of the shrink factor. So if you halve the feature size, your signal propagation time may quadruple, and the distance across your design that a signal can propagate actually falls. Clearly there is a crossover point: a point where chips actually can’t clock faster, even if they shrink. And guess what? We crossed that point a couple of years ago. (This is all very rough; there is devil in the details.)
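
A toy version of that scaling argument, with the simplifications made explicit (wire cross-section shrinks with the feature size, capacitance per unit length held constant):

```python
# Very rough version of the scaling argument above: assume the wire's
# cross-section (width x thickness) shrinks with the feature size, while the
# capacitance per unit length stays roughly constant (a big simplification).

def relative_delay_per_length(shrink):
    """RC delay per unit of wire length, relative to the unshrunk wire."""
    resistance_factor = 1.0 / shrink ** 2   # R per length ~ 1 / cross-sectional area
    capacitance_factor = 1.0                # assumed unchanged per unit length
    return resistance_factor * capacitance_factor

for s in (1.0, 0.7, 0.5):
    print(f"shrink to {s:.1f}x: wire delay per unit length ~{relative_delay_per_length(s):.1f}x")

# Halving the feature size (0.5x) makes each unit of wire roughly 4x slower,
# so the distance a signal can cover in one clock tick actually shrinks.
```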

So why keep shrinking? Well, you get more on a chip. And you do still keep making speed gains. The individual processors are not much faster, and we are at the point where it is hard to throw more logic at a processor design to make it faster. One, we are running out of ideas, and two, the propagation time to cross all that extra logic slows the chip down, so you don’t win. But coupling the processors to lots of fast cache on the chip works wonders, and ever more nearby cache is one of the reasons we still see increasing performance out of modern processors. Plus keeping all the other logic nearby: bus controllers, memory controllers, and more processors. A parallel algorithm is often limited by the speed at which the parallel components can communicate (even if it is to do nothing more than synchronise). Going off chip to do this is very expensive. If your peer processors are on the same chip you are an order of magnitude better off. If you share cache, even more so.

The big limit to the physical size of a chip is the yield rate. This has been a constant for decades, going back even to the days of discrete transistors. As you increase the size of the chip, the probability of including a defect grows very quickly. There comes a point where you simply never get a working chip. It isn’t just defects in the wafer, but the chance of a tiny imperfection in any part of the fabrication process. The actual sizes of chips sold have changed hardly at all over the last couple of decades. Essentially you get small, very cheap processors, used for embedded control and the like, a few millimetres across; consumer-level processors that go into most people’s PCs and laptops, about 10 mm across; and huge, very expensive processors at the top end of the ecosystem, up to 2 cm across. They cost a few dollars, a hundred or so dollars, and a thousand dollars or more, respectively.
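
The “probability of including a defect grows very quickly with area” point is often captured with a simple Poisson yield model; the defect density here is an assumed round number, not real fab data:

```python
import math

defect_density_per_cm2 = 0.2   # assumed defect density - a round number, not fab data

def poisson_yield(die_area_cm2, d0=defect_density_per_cm2):
    """Expected fraction of defect-free dice: Y = exp(-A * D0)."""
    return math.exp(-die_area_cm2 * d0)

for side_mm in (3, 10, 20, 100):            # tiny die up to a 10 cm "mega-die"
    area_cm2 = (side_mm / 10.0) ** 2
    print(f"{side_mm:>3} mm square die: yield ~{poisson_yield(area_cm2) * 100:5.1f}%")

# Roughly: 3 mm -> ~98%, 10 mm -> ~82%, 20 mm -> ~45%, 100 mm -> essentially zero,
# which is why nobody ships a wafer-sized chip without massive redundancy.
```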

On the other hand, how big a chip could we make? Or do we make? Current mainstream fabrication is on 300 mm diameter wafers. Could we make a chip, one per wafer? One thing we do make is CCD sensors nearly that big. They are bespoke items, and it is hard to put a price on them. But for astronomers, and probably for some security applications, they are great. As a production item you could think of Kodak’s KAF-50100 CCD, which is 49.1 x 36.8 mm and delivers 50 megapixels.

Two decades ago the Sematech consortium started out, and one of its initial goals was wafer scale integration. So the idea isn’t new. But for the above and many more reasons, the idea didn’t work out. Sematech still exists.

What about going into three dimensions? That lets you get more transistors without increasing the distances involved. I read a while back that they were working on packages that had two separate chips stacked close together with a bunch of connections between them. I imagine an actual 3-dimensional chip wouldn’t work because it would generate too much heat.

I’d say that there are three main issues:

  1. There are always some flaws on a wafer, so the chances of having 100 or 1,000 perfect “modules” on a wafer are smaller than you’d want. You can solve this to some degree by having spare modules and “fuses” that you can blow with a laser to re-configure the wafer.

  2. Memory needs a different process than CPUs and other logic so you can’t economically put them on the same wafer.

  3. If you need to have memory and other parts of the computer off the wafer then you already have to deal with multiple chips and won’t gain a great deal by having the CPUs all on one wafer.

On the other hand, I wouldn’t be surprised to hear that the NSA or other agencies have some specialized HW made from wafer level integration that is an array of special purpose processors.

Multi-chip modules have been pretty common for a while. Some are vertically stacked, some aren’t.

Stacking modules has some advantages, since you can reduce the interconnection distances and thus improve inter-chip signalling speed, at the expense of significantly messier packaging technology. One trick is the equivalent of a ball grid array, where the chips are stacked face to face and the interconnects are made by small bumps that touch. But that only works two high, and in order to gain access to lands for off-chip connections one chip must be smaller than the other. TI have a patent on some ideas here. Intel have shipped two-chip modules for a long time. The Pentium Pro was two chips - processor plus cache - and often their top-end multi-processor offering is actually two chips in a package, with some inter-chip connections.

The different processes needed for DRAM versus ordinary logic are a real issue, and why DRAM won’t be finding its way onto mainstream processors any time soon. IBM have a technology for putting DRAM on the same die, and have used it to good effect in things like BlueGene. But the density is nothing like that achievable in a single-purpose DRAM chip. However, getting it very close to the processor has proven to be a big win in specialised applications. In many ways it becomes another layer of cache.

We are very close to seeing mainstream PCs being made from a single big chip plus memory. Expect to see it in a netbook in about a year. (Well, nearly: the RF chips - for WiFi, Bluetooth, 3G - won’t be integrated, as they typically need specialised processes. Nor will the mundane bits like power control, battery management, and buffering to the outside world. But I/O control, graphics, Ethernet, USB - that will.) The big win was putting the memory controller on the processor die, something AMD have been doing for a while, and Intel’s latest offerings do too. The memory controller can become much faster and much smarter, with deeper buffers and a much better ability to manage multiple overlapping writes and read-modify-write cycles. Big wins here.

One of the other issues that Sematech found when they were trying out wafer scale designs was differential heating problems. It was very difficult to manage a wafer with alternating cool bits and seriously hot bits. Wafers crack.

They’ve got 32 MB of on-chip L3 DRAM in the POWER7. That’s a pretty nice size for an on-chip cache.

Yes and no. You cram more electricity and waste heat into a smaller area, but as your process size gets smaller you can reduce the voltage needed to make the jumps between points, and that voltage is the biggest factor in waste heat. In my experience building desktop systems, my original Athlon Thunderbird was like a small fusion reactor under my 3-pound copper heatsink, and subsequent processors have been easier to cool.
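
To put a number on the voltage point, here’s the usual dynamic-power relation P ≈ C·V²·f with made-up illustrative values (not real Athlon specs):

```python
# Dynamic switching power scales roughly as P = C * V**2 * f.
# The capacitance, voltages and clock below are made-up illustrative values,
# not specs for any real Athlon.

def dynamic_power(c_farads, volts, freq_hz):
    """Classic CMOS dynamic power estimate."""
    return c_farads * volts ** 2 * freq_hz

old = dynamic_power(c_farads=20e-9, volts=1.75, freq_hz=1.2e9)  # "Thunderbird-era" guess
new = dynamic_power(c_farads=20e-9, volts=1.10, freq_hz=1.2e9)  # same chip at a lower core voltage
print(f"{old:.0f} W -> {new:.0f} W ({new / old:.0%} of the original)")

# Dropping the core voltage from 1.75 V to 1.1 V cuts switching power to ~40%
# with everything else held constant, because of the V**2 term.
```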

This question has been pretty much answered, but the reason is economic more than anything. A single flaw in any part of the manufacturing process can render the result unusable. Occasionally chips are designed to withstand certain flaws by having certain features disabled - for example, many times the high-end and mid-range versions of a video card are the same hardware, but the mid-range one had flaws in one or two of the units and they’re disabled, which makes it perform at less than ideal speeds but leaves it still functional. AMD also releases tri-core CPUs which are quads in which there was a flaw in one of the cores and it’s disabled. The smaller you can get the process, the less likely any particular die is to contain a defect, so your yield goes up and total costs go down.

As the process feature size drops, the amount of energy lost to leakage goes up significantly. This is now becoming a very significant problem. One of the most important recent aspects of power reduction has been actively shutting down parts of the processor that are not in use. Intel have lagged behind a bit in this, but are catching up. Some chips power down internal units to a very fine granularity, and can power them up again in not much more than a clock tick. The difference in power draw between a modern processor running flat out doing real computation and one just hanging about is quite significant. We had a dramatic way of demonstrating this behind one of our cluster racks. (It held 544 cores.) When idle, the air flow out was warmish; when the chemists got to it, it was like standing behind a stack of fan heaters. A couple of generations ago processors had, at best, rudimentary power management, and would always run pretty hot.

Another thing that is common with processor chips is to design the level 2 (and level 3, if there is one) cache into banks. A flaw in a bank simply results in a chip with less available cache; when you buy a cheaper processor with less cache, you might guess what it is. The Cell processor used in the PS3 is likewise marketed as having 7 cores, with the eighth possibly defective and disabled.
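
A quick sketch of why that kind of redundancy helps yield so much, using a simple binomial model with a made-up per-core defect rate (not actual Cell figures):

```python
from math import comb  # Python 3.8+

p_core_good = 0.9   # assumed probability that any single core is defect-free (not a real figure)

def yield_with_spares(cores, needed, p=p_core_good):
    """Probability that at least `needed` of `cores` cores are good (simple binomial model)."""
    return sum(comb(cores, k) * p ** k * (1 - p) ** (cores - k)
               for k in range(needed, cores + 1))

print(f"all 8 of 8 cores good:  {yield_with_spares(8, 8):.1%}")   # ~43%
print(f"at least 7 of 8 good:   {yield_with_spares(8, 7):.1%}")   # ~81%

# Tolerating one dead core (as the PS3's Cell does) nearly doubles the number of
# sellable chips under these assumed numbers; banked caches work the same way.
```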

It is likely that this will change the economics of chip area reasonably soon. Then the other issues, like managing the thermals, will become limiting.

Why is this?

Basically because everything gets thinner. In particular, the insulating layers get thinner. But everything is getting to the point where the physics gets weird too. More electrons can tunnel across boundaries that, at larger feature sizes, only a few could manage. Layers are down to the point where you can count the thickness in atoms. At those scales everything leaks. So you find yourself with quite a significant amount of the input current from the power supply simply leaking straight through the chip.

That said, there are ways to attack this problem. Examples include SOI - silicon on insulator - and high-k dielectrics. They stave the problem off for a couple of generations, but we have moved from a time when leakage was really not a problem to one where it sits centre stage, and it gets worse.

For CCDs, if there are a couple of flaws, does that ruin the chip, or do you just get missing pixels (or lines of pixels) that you interpolate over?