What are the main components of computer chip costs?

So, multi-core CPUs are useful when several non-parallelizable tasks must be performed at the same time?

Perhaps, although that seems unlikely. My US$350 GPU has 1664 cores, which is about 400 times as many as you would find in a CPU of the same caliber. Also, GPU core counts increase at a higher rate than CPU ones. By the time CPUs have hundreds of cores, GPUs will have tens or hundreds of thousands of cores.

GPUs and CPUs also tend to require different memory setups for best performance. CPU memory benefits most from low latency; GPU memory benefits most from high throughput.

“Dopers” discussing semiconductor manufacture is the best generic-forum-member-description / thread topic combo ever.

We like to overclock our transistors as much as our neurons.

Well, I guess they may be asymmetrical cores, with some optimised for computation latency and some optimised for throughput, but Intel is definitely aiming to try to make the dedicated GPU unnecessary. Here is an article detailing the GPUs built into Intel’s Skylake architecture.

A GPU typically has a couple of limitations/areas of focus for efficiency:
1 - Memory access
They are not designed for parallel random read/write; they are designed to efficiently deal with (in general) large blocks of sequential read/write. There is some capability to break out of this, but (in general) the more you do that, the less efficient the processing.

2 - Independent parallel logic
The processing units in a GPU are not completely standalone cores like the ones a CPU has. They are designed for highly parallel work that is (in general) sharing the same logic path but operating on different sets of data. As individual processing units branch and attempt to perform their own independent logic, you decrease the efficiency they were designed for.
One scenario that would perform better on a multi-core cpu than a gpu would be one that had the following attributes:
1 - Lots of random read/write - meaning not block/sequential processing
2 - Independent logic for each parallel thread/process (edit: independent logic can still mean running the exact same program/code, it’s just that the data conditions cause it to perform the different steps at different times, so the processing isn’t in sync). The CUDA sketch just below illustrates both effects from the GPU side.
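
To make those two points concrete, here is a minimal CUDA sketch (mine, not from anyone in this thread; the kernel names are made up for illustration). The first kernel lets adjacent threads read adjacent elements, so a warp’s accesses map to a few wide memory transactions. The second scatters the reads, and the third makes threads in the same warp take different branches, so the warp executes both paths one after the other:

```cuda
// Minimal sketch: coalesced vs. strided memory access, and warp divergence.
// Compile with e.g. `nvcc demo.cu -o demo`. Kernel names are illustrative.
#include <cstdio>
#include <cuda_runtime.h>

// Coalesced: thread i touches element i, so a warp's 32 reads map to a few
// wide memory transactions.
__global__ void coalesced_scale(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;
}

// Strided/scattered: adjacent threads touch elements far apart, so most of
// each memory transaction is wasted.
__global__ void strided_scale(const float* in, float* out, int n, int stride) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[(i * stride) % n] * 2.0f;
}

// Divergent: even and odd lanes of the same warp take different branches,
// so the warp runs both paths serially with half its lanes masked each time.
__global__ void divergent_scale(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if ((i & 1) == 0)
        out[i] = in[i] * in[i];
    else
        out[i] = sqrtf(fabsf(in[i]));
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = (float)i;

    dim3 block(256), grid((n + block.x - 1) / block.x);
    coalesced_scale<<<grid, block>>>(in, out, n);     // fast path
    strided_scale<<<grid, block>>>(in, out, n, 32);   // slower: scattered reads
    divergent_scale<<<grid, block>>>(in, out, n);     // slower: divergent warps
    cudaDeviceSynchronize();

    printf("out[10] = %f\n", out[10]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

All three kernels do comparable amounts of arithmetic, but on real hardware the second and third run noticeably slower, and that random-access, independently-branching shape of workload is exactly what a multi-core CPU tolerates much better.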

Yup, yup and yup. The tradeoff is between keeping things cheap by staying at an existing technology level, or retooling to allow a smaller feature size. If the number of defects per wafer remains constant, then smaller transistors mean more chips on that wafer, so more good die per wafer … and this is a good time to add features. Of course, designing new chips, buying and implementing new equipment and making the whole mess work together is pretty expensive in its own right, and going to a new technology may not be worth it.
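
In rough numbers, the tradeoff looks like this (a sketch only, using a simple Poisson yield model; the symbols are mine, not figures from this thread):

```latex
% Rough Poisson-yield sketch (symbols are illustrative, not thread data):
% gross dies per wafer of diameter d, yield for die area A and defect density D_0,
% and the resulting cost per good die for a wafer costing C_wafer.
N_{\mathrm{dies}} \;\approx\; \frac{\pi (d/2)^2}{A} \;-\; \frac{\pi d}{\sqrt{2A}},
\qquad
Y \;\approx\; e^{-A D_0},
\qquad
\text{cost per good die} \;\approx\; \frac{C_{\mathrm{wafer}}}{N_{\mathrm{dies}}\, Y}
```

Shrinking the die area A raises both the die count and (for the same defect density) the yield, which is the “more good die per wafer” effect; the catch is that the wafer cost plus the one-time design and retooling cost goes up on a new process.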

There was a possible theoretical problem with “binning” in “the old days” – I wonder if it ever manifested itself as a real problem.

With only one bin, parts rated at, for example, 50 nanoseconds will fall on some normal curve, where most of the chips can perform at 45 nanoseconds and only a small portion would actually fail if pushed to 49 nanoseconds. If there’s some tiny change, faulty air conditioning, or a minor change in some co-dependent timing, only a small portion of the 50-nanosecond parts will fail.

But if all the fast parts have been sent to a different bin, the chips in the 50-nanosecond bin are all barely passing and many or most of them will fail in a slightly degraded environment. Instead of a few correctible errors, systems may collapse.
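
A toy worked example of how sharp that cliff can get (the 50/49/45 ns numbers are from above; the 1.5 ns spread and the 48 ns bin boundary are my assumptions): suppose part delay T is roughly normal with mean 45 ns and standard deviation 1.5 ns, and a degraded environment effectively moves the failure threshold in from 50 ns to 49 ns.

```latex
% Whole population (mean 45 ns, sigma 1.5 ns -- the sigma is an assumption):
P(T > 49\,\mathrm{ns}) \;=\; 1 - \Phi\!\left(\tfrac{49-45}{1.5}\right) \;\approx\; 0.4\%

% The 50 ns bin after the faster parts (say T \le 48 ns) have been binned away:
P\bigl(T > 49\,\mathrm{ns} \;\big|\; 48 < T \le 50\bigr)
  \;=\; \frac{\Phi\!\left(\tfrac{50-45}{1.5}\right) - \Phi\!\left(\tfrac{49-45}{1.5}\right)}
             {\Phi\!\left(\tfrac{50-45}{1.5}\right) - \Phi\!\left(\tfrac{48-45}{1.5}\right)}
  \;\approx\; 15\%
```

Same silicon, same disturbance, but the binned population has had its margin skimmed off, so the failure rate jumps from a fraction of a percent to double digits.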

Is this ever a real concern? Or was it just paranoia leaking out when sleep-deprived engineers told each other their nightmares? :eek:

Thanks for the informative answer.
So, bringing it back to Intel trying to take over consumer-level graphics processing: I don’t mean to come off as dismissive of the idea. I recently asked someone here who works for Nvidia (who can identify himself if he wishes) what he thought of Intel trying to compete in gaming graphics. I do think it would be interesting to give Nvidia some competition because AMD hasn’t been making wise decisions.

The optimizations you mention seem like they would favor having a GPGPU in addition to a CPU rather than hundreds of CPU cores as Coremelt suggested. Do you think the idea of having 100s of CPU cores will overtake GPUs?

Speculatively:
Perhaps Intel could offer a product with a small number of high-frequency generalist cores at one end of the spectrum, thousands of specialized low-frequency cores for the simpler steps of graphics processing at the other end, and, in between those two poles, cores optimized in architecture/number/frequency for various tasks.

Or Intel could just have two types of cores shipping on its chips, I guess.

If there’s a risk of that occurring, you can simply certify them for less. If you’re afraid of the chips in the 50ns bin barely passing, you can sell them as 55ns chips.

I think my GPU serves as an example: It’s advertised as having a boost clock of 1304MHz but will go to 1392MHz on its own without any overclocking or modifications whatsoever on my part.

A video on it, most relevant part starts at 3:00: https://www.youtube.com/watch?v=HwHl3nSzz-s

That’s definitely a real problem - and worse than you state. Speeds vary with conditions, voltage, temperature, etc., and testers are not perfect. So you guardband your parts. If you are selling something that runs at 2.5 GHz, you make damn sure it runs on your tester at 2.75 GHz. (For example - not real numbers.)
Even worse, the performance of a chip in a system is not going to be the same as its performance on your IC tester. System-tester correlation is the process of figuring out what speed you need to test at so the chip runs at speed within the system.

Then you have other factors, like burn-in slowing down the part, so you need to test to a higher frequency still.
There are so many things to bite you - that’s why processor design teams are numbered in the hundreds.
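
As a back-of-the-envelope way of putting those margins together (the particular margin names are mine, and the 10% figure just matches the 2.5/2.75 GHz example above): the allowances for tester conditions, system-tester correlation and burn-in degradation stack multiplicatively, so the frequency you actually screen at is something like

```latex
% Margins stack multiplicatively (margin names are illustrative; 10% matches the
% 2.5 -> 2.75 GHz example above for the tester margin alone).
f_{\mathrm{test}} \;\ge\; f_{\mathrm{spec}}
  \,(1 + g_{\mathrm{tester}})\,(1 + g_{\mathrm{corr}})\,(1 + g_{\mathrm{burnin}}),
\qquad
\text{e.g. } 2.5\ \mathrm{GHz} \times 1.10 = 2.75\ \mathrm{GHz}
```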

One big thing that RaftPeople missed: fixed-function logic on the GPU.

CPUs don’t have specialized logic for rasterization, blending, colorspace conversion, texture filtering, anti-aliasing, color/depth compression, and a host of other things. In some cases, these fixed-function units are 100x more efficient than doing the same operations on a CPU-like core.

Intel more-or-less did what you suggest with their Larrabee project*, which was a failure despite having people like Michael Abrash around to optimize the microcode. Dedicated silicon will always be faster.

That’s not to say that a few of these fixed units won’t eventually be done in software, but for now, GPUs still need all the efficiency they can get. Your thousands of cores will thus need fixed-function graphics logic to be remotely competitive, and at that point you pretty much have an ordinary GPU on your hands.

  * Larrabee did have a few fixed-function units, mostly texture samplers, and was an independent chip instead of being integrated with the CPU.

Yes, but I think the essential point I was trying to make is still being missed.

Suppose that whatever guardband is provided is used up by some circumstance(s). All of the parts are still working, but unforeseen difficulties have eliminated much of the margin of safety.

Normally one would imagine the parts to fit a bell-shaped curve. If conditions further worsen slightly, a very few parts would start failing. The few customers affected can be taken care of with replacement parts. But due to the “binning” process, the parts’ behavior does not form a bell-shaped curve. Instead a large portion of the parts are on the verge of misbehavior. If conditions further worsen slightly, instead of a few outliers failing, you will see a high failure rate.

What that demonstrates is that you used too small a guard band. Your *a priori* decisions about how degraded the real-world environment would be turned out to be unrealistically optimistic. So you’re actually operating waay off on one side of your expected bell curve.

The fix is to replace the defectives with higher-binned parts, adjust your decisions about real world degradation towards “worse” and make your guardbands larger for subsequent production.

And yes, this does contain the seeds of a PR nightmare. As always in any field, the true test is how your product performs in real use by real users in the real world. With all their warts and foolishness.

This is especially challenging for something like IC manufacturers, where they’re separated from the final end user by 2 or 3 (or 6!) layers of subassembly engineering done by different companies, most of which are less competent and more rushed than the IC manufacturer itself.

Component supply in any industry is not for the faint of heart.

Parts were shipped once they met some company spec’ed defect level. Back in the “old days” those defect levels were crazy high, compared to what one would expect today. But every customer buying parts realizes that 0 DPM* is only a theoretical goal. And as LSIGuy noted, you guard band as much as necessary. Often the guard bands are set looser in the beginning, and tighten up over time once the process is seen to be stable.

* Defects Per Million

So, does that mean that pretty much every chip (from a reputable manufacturer) has room for overclocking and overvolting?

Any ballpark idea how many years CPUs and GPUs are designed to last if used at their stock clocks and voltages? For CPUs, I mean the industry as a whole rather than your employer of course.

“Logic” meaning the physical hardware at the micro level?
Is 100X a representative ratio for how much more effective a GPU is at graphics than a CPU of similar quality and price? I’m only asking for ballparks & trends; I know that lots of factors go into it and that it varies a lot.

Interesting, so why is Intel not creating its own more traditional GPU to compete with AMD Radeon and Nvidia? They certainly have the resources to do that if they want to. Is there some reason they couldn’t scale up the Skylake GPU architecture with more cores and VRAM and put it on a separate card?

Yep, I’m talking actual silicon here. I’m actually using a metric of “perf per watt”, which is really the limiting factor these days. If you look at a particular GPU hardware unit and determine that it can perform X operations per second on 1 watt of power, then you’d need (very roughly, as you note) a 100 watt CPU to do the same thing.

Not all components of the GPU have such drastic factors; the raw shader math is going to have a smaller factor. It’s the highly specialized operations like rasterization that hit these big factors.

Beats me. Their Skylake architecture isn’t super impressive, but it’s not terrible, and I’m sure if they really focused on it they could ship a viable external GPU. I guess that sometimes it’s just tough to break into a market that you aren’t very familiar with. Intel still can’t compete in the tablet and phone space, and I don’t think it’s due to x86 vs. ARM. They just don’t understand something about the market. Probably the same kind of thing with GPUs.

One thing I’ve noticed about Intel is that pretty much everything they do is designed around shipping more desktop CPUs. Integrated GPUs serve this purpose; external GPUs don’t.

One thing that holds Intel back compared to ARM in tablets and phones is that ARM is willing to license the IP so customers can modify it as they want; Intel isn’t willing to do that.

Yep, I was also going to say that. Apple’s A9 SoC is a very different beast to the stock Cortex ARM designs. Apple has their own proprietary image processing algorithms for noise reduction for the camera implemented in hardware in a single chip, and also hardware motion processing support for the gyro. Even if Intel made a cheaper, faster, lower-power ARM-instruction-set CPU, big companies like Apple and Samsung wouldn’t use it if they couldn’t license it and add their own differentiating features.

IMHO (and it is very humble), Intel lost the plot when they took the decision to sell their XScale ARM business to Marvell. The mantra before this was that Intel was primarily a fab/process company - and anything that enabled it to sell silicon was what they did. The x86 was a way of selling silicon. But they decided that the x86 was a highly valuable thing in its own right (no argument there) and that they should leverage that IP and capability much harder. Whilst there is some logic in this idea, I don’t think it has ever really worked out for them in the manner they hoped. The idea seemed to be that there was a virtuous cycle of reinforcing value between the desktop, low-power, highly parallel and server-level architectures if they all ran the x86 ISA. In the server space they have seen off Itanium, and pretty much wiped out Sparc and Power, except in the very high end. So that worked. But they have never made Atom or Larrabee/Phi work. ARM is now snapping at Intel’s heels in server markets. Not there yet, but there is a lot of interest. Atom is nowhere in the portable market, a market that ARM basically owns. Kids are now learning to program with tiny ARM machines, not x86.

No doubt, the manner in which x86/IA64 has wiped out all opposition in the desktop and workstation market has been impressive. In terms of GPUs, Intel will wipe out add-on graphics cards for all bar gaming and virtual reality applications. An HD or Iris level GPU config in a Skylake might look pathetic in terms of raw graphics grunt compared to the high-end monsters from AMD and Nvidia, but for almost any ordinary use they are quite capable, and provide graphics capability you used to pay SGI a large fraction of your firstborn to get.

But Intel are finding themselves in a place not all that far from the one Microsoft finds themselves in. And it is partly their own doing. The desktop market is stagnating. You can get a simple tiny box that runs pretty much everything most people want very nicely for not a lot of money, and there is very little incentive to upgrade much past this.

Cloud services and the increasing capability of portable devices are the elephants in the room. These will eventually wipe out the desktop as a stand-alone entity.