Hot CPUs and Thermal Noise

I was checking my Nvidia System Monitor and it showed my CPU was running at 140 F and the GPU was running at 185 F. This seems to be a normal operating temperature for processors these days. What I want to know is, does this high temperature compromise computational accuracy by creating thermal noise in the circuits?

Many modern processors run at significantly cooler temperatures than that. Processors that run that hot don’t tend to last as long. Manufacturers are counting on the fact that most folks upgrade before long-term reliability becomes an issue. But that’s not the issue you asked about.

As far as thermal noise goes, no, it does not affect a modern processor’s accuracy. Processors use digital circuits, which are by their nature very immune to noise. The signals in a processor are either “off” or “on”, which translates to either a low level voltage or a high level voltage. In order to get an error in the processor, you have to have so much noise that the signal is read as being in the opposite state.

Many modern processors also monitor their own temperature. Instead of overheating to the point where the circuits do start to misbehave, once a heat problem is detected the processor slows down so that you don’t get errors. Instead of overheating and bursting into flames, modern processors just slow to a crawl.

Wow, that’s way too hot for anything recent. What are your CPU’s make and model?

140 F = 60 C, and that’s on the edge of acceptable. 185 F = 85 C, which is way over the top; you are risking permanent damage to your video card. You probably have a fan that’s not spinning, a heatsink that isn’t seated properly, or a blocked air vent in the case (blocked either by dust or by physical objects such as books or furniture).
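For anyone who wants to check the conversions, here is a minimal Python sketch of the Fahrenheit-to-Celsius arithmetic. The function name is made up for illustration.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# The two readings from the original question.
for temp_f in (140.0, 185.0):
    print(f"{temp_f:.0f} F = {fahrenheit_to_celsius(temp_f):.0f} C")
```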

Bear in mind that thermal noise power is proportional to the absolute temperature (degrees above absolute zero) and that room temperature is about 535 degrees Fahrenheit above absolute zero. A temperature rise of 50 degrees Fahrenheit above a nominal operating temperature might be enough to seriously compromise the longevity of the chip, but it would only increase the noise power by about 10% and the noise voltage by about 5%, since noise voltage goes as the square root of noise power. As pointed out above, the chips are digital and the separation between levels is designed to overwhelm the thermal noise, so a 5% increase is not the thing you need to worry about.
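The arithmetic behind those percentages can be sketched in a few lines of Python. This is only a rough check: the 75 F starting point is an assumed value for "room temperature", and the function names are invented for illustration.

```python
import math


def fahrenheit_to_rankine(temp_f):
    """Degrees Fahrenheit above absolute zero (the Rankine scale)."""
    return temp_f + 459.67


def noise_ratios(base_f, rise_f):
    """Return (power ratio, voltage ratio) for thermal noise after a temperature rise.

    Thermal noise power scales linearly with absolute temperature,
    and noise voltage scales with the square root of the power.
    """
    t0 = fahrenheit_to_rankine(base_f)
    t1 = fahrenheit_to_rankine(base_f + rise_f)
    power_ratio = t1 / t0
    voltage_ratio = math.sqrt(power_ratio)
    return power_ratio, voltage_ratio


# Assumed starting point: 75 F, roughly 535 degrees above absolute zero.
power_ratio, voltage_ratio = noise_ratios(75.0, 50.0)
print(f"noise power up by   {100 * (power_ratio - 1):.1f}%")
print(f"noise voltage up by {100 * (voltage_ratio - 1):.1f}%")
```

The result comes out a bit under 10% for power and a bit under 5% for voltage, which matches the rounded figures quoted above.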

Ha! Yeah, I was thinking in C and not F. 60 C is fine under load.

Sounds too high to me unless it was measured after running max load for several minutes.

My temps:

CPU (Intel Q8400 quad core): Between 32C-39C (90F-102F)
GPU (NVidia GT210): 47C (117F)

Under heavy load I see CPU temps in the mid 50s, GPU in the high 60s (Celsius).

I’m running BOINC and Folding@Home using both the CPU and GPU at full capacity in the background. The computer goes into sleep mode after 1 hour of inactivity, so they get a chance to cool down then.

CPU is an Intel quad-core Q8200 — runs at 60 C at 100% load.
GPU is a factory-overclocked Nvidia GTX 285 — runs at 83 C at 76% load.

In that case you’re fine. Those temps are normal for high load.

FYI - For semiconductor chips, the reliability numbers start to turn south around 45 deg C. Long term reliability analysis isn’t a simple thing, but the old “quick and dirty” engineer’s rule of thumb is every 10 deg C cuts the expected life of the device in half.
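To put rough numbers on that rule of thumb, here is a small Python sketch. It is not a real reliability model. Using 45 C as the reference point is an assumption based on the figure above, and the function name is hypothetical.

```python
def relative_life(temp_c, reference_c=45.0):
    """Expected life relative to running at reference_c.

    Applies the quick-and-dirty rule that every 10 deg C increase
    halves the expected life. reference_c is an assumed anchor point.
    """
    return 0.5 ** ((temp_c - reference_c) / 10.0)


# Temperatures mentioned in this thread, in deg C.
for temp_c in (45.0, 60.0, 83.0, 85.0):
    print(f"{temp_c:.0f} C: {relative_life(temp_c):.3f} x reference life")
```

By this rule, a chip at 85 C would be expected to last about one sixteenth as long as the same chip at 45 C. Whether that matters depends on how long the reference life is to begin with.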

A lot of computer stuff tends to run hot these days. Computers used to be designed to last about 8 to 10 years. These days they skimp on the cooling, or put such high performance chips in that they just can’t get them cool enough. They figure you’ll upgrade after a few years so it doesn’t really matter much anyway.

A lot of folks say that numbers like that are ok, but you’ll do better in the long run if you can get the temperatures down a bit.

That’s still pretty rough for high load. I bet an extra fan in your case blowing air out the rear would help. While CPUs and GPUs can take the heat, your hard drives can’t. I’d check the temps on those when you’re at 100% load. If you have poor ventilation, then the ambient temperature inside the case will be high, and that could spell an early HD death.

The video card, at least, cannot be further cooled. It is a dual-width card in its own sealed enclosure. The card’s own fan blows air into the enclosure and it comes out a vent in the back of the computer.

The hard drive temp reads at 49 C after the computer has been on for a few hours.

I have found that dust is a problem for video cards. The heat sink fan enclosure system gets clogged with dust. Blowing out this dust with one of those canned air thingies does wonders.

That used to be the case, but today’s CPUs run on much lower voltages than they used to, for speed and power reasons, so noise immunity can be a problem.

I’ve never heard about a problem with thermal noise, but other types of noise can be issues. If you have a bunch of internal wires running in parallel, have n - 1 of them do a 1 -> 0 transition and have the other do a 0 -> 1, you are going to see that 0 -> 1 transition delayed, perhaps enough for the wrong value to be written into the destination flop. People were worrying about this 10 years ago, but we’re finally seeing it today.

BTW, we qualify chips at high temperatures, so if there was a big thermal noise problem we’d probably see it before you did. I’m not a reliability person, but I know junction temperature is considered in computing reliability.

Take the cover off the case and see how much the temps change. If you see a big change (more than 3-4C) you should add another case fan. I prefer intake rather than exhaust fans as you can put a filter on it. A bunch of exhaust fans will put a slight vacuum on the case and dusty air gets in.

Huh. Until now I thought “thermal noise” was that crackly sound you get when you take long-johns off too fast.

My Nvidia 8800 GT with stock cooling is at about 190 degrees F at insignificant load. The case has good airflow and no dust clogging things up. The fan runs at a very low RPM until the card heats up to over 200 F. 212 F seems to be the limit where the fan starts running at full speed. Under load it stays a little bit below 212 F. This seemed very high to me, but the card should crank up the fan RPM by itself when it gets too hot. Normally it just runs the fan at 35%.

My point was about thermal noise, not other sources of noise. kT/e is about 25 millivolts. 5% of that is about a millivolt. Even with today’s lower supply voltages, the margins had better be large enough that a one millivolt increase in thermal noise isn’t going to create bit errors.
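The kT/e figure is easy to check with a short Python sketch. The constants are the standard SI values, 300 K is an assumed stand-in for room temperature, and the function name is made up for illustration.

```python
BOLTZMANN_J_PER_K = 1.380649e-23        # Boltzmann constant k, in J/K
ELEMENTARY_CHARGE_C = 1.602176634e-19   # electron charge e, in coulombs


def thermal_voltage_mv(temp_k):
    """Return kT/e in millivolts at the given absolute temperature."""
    return BOLTZMANN_J_PER_K * temp_k / ELEMENTARY_CHARGE_C * 1000.0


# Assumed room temperature of 300 K.
v_room = thermal_voltage_mv(300.0)
print(f"kT/e at 300 K: {v_room:.2f} mV")
print(f"5% of that:    {0.05 * v_room:.2f} mV")
```

Five percent of kT/e works out to a little over a millivolt, so "about a millivolt" is the right order of magnitude.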

On the other hand, there are other thermally activated processes that can lead to circuit noise. For example, leakage current typically depends exponentially on temperature. The shot noise due to that current could rise by orders of magnitude over a 50 degree temperature increase. That’s not thermal noise, but rather a non-thermal noise source that is strongly temperature dependent.