Computer hardware question about CPUs and heat sinks

Depending on which preset you use, Prime95 may not stress much memory. If one or more of your DIMMs has a flaw at a particular location, P95 might never try to read or write that region, while other programs (or the OS) could. Do run Memtest86+ overnight on your system and see what happens. It’s not foolproof, but IME it’s very good at telling you if you have a problem.
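
To illustrate the coverage point, here is a toy sketch in Python. It is a made-up simulation, not how Prime95 or Memtest86+ actually work internally; `ToyMemory`, `find_errors`, the memory size, and the faulty address are all hypothetical names and numbers chosen for the example. It just shows that a test touching only part of memory can pass cleanly while a full walk finds the bad cell.

```python
# Toy model only: not how Prime95 or Memtest86+ work internally.
MEMORY_SIZE = 1024      # hypothetical number of memory cells
FAULTY_ADDRESS = 900    # hypothetical location of the single bad cell


class ToyMemory:
    """Memory in which one cell always reads back with its lowest bit flipped."""

    def __init__(self, size, faulty_address):
        self.cells = [0] * size
        self.faulty_address = faulty_address

    def write(self, address, value):
        self.cells[address] = value

    def read(self, address):
        value = self.cells[address]
        if address == self.faulty_address:
            return value ^ 1  # simulate a flipped bit at the bad location
        return value


def find_errors(memory, addresses, pattern=0xAA):
    """Write a pattern to each address, read it back, and collect mismatches."""
    bad = []
    for address in addresses:
        memory.write(address, pattern)
        if memory.read(address) != pattern:
            bad.append(address)
    return bad


memory = ToyMemory(MEMORY_SIZE, FAULTY_ADDRESS)

# A workload that only touches the first half of memory never sees the fault.
partial_result = find_errors(memory, range(0, MEMORY_SIZE // 2))

# A test that walks every address does.
full_result = find_errors(memory, range(MEMORY_SIZE))

print("partial test errors:", partial_result)
print("full test errors:", full_result)
```

A real fault is rarely this tidy (it may only show up with certain patterns, or when things are warm), which is part of why an overnight run with many passes is the usual advice.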

Almost never is new computer equipment up-to-date right out of the box. By the time it has been manufactured, shipped across an ocean, distributed to warehouses, and then distributed to retailers who may have it on a shelf for weeks or months, most everything you buy for your computer will have out-of-date software drivers.

Indeed, when I install something new I use the provided drivers and then immediately update. I cannot think of a time when there was not a new driver already out.

You never know but definitely worth checking. Only takes a minute to find out.

I booted the CD and it came up and immediately started testing. I assume it’s using default settings. Should I let it run or do I need to configure it?

Default is fine; just let it run.

It’s running. Is there some end point where it indicates that it’s finished, or do I just stop it after a day or so?

By the way, THANK YOU to you and everyone else who’s been so helpful in this thread. I hope that I’m able to repay the favor one day.

The SDMB is amazing. It’s not even a technical board, but no tech board I’ve seen gets such a volume of useful responses so quickly.

Okay I read up on Memtest. It runs indefinitely and you stop it when you want.

Right now it’s on pass #15 with no errors. With the frequent problems I’ve been having, it seems likely it would have found something by now.

So it seems like it’s down to the CPU or the power supply. I suppose the video card or the audio card are also possibilities.

This all started when it was acting flaky about starting up and then eventually failed to boot at all. This was the point at which I replaced the motherboard.

Replacing the motherboard got it to where it would boot again. So presumably there was a motherboard problem and replacing it corrected it.

So I see a few possibilities.

[ul]
[li]There is some other issue totally coincident with the motherboard problem. This seems unlikely.[/li][li]There was another issue to start with and that issue damaged the original motherboard.[/li][li]Maybe the problem started with the original motherboard and that motherboard problem damaged something else.[/li][li]The issue was never with the motherboard but something about the new one (newer BIOS maybe?) is causing it to be slightly more tolerant of the real issue.[/li][li]Maybe my ignorance regarding thermal compound damaged the CPU.[/li][li]Maybe I somehow damaged something else when I installed the new motherboard.[/li][/ul]

I had another shutdown last night followed by a “thermal event” message when I turned it back on.

The shutdown was preceded by a total freeze where it wouldn’t take any input and the mouse cursor wouldn’t move.

I turned it off when I saw the thermal message and let it sit a few minutes then tried to boot it.

The power light came on, the disk activity light flickered a few times then nothing. I tried several times, waiting a few minutes between tries. During these tries I was unable to shut it off with the power button and had to pull out the power cord instead.

Finally it did something different at bootup and beeped three times (memory error) before hanging, so I turned it off and removed everything but the bottom two gigs.

It still hung on booting a few times, but with no beeps. Finally it booted successfully into Windows again. It ran a while and then hung again, so removing the memory didn’t completely fix it, and presumably the removed memory is good (it did pass the Memtest86+ tests).

Here’s a theory. I’m seeing behavior similar to what I saw before I replaced the MB, except that I didn’t get the thermal event messages before. Maybe the problem has been thermal all along. Maybe the original MB just didn’t report it and the newer MB is better at detecting and/or reporting thermal problems, maybe because of a newer BIOS.

Maybe the old MB wasn’t really bad, but I just didn’t give things time to cool down before giving up on it.

The one thing that I can’t explain with that theory is the behavior of the power light with the old MB. The power button has a bright blue LED which lights up when there’s power. When the machine started giving me problems booting, and before it died completely (or at least I thought it had), that blue LED wasn’t lighting or was very dim, even when the machine was on and seemed to be running normally. Since replacing the MB it’s always been very bright.

So, why did replacing the MB fix the LED issue but not the other issues? I’ve been wondering if the LED may indicate a power supply issue, but if there wasn’t enough juice to power the LED, I would think that there wouldn’t be enough to boot up and run the computer.

I keep thinking that there must be one point of failure (presumably other than the MB) that’s affecting, and possibly damaging, other things and causing all the different issues I’m seeing.

Still getting thermal shutdowns.

Suggestions? Should I reapply the thermal compound?

Remember, too much thermal compound is not good. You want it as thin as possible while still doing the job.

And again, getting air into the case and back out is as important as the heat sink. The heat sink cannot do its job well if there is hot air in the case.

Pay attention to air flow in the case. Both intake and exhaust fans. Visualize how the air moves in the case. You want good air flow past the CPU and video card. Remember, the video card can produce substantial heat too (depending what you are doing).

Also, clean up cables. If it is a rat’s nest of cables in there that hinders good air flow. Do some cable management and clean things up. Clean off lint and dust too. Dust on the heatsink diminishes its effectiveness (and affects the fans).

The airflow is the same as it’s been for years; plus I added a PCI fan. So, if anything, there is more airflow.

The cables are a possibility, since I did replace the MB they’re not necessarily arranged as they originally were. However, I haven’t put the side back on the case yet. Shouldn’t the side being open help?

I cleaned out the dust when I replaced the MB.

Sometimes the fans can’t do their jobs with the case side off.
They could wind up re-circulating warm air rather than getting the cold in and the hot out.
In your shoes, I would certainly see what happened with and without the case fan.
I’d also buy some hardware (or even use some tape) that I would use to re-arrange the cables to maximize airflow.

On edit:
Also, just for kicks, I might take a powerful box fan and put it right up against the open side of the case. I’m talking about a fan that could provide stupid-high amounts of airflow. If you still have problems with that much airflow going across your board, then you’ve developed a defect in your mobo or cpu.

Actually having the side open can hurt.

A closed case should make a river of air flow past the CPU. When it is open this may not happen.

It depends, but an open case is not necessarily a cooler case.

Try pointing a floor fan at it with the case open (from some distance away as fan motors put out a fair bit of EM interference). See if that helps. It should at least tell you if the problem is air flow or a poorly mounted heat sink.

Personally I use tension screws on my heat sinks (generally a feature of higher end heat sinks) which means the pressure on the CPU is dummy proof.

Doing it without those is harder. The heat sink needs to be on pretty firmly against the CPU to work but not so firmly you crack the CPU.

As I said before, the heat sink has tension screws. It screws into the motherboard and there’d be no way to keep it on without them.

The idea that the open case could actually be a problem is interesting. It seems to be the one factor that’s been there through all of this.

This all started when I installed a new hard drive. I wasn’t having these issues before that. After putting in the new drive, I hadn’t bothered to put the side back on. I wanted to see how it worked before putting it back on and then I just got lazy and didn’t bother.

So all of the problems have occurred since I’ve been running with the side off. That’s the one constant factor.

I’ve been running with the case closed and the cables as out of the way as is possible.

I’m still having problems. Sometimes it shuts down completely and on reboot tells me that a thermal event occurred. Other times it just totally freezes, won’t accept keyboard or mouse input, and has to be shut off via the power button. Sometimes the power button won’t even shut it off and I have to pull the plug.

I’ve replaced the mother board, the memory passed hours of testing with memtest86+, I’ve applied thermal paste properly (at least I think I did it properly), I’ve arranged the cables, I’ve cleaned out any dust, and I’ve run with the case both open and closed.

None of that has solved anything. A couple of times I thought I’d solved it but the problems eventually returned.

So I think it’s down to the CPU, the power supply, or possibly the graphics or audio cards. Since I’m getting thermal events, it seems like it must be the CPU.

So I guess the next step is to replace the CPU. Anyone agree or disagree?

My career in PCs is 14 years old. I agree with CPU.

I originally replaced the MB because it wouldn’t boot at all. Replacing it at least fixed that. Could a failing CPU have damaged the original MB? Could a failing MB have damaged the CPU?

My theory at this point is that the original problem really was a motherboard failure and when I ran the new motherboard without proper thermal protection I damaged the CPU. :smack:

Your question 1: Yes, absolutely. Hot CPU burns mobo with 200 degrees, sure.
Question 2: possibly. If the regulators on the mobo fed the wrong volts to the CPU, it could happen.

Your theory: Plausible.

One argument against your theory is that most modern CPU/mobo combos have design features that should keep thermal fails from breaking stuff.
I would go ahead with your plan.

Can I put in a faster / more powerful processor than the old one? If so, how can I tell what would be compatible with my motherboard?

The best bet is to check the docs for your motherboard.

From what I can see, it already has the most powerful one listed in the specs. It’s a dual core Intel Pentium D 820 Processor.

Since it’s an older MB, I guess I was wondering if there is a faster version of the processor that will work on the board. Probably not, I guess.