It’s a good idea to make a disk image backup first, in case you forget anything or later want anything from it.
I use portable versions of as much software as possible, and put them all in sub-folders under one ‘programs’ folder. That makes it very easy to back them up or move them without losing any settings.
It’s the settings and options that you don’t want to lose and have to do all over again.
My concern is that I do this and it turns out to be a hardware problem.
Would that require formatting the disk I use as my C drive?
Before you re-install, be sure to save all your work somewhere else (ideally the cloud, though a thumb drive can suffice).
If you have a hardware problem then re-installing Windows only highlights the actual problem. It does not make it worse. It just shows there is an issue that needs sorting. Maybe you do not want to have to spend money and think limping along is the way to go.
I am hard pressed to think that is the best solution, but if you absolutely cannot afford to fix your PC then maybe limping along is the best choice. If that is the case, chances are it will only get worse. Be sure to back up your files to something else. Just in case.
Given that it is NMI_HARDWARE_FAILURE, almost certainly there is a driver misbehaving, and it may be due to hardware misbehaving. While you’re updating your drivers, don’t forget any OEM drivers on the motherboard: video, sound, network, etc.
Given that there’s some stuff you didn’t remove, you might take another stab at finding ways to disconnect more hardware and let it sit overnight or something.
I will say I’ve had more hardware crashes with self-built machines than store-built. I don’t know whether it’s due to quality problems, or cost-cutting, or mixing incompatible types of high-performance gear. Or it could just be my stupidity, using too much force or something like that.
Given that you mentioned that you’re ready for a rebuild, you might begin that process incrementally by replacing parts, starting with the RAM sticks, to isolate the problem. You should be able to get refunds for most things.
That’s why I suggest trying a Live CD of Linux. It allows you to switch your operating system and see if it still freezes up when left alone. It’s imperfect, because it isn’t going to be using the hardware in the same way, but it is still a pretty good way to eliminate the operating system as the issue.
Debian’s page for a bootable DVD/USB image is here: Debian -- Live install images
Money is not the issue. Reinstalling Windows and then reinstalling all my apps and then reconfiguring everything is not hours, it’s weeks, as the little moles pop up and I whack them down one by one. I just got a new phone and it took me a week just to get that the way I want it.
If I have to do all that, I am more inclined to use the nuclear option and get a new motherboard/CPU/memory and just be done with it.
Before the crash, your CPU was running at 95% at night when nothing was supposed to be running. That suggests to me that it’s a software problem.
Have you checked all background processes using Autoruns? It will show a LOT of things (including Windows processes) that other startup utilities won’t show.
Also, use process logging to find out what’s running when the CPU reaches a 90% threshold. I think the utility I linked to can do that.
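If no ready-made utility does the job, the threshold logging idea can be roughed out in Python. This is only a sketch under some assumptions: it leans on the built-in Windows `tasklist /v /fo csv` command with English column names ("Image Name", "PID", "CPU Time"), it estimates load from the change in each process's cumulative CPU time between two snapshots, and the file name, interval, and function names are all made up for illustration.

```python
import csv
import io
import os
import subprocess
import time
from datetime import datetime

THRESHOLD = 90.0             # percent of total CPU, the figure mentioned above
INTERVAL = 60                # seconds between snapshots (arbitrary choice)
LOG_PATH = "cpu_spikes.log"  # hypothetical file name


def parse_cpu_time(text):
    """Convert tasklist's cumulative 'H:MM:SS' CPU time to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_tasklist(csv_text):
    """Map (pid, image name) -> CPU seconds from 'tasklist /v /fo csv' output."""
    usage = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            usage[(row["PID"], row["Image Name"])] = parse_cpu_time(row["CPU Time"])
        except (KeyError, ValueError, AttributeError):
            continue  # skip rows that do not have the assumed columns/format
    return usage


def cpu_deltas(before, after):
    """CPU seconds used per process between two snapshots, busiest first."""
    used = []
    for (pid, name), seconds in after.items():
        delta = seconds - before.get((pid, name), 0)
        if delta > 0:
            used.append((name, pid, delta))
    return sorted(used, key=lambda item: item[2], reverse=True)


def total_percent(deltas, elapsed, cores):
    """Rough overall CPU load: CPU seconds used / CPU seconds available."""
    return 100.0 * sum(item[2] for item in deltas) / (elapsed * cores)


def snapshot():
    """Run tasklist once (Windows only) and parse its output."""
    result = subprocess.run(["tasklist", "/v", "/fo", "csv"],
                            capture_output=True, text=True, check=True)
    return parse_tasklist(result.stdout)


def main():
    cores = os.cpu_count() or 1
    before, started = snapshot(), time.monotonic()
    while True:
        time.sleep(INTERVAL)
        after, now = snapshot(), time.monotonic()
        deltas = cpu_deltas(before, after)
        percent = total_percent(deltas, now - started, cores)
        if percent >= THRESHOLD:
            with open(LOG_PATH, "a", encoding="utf-8") as log:
                log.write(f"{datetime.now():%Y-%m-%d %H:%M:%S} total {percent:.0f}%\n")
                for name, pid, seconds in deltas[:10]:
                    log.write(f"    {name} (PID {pid}): {seconds}s CPU\n")
                log.flush()
                os.fsync(log.fileno())  # push to disk in case the machine freezes
        before, started = after, now


if __name__ == "__main__":
    main()
```

The log is reopened and synced on every write so that whatever was recorded just before a hard freeze should still be on disk after the power-button restart. Since tasklist reports CPU time in whole seconds, the numbers are coarse, which is why the interval is a full minute.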
I have already downloaded and installed ProcDump and have been checking into how it works. It will monitor specific processes, but it doesn’t look like it monitors the whole system. It will dump a log if a particular named process exceeds a threshold you set. Although I’m not sure what I would do with that dump file once I had it.
I installed it and ran it but the doc is a little light. I can’t figure out what the color coding means, or how to disable a process from starting on boot. I’ll slog my way through it.
See what is running.
Free and already built in to your system.
Yes, thanks, I use it all the time. I also have Process Explorer. However, they don’t log anything (that I know of).
It honestly is not weeks.
I do this for a living (part of what I do).
I understand it is a hassle but it is a few hours and you don’t even need to just sit there and do nothing as things download. You can cut the lawn and make dinner and vacuum the house while the stuff downloads and installs.
Unless you are on a dial-up connection. If your download speed sucks that bad find a friend with better bandwidth and do it at their house.
Buying a new PC will take nearly as long to get set up. You are only skipping a Windows download.
I get that this is a hassle and a chore and it will gobble up several hours of your day but it is entirely doable and you get a spiffy clean PC like the day you unboxed it and, hopefully, without spending any money on it (maybe the cost of a thumb drive if you do not have one).
Sorry, I was wrong about ProcDump - it will only monitor one process at a time. Apologies.
The correct tool to use seems to be Windows Performance Monitor, which should already be on your system.
You can create ‘Data Collector Sets’ that will log anything you choose - processes, memory, CPU usage, etc. etc. to a file of your choice, in a variety of formats, at intervals you specify.
It’s a complex tool and a bit messy to use, but it lets you monitor practically anything on the system in many different ways, and there’s no question it can do what you need.
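Once a Data Collector Set has been logging for a night, the interesting part is the last sample written before the freeze. Here is a rough Python sketch for pulling that out. It assumes the set was saved in comma-separated format and includes the `Process(*)\% Processor Time` counter, so the header cells look like `\\PCNAME\Process(name)\% Processor Time`; the function name and the log file name are made up for illustration.

```python
import csv
import io
import re
import sys

# Matches header cells such as \\PCNAME\Process(System)\% Processor Time
INSTANCE = re.compile(r"\\Process\((?P<name>[^)]+)\)\\% Processor Time$")
SKIP = {"_Total", "Idle"}  # aggregate and idle instances, not real culprits


def last_sample_top(csv_text, top=5):
    """Return (timestamp, [(process, percent), ...]) for the final logged row."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if len(rows) < 2:
        return None, []
    header, last = rows[0], rows[-1]
    readings = []
    # zip() stops early if the final row was cut short by the freeze
    for column, cell in zip(header[1:], last[1:]):
        match = INSTANCE.search(column)
        if not match or match.group("name") in SKIP:
            continue
        try:
            value = float(cell)
        except ValueError:
            continue  # blank cell: no reading for this process in this sample
        readings.append((match.group("name"), value))
    readings.sort(key=lambda item: item[1], reverse=True)
    return last[0], readings[:top]


if __name__ == "__main__":
    # Usage: python last_sample.py overnight_log.csv   (hypothetical file name)
    with open(sys.argv[1], newline="", encoding="utf-8", errors="replace") as handle:
        stamp, busiest = last_sample_top(handle.read())
    print("Last sample:", stamp)
    for name, percent in busiest:
        print(f"  {name}: {percent:.1f}")
```

Note that this counter is reported per core, so a busy process on a multi-core machine can show more than 100. If the last timestamp is well before the time the machine was found frozen, that in itself narrows down when it locked up.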
I know it’s not a big deal to reinstall Windows. I have done it before, most recently when I built this machine. (Since the original build I have replaced the motherboard and was lucky that it came up on the existing boot drive without reinstalling Windows.)
The thing that takes time is reinstalling all my software. If I have to install to a formatted boot disk then I will have to back up my 1TB SSD C drive first. I will have to go through all my downloads and figure out what to install (vs. what is deadwood). I might even have some older stuff that has to be installed from DVDs. Microsoft Office is a pain in itself: I have custom start folders, ribbon customizations, and QAT custom buttons, all of which will have to be configured and reloaded. Set up my desktop, screen saver, Start panel, taskbar. Application configuration like backup settings, default folders. And I know from experience that every time I use something for the first time, something will have to be tweaked (browser plug-ins, etc.).
I will do that if I am convinced there is an 85% chance it would solve my problem. I would prefer to do as much hardware diagnosis as possible before I do that. The fact that the machine just totally freezes, rather than getting really slow or otherwise degraded, suggests to me a hardware issue. It might be related to drivers or settings rather than the physical hardware, but I’m not sure how to troubleshoot further.
Yes, that would indicate something undesirable going on, but I have seen the machine hit 100% and it doesn’t crash. It doesn’t do much of anything but it stays up. The really troublesome thing to me is that it just freezes, completely unresponsive. I have even tried leaving it alone to see if something is hung and recovers, but it stays that way (for over 8 hours, last time I let it go). I have to do a power-down restart. At least the board responds to holding the power button down, but that is probably a very primitive function that is designed to work no matter what else is going on.
For giggles, press CTRL+ALT+DEL, open Task Manager, then select the Startup tab.
Disable everything in there that is not absolutely necessary (like your anti-virus…let that run). Sound is also common in there.
If not sure, disable it. It will not delete anything and you can go back and turn things back on if you want.
Once you have disabled almost everything in there restart your PC.
See if that helps things.
Actually I was already in the process of doing that.
I even found a couple of things to outright uninstall.
Heheeh, while nothing was hot-swappable, they did actually perform pretty well. That was the least crazy crazy thing about that place, to be honest. When I can get all the loony into a coherent narrative, I’ll do an MPSIMS thread on it.
The CPU seems to be constantly running hot.
I would check that the heatsink is seated correctly with thermal paste.
Had a problematic computer once that had a loose heatsink on the CPU.
If you have Win10, right-click the taskbar, open Task Manager, and check which process is using 90% of the CPU.
Happened again at 3:41 AM last night. Since I can’t take screenshots on a dead computer, the best I can do is phone photos.
I had Task Manager up sorted by CPU. Total CPU was 91%. The top processes were System at 27%, Malwarebytes at 27% (probably doing a scan), and WMI Provider Host at 16%. I don’t know what System was doing, or what that third process is. I also can’t figure out where the other 21% is coming from, although it is probably a zillion processes at low levels.
Malwarebytes is a suspect for these nighttime outages since it is scheduled to scan every day at 2:49 AM. That doesn’t explain other outages, but I have turned off scheduled scans for now.
I also had HWInfo64 running showing that the cores are not maxed out but hot. It is possible that high temps are an issue although this machine has been running at least three years with no problems until recently. I could certainly try re-installing the CPU fan with new paste (I’m trying to remember now, this might have been one of those no-paste setups).