SQL server - mysterious heavy disk activity

At my work, we’ve recently had a few instances of heavy disk activity on one of our SQL servers, causing timeouts and general unresponsiveness. The hard drive lights on the front of the machine are pegged. In one case we were able to find a backup job that had fired at the wrong time; stopping it resolved the problem. Other times, we’ve been unable to locate any cause for the heavy activity.

Does anyone have any troubleshooting tips? Any easy way to isolate the origin of heavy disk activity?

Using task manager on the server, or using perfmon, you can monitor the disk I/O for each process running on the server. This might help.

Also, lock down the disks so you are sure no one is copying files to or from them over the network.

How many users have access? Do any of them have the ability to launch bad queries?

Every so often, our database owners (we have thousands of DBs) learn the hard way what happens when an inexperienced user is given select_catalog_role. :eek:

I’ve solved a few performance issues on SQL server with Performance Monitor, but they were all memory related. As crazyjoe suggested, you need to get some data from your server, exactly what’s going on, and perfmon.msc is a good place to start. Google on, say, “sql performance I/O site:microsoft.com” (without quotation marks of course), and you’ll probably isolate the problem with a few hours work.

It just struck me, while reading a cartoon somebody in another thread linked to, about tequila, that I actually had an I/O issue once on a SQL server. I was due to ill written reports - when somebody clicked “Show this report”, it sucked the hell out of the disks.

(With perfmon I saw that while there was unexpected paging going on, the disk read was close to beyond belief. With the monitoring perfmon gives, I saw that the disks was only overloaded for specific times, and then, investigating this with developers, marketing executives, et al, I could pin point this to an “if this report is run, the overload on the disks are awful; ergo, the report has to be re-written” - which the developers did.)