OK to do a hard reboot during chkdsk on Vista? RAID 0 write error.

Well I guess the obvious answer is no, but I’ve done it anyway, so I guess my GQ (1) is: are computers these days designed to cope with sudden power loss in a hard reboot (the kind where you hold the power button in until it turns off)?

I got an error something like ‘raid 0 write error, it should be ok to backup your data’. Everything seemed fine. I rebooted the computer anyway, and it showed ‘disk error’ on the pre-POST RAID screen. But the computer booted into Windows fine. I went into the Intel RAID disk manager thing and clicked ‘mark drive normal’. Did another reboot and the pre-POST screen showed both disks fine.

Should I be worried? Is the drive on its way out? Or am I just experiencing an inevitable one-in-a-million disk write error, which the RAID controller is designed to flag so that you back up your data?

Anyway, going back to the original GQ. The reason I rebooted while chkdsk was running was that I’d set my partitions (two partitions on one RAID 0 volume made up of two physical drives) to run chkdsk at reboot. I had no idea how long it would take!! Three hours in, the first partition (250 GB) had just finished. The other partition (750 GB) was clearly going to take many more hours. I tried using the keyboard to stop it but it ignored me, so I held the power button in.

Only, on reboot it began the chkdsk again.

I eventually got it to stop doing this by choosing ‘last known good configuration’ on the F8 screen, and all is tentatively well at the moment.

RAID 0 is not a good idea unless you have good backups. And yes, you should be worried: if either disk goes, you’re hosed.

I don’t need to back anything up. It would just be an inconvenience.

And of course I would be annoyed if one of the drives goes.

What I’m asking is - should I be worried that one of the drives might go?

In other words what are the chances of one of the drives going?

There is a 100% chance that one of the drives will go; it just depends on what length of time you are looking at.
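
To put rough numbers on that, here is a minimal Python sketch of how the odds of losing a RAID 0 array grow with time. It assumes drives fail independently and with a constant yearly probability; the 3% per-drive figure is purely an illustrative assumption, not a measured value for these disks, and the function name is made up for the example.

```python
def array_failure_probability(p_drive_per_year: float,
                              drives: int = 2,
                              years: int = 1) -> float:
    """Chance that at least one drive in a RAID 0 set fails within `years`.

    Assumes independent failures and a constant per-drive yearly failure
    probability. RAID 0 has no redundancy, so one failure loses the array.
    """
    # The array survives a year only if every drive survives that year.
    survive_one_year = (1.0 - p_drive_per_year) ** drives
    # It survives `years` years only if it survives each year in turn.
    return 1.0 - survive_one_year ** years


if __name__ == "__main__":
    # 0.03 is an assumed, illustrative per-drive yearly failure probability.
    for years in (1, 3, 5, 10):
        p = array_failure_probability(0.03, drives=2, years=years)
        print(f"{years:>2} years: {p:.1%} chance the array has failed")
```

Under these assumptions the figure creeps towards 100% as the time span grows, and a two-drive stripe always fares worse than a single drive.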

Is that sarcasm, or are you saying that now that there’s been an error, the chances of one of the drives going have shot up?

Could it be a controller problem?

No sarcasm intended. RAID 0 just stripes writes across all the disks. There is no redundancy: if one drive fails, you lose everything in the array (thus the suggestion by others to back up).

Every drive on the market will fail at some point in time, and if your system has indicated a write error then there IS a problem. The question becomes: was it a problem localized to that one sector, or is your drive or drive controller failing?

That we can’t answer.
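
As an illustration of why one bad drive takes the whole RAID 0 array with it, here is a small Python sketch of how striping spreads consecutive blocks over the disks. The stripe size of 4 blocks and the function names are assumptions chosen for the example; a real controller uses its own configured stripe size and on-disk layout.

```python
def locate_block(logical_block: int, drives: int = 2, stripe_blocks: int = 4):
    """Map a logical block number to (drive index, block offset on that drive).

    Simplified RAID 0 layout: stripes are handed out to the drives in
    round-robin order. `stripe_blocks` is an assumed stripe size.
    """
    stripe = logical_block // stripe_blocks      # which stripe the block is in
    drive = stripe % drives                      # round-robin over the drives
    offset = (stripe // drives) * stripe_blocks + logical_block % stripe_blocks
    return drive, offset


def drives_touched(first_block: int, block_count: int,
                   drives: int = 2, stripe_blocks: int = 4) -> set:
    """Return the set of drives that hold some part of a run of blocks."""
    return {
        locate_block(b, drives, stripe_blocks)[0]
        for b in range(first_block, first_block + block_count)
    }


if __name__ == "__main__":
    for block in range(10):
        print(block, locate_block(block))
    # Any file longer than one stripe ends up on both drives.
    print(drives_touched(0, 8))
```

Because almost every file larger than one stripe has pieces on both drives, losing either drive leaves nearly every file incomplete.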

Would chkdsk answer that? (If I let it run)

You’ve got to differentiate here between logical errors, which at worst can be fixed by a reformat, and physical errors, which are unfixable. If you’ve got a logical error, chkdsk may well fix it; if you’ve got a physical error, chkdsk will mark the sectors as damaged and minimise the damage. The problem here is that you’re likely using IDE drives (maybe with a SATA interface but they’re still IDE). IDE drives should remap bad sectors automatically. If this is not happening, then you’ve filled the emergency area (there’s a specific term, but I forget the name) and this is a harbinger of Bad News.

Check the Event Log - it should tell you which drive has problems.

chkdsk will try to move readable data from a bad area to a good one, but the question still remains why the area is bad.

I know SpinRite has been used to “refresh” drives, resolving some logical errors. I’ve used it myself to recover some drives. Also, if your drives support S.M.A.R.T., enable it and you will get more data.
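
For what the S.M.A.R.T. data might look like in practice, here is a hedged Python sketch that picks the sector-health counters out of an attribute table in the layout printed by smartmontools’ `smartctl -A`. The sample text is made up for illustration and is not output from the drives in this thread; the helper names are hypothetical, and whether a tool can read S.M.A.R.T. data through the Intel RAID controller at all is an open assumption.

```python
# Attributes whose raw value going above zero suggests physical trouble.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

# Made-up sample in the column layout of `smartctl -A` (illustration only).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""


def parse_smart_attributes(text: str) -> dict:
    """Return {attribute name: raw value} for rows with a numeric raw value."""
    raw_values = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least ten columns;
        # the header row starts with "ID#" and is skipped by this check.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            raw_values[fields[1]] = int(fields[9])
    return raw_values


def worrying_attributes(raw_values: dict) -> list:
    """List the watched attributes whose raw counter is above zero."""
    return [name for name in WATCHED if raw_values.get(name, 0) > 0]


if __name__ == "__main__":
    attrs = parse_smart_attributes(SAMPLE)
    print(worrying_attributes(attrs))
```

A reallocated-sector count that is non-zero and rising would point towards a physical fault on that drive rather than a one-off controller or driver hiccup.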

These are the errors for the day it happened…

“The shadow copies of volume C: were aborted during detection.”

“The device, \Device\Ide\iaStor0, did not respond within the timeout period.”

I googled the second, and got this link - http://www.intel.com/support/chipsets/imsm/sb/cs-025783.htm - which suggests it’s an issue with the Intel storage manager and Windows.
Just posting this for information.

Here is a good way to judge the kind of error you are seeing: it can indicate the drive is failing. If it’s not critical data or a production server, let it ride, provided you have backups. If it’s something mission critical, I’d replace the drive at the first sign of trouble.