Raid errors. How can I confirm the problem is the drive and not the raid controller?

I’ve been having raid errors lately. I have two 500gb drives in raid 0 config.

Since the errors started I’ve been marking the drive as OK and just carrying on. Everything seems to work fine, but the errors are becoming more frequent, and they are always on channel 0 (suggesting it is always the same drive).

Today I decided to schedule a disk check (Vista). I started it at 5pm. When I checked it again at around 8pm it seemed to be stuck on ‘checking cluster 362296’. I checked again shortly after midnight. It hadn’t moved from 362296, so I aborted it (hard-rebooted the computer by holding the power button in).

This may sound like a stupid question but I am going to ask anyway: What would happen if I physically swapped the drives? (swap the cables).

Reason I ask is, if I put the suspect drive on channel 1 instead of 0, and then the errors get reported on 1 that will confirm that the problem is the drive, and if the errors get reported on 0 that will confirm the problem is the controller.
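
To spell out the logic of that swap test, here is a minimal sketch in Python. The function name `diagnose_after_swap` is made up for illustration, and it assumes a two-channel controller (0 and 1), that errors are reported per channel, and that only the drives trade places.

```python
def diagnose_after_swap(suspect_channel_before: int, error_channel_after: int) -> str:
    """Interpret where errors show up after the two drives trade channels.

    Hypothetical helper: assumes channels are numbered 0 and 1 and that the
    suspect drive was moved to the other channel.
    """
    suspect_channel_after = 1 - suspect_channel_before
    if error_channel_after == suspect_channel_after:
        return "drive"       # the errors followed the drive to its new channel
    return "controller"      # the errors stayed on the original channel


print(diagnose_after_swap(0, 1))  # suspect drive moved 0 -> 1, errors now on 1
print(diagnose_after_swap(0, 0))  # suspect drive moved 0 -> 1, errors still on 0
```

One caveat: if each cable moves along with its drive, a "drive" result could also point at that cable, so the test only separates "this channel" from "whatever was moved".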

I’m fairly sure it IS the drive, but I don’t want to go ahead and buy two* new drives only to find that the problem was the raid controller.

[sub]*Google tells me these drives have problems anyway, so I don’t want to just replace the faulty drive with a known-to-be problematic identical drive. I want to get two new differently branded drives[/sub]

Of course the first step, if you want to keep the data and haven’t done it already, is to image your drive/s. If your error rate is that high the whole thing can die at any minute.

You should be able to swap the channels without creating any additional problems. Also if you don’t have any extra cables you can just do a straight cable swap to see if the problem follows the cables.

Really though, with the size and speed of modern drives raid 0 has little practical benefit with a relatively large cost in reliability. Just say no to raid zero. :wink:

I don’t have enough room on anything to image the drives.

But I do have room on various devices to keep what I want to keep, so I’ll be copying stuff to those devices tomorrow or when I get time.
If there isn’t much speed benefit in raid 0 then I could just stick to using the healthy drive (after reinstalling everything onto it, obviously. I doubt there’s a way to ‘transplant’ the data off the unhealthy drive onto the healthy drive… is there???)

Not without copying it off to somewhere else first. IMO if it’s not the controller or cables then both drives, having the same history, are equally unreliable and require replacement. I assume they are no longer under warranty?

Raid 0 is striping - data is distributed across the two disks to improve performance, and there is no redundancy. Given the simplicity of most low-end RAID controllers, I doubt that you can swap the cables - the controller probably assumes that data starts on channel 0 and continues on channel 1. Swapping the disks will confuse the hell out of it and you will end up with no data.
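
For anyone wondering what "distributed across the two disks" means in practice, here is a rough sketch of how a striped array might map a logical position onto a disk. The 64 KB stripe size and the name `locate` are assumptions for illustration, not anything read from this particular controller.

```python
STRIPE_SIZE = 64 * 1024  # assumed stripe size in bytes; real controllers vary
NUM_DISKS = 2            # the two-drive array from this thread


def locate(logical_offset: int, stripe_size: int = STRIPE_SIZE,
           num_disks: int = NUM_DISKS) -> tuple[int, int]:
    """Return (disk index, byte offset on that disk) for a logical byte offset."""
    stripe_index = logical_offset // stripe_size   # which stripe overall
    disk = stripe_index % num_disks                # stripes alternate between disks
    row = stripe_index // num_disks                # how many stripes precede it on that disk
    return disk, row * stripe_size + logical_offset % stripe_size


for offset in (0, 64 * 1024, 128 * 1024, 192 * 1024):
    print(offset, locate(offset))
```

Consecutive stripes alternate between the disks, so every large file ends up with pieces on both. Whether swapping the cables breaks this depends on whether the controller identifies the members by port or by metadata stored on the disks, which this sketch does not model.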

There will also probably be no way of merging the data, without copying the disks off on to an external device.

Si

Are you sure you mean RAID 0 and not RAID 1? RAID 0 is striping and incredibly inappropriate for most normal situations. Go buy yourself a 1-2 TB drive and use that instead. Put the non-failing drive in a USB enclosure.

I definitely do mean raid 0. I wanted the increased speed.
Now I can’t decide whether to buy two new drives and raid0 them (people at work say raid0 IS faster than non-raid) or take the advice of people in this thread and buy a large non-raid drive.

RAID 0 is faster, but unless you’re doing disk-intensive work, you really won’t notice it. If you want a fast disk, get an SSD. You’ll notice that!

The big problem with RAID 0 is that if one part fails, it all fails. This doesn’t particularly matter when you’re using the area as temporary storage, but for longer-term storage, it’s not good.
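
To put a rough number on "if one part fails, it all fails", here is a back-of-the-envelope sketch. The 5% per-drive failure chance is a made-up figure for illustration, and the calculation assumes the drives fail independently, which two identical drives with the same history may not.

```python
def array_failure_probability(p_single: float, num_disks: int = 2) -> float:
    """Chance that a striped array is lost, i.e. that at least one member fails.

    Assumes independent failures with the same probability for every disk.
    """
    return 1 - (1 - p_single) ** num_disks


p = 0.05  # hypothetical chance that one drive fails over some period
print(round(array_failure_probability(p, 1), 4))  # single drive
print(round(array_failure_probability(p, 2), 4))  # two-drive RAID 0
```

With those assumed numbers the two-drive stripe is lost in roughly 9.75% of cases versus 5% for a single drive, so the risk nearly doubles.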

SSD is expensive, for a small amount of space.

I think what I’ll do is sacrifice the extra speed for the simplicity of non-raid, and appease myself by getting more, and faster, RAM.

I’m now running on a single non-raid 500gb drive, and it does seem slower (Windows takes a noticeably longer time to start up, and apps seem to take longer to load, even Firefox).

I had one of those ‘media players’ that’s just a drive inside a box with connectors for TVs and some basic software inside to run it. I decided to have a look at what drive is inside. Turns out it’s a fairly decent Seagate (Barracuda) 500gb drive. So I’ve liberated it from the player and re-imprisoned it in my desktop PC.

I’ve used it to backup some stuff I wanted to keep.

Then I reset the two raid drives to non-raid.

Took out the faulty drive.

Installed Vista on the other drive (the former raid drive)

So, either I’ll get used to the apparent slower-ness (it may be partly an illusion, though it does actually score lower on Windows’ performance thing: 5.8), or I’ll buy an identical drive and re-raid them, or buy two new different-brand drives and raid them.
My raid0 drive(s) scored 5.9 on Windows. But since 5.9 is the ‘11’ on the volume control (the top of Vista’s scale), that gives no indication of how much better the raid0 setup really was than the non-raided drive.

Right. I’ve now plugged in the bad drive and am currently running the check-for-errors thing on it. It’ll take a while, but at least it’s letting me run it in Windows where I can do other stuff. Before I abandoned the raid0 setup, the disk check at startup (pre-Windows, or as I still like to call it, ‘dos’) would take a stupidly long time or just hang at one exact point. It never ever completed (because I always ended up rebooting to get rid of it). I suspect chkdsk or fdisk or whatever Vista uses for its ‘dos’ check doesn’t cope well with two drives raided as one and partitioned as two.