I’m setting up a Linux box at home to use as a file server with two hard drives. Two options I’m considering are:
[ol]
[li]RAID-1 array (mirroring)[/li]
[li]Run rsync from cron every night to mirror one drive to the other[/li]
[/ol]
I believe option 1 has slightly better performance, since reading from a RAID-1 array is supposed to be slightly faster than reading from a single drive. It’s also better protection against disk failure, whereas with option 2 I could lose a whole day’s work. But right now I’m leaning towards option 2 because it protects the data from user error, like accidentally erasing or overwriting old data (as long as the lost data is at least a day old and I notice the error before the end of the day). Can you think of other reasons to favor one over the other? Or other options?
By the way most of the data is eventually backed up to CDR. The second hard drive is just a first line of defense.
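For reference, the nightly rsync mirror in option 2 could be sketched roughly like this. The mount points `/mnt/disk1/` and `/mnt/disk2/` and the helper names are hypothetical, not taken from any real setup; `-a` and `--delete` are standard rsync options. Leaving `--delete` off means files erased on the first drive survive on the mirror, which fits the "protect against user error" goal, at the cost of the mirror slowly accumulating stale files.

```python
import subprocess

# Hypothetical mount points; substitute the real ones.
SOURCE = "/mnt/disk1/"
MIRROR = "/mnt/disk2/"


def build_rsync_command(source, mirror, delete=False):
    """Return the rsync argument list for a one-way mirror of source into mirror."""
    cmd = ["rsync", "-a"]  # -a: archive mode (recursive, keeps permissions and times)
    if delete:
        # Also remove files from the mirror that no longer exist on the source.
        cmd.append("--delete")
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    cmd += [source.rstrip("/") + "/", mirror]
    return cmd


def run_nightly_mirror():
    """Entry point a cron job could call; raises if rsync exits non-zero."""
    subprocess.run(build_rsync_command(SOURCE, MIRROR), check=True)
```

A crontab entry would then invoke a small script that calls `run_nightly_mirror()` at whatever hour the box is idle.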
If you go with RAID1, use a hardware implementation. Software implementations take a performance hit, whereas hardware performance is identical to single-drive performance. RAID 0&1-capable EIDE adapters are common and cheap, and they keep your data mirrored at all times, even if there is a power interruption at night when your cron job is supposed to be running.
Good backup software would be the way to go to prevent loss of data due to user error. Do a full backup on a regular schedule (weekly or monthly, depending on the size of the data) and differential backups nightly between full backups. All good backup options can be scheduled, and if you use an external medium for storage, you gain the additional protection of being able to store data off-site.
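To make the full/differential idea concrete, here is a minimal sketch of the scheduling logic, assuming a weekly full backup on Sundays. The function names and the plain dictionary of modification times are made up for illustration; real backup software handles all of this internally.

```python
from datetime import date


def backup_type(day, full_weekday=6):
    """Return 'full' on the chosen weekday (Monday=0 ... Sunday=6), else 'differential'."""
    return "full" if day.weekday() == full_weekday else "differential"


def files_for_differential(mtimes, last_full_time):
    """Pick every file changed since the last FULL backup.

    mtimes maps a path to its modification time (any comparable number,
    for example seconds since the epoch). A differential backup always
    compares against the last full backup, not the previous night's run.
    """
    return sorted(path for path, mtime in mtimes.items() if mtime > last_full_time)
```

Because each differential contains everything changed since the last full backup, a restore needs only two sets: the latest full backup and the latest differential.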
I would recommend RAID1, 0&1, or 5, depending on economic vs performance considerations, and backup software.
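On the economic side, the main difference between these levels is how much usable space you get per drive bought. A rough sketch, assuming all drives are the same size and using the usual minimum drive counts (the function name and the 80 GB figure are just for illustration):

```python
def usable_capacity(level, drives, size_gb):
    """Usable space in GB for an array of equal-size drives."""
    if level == "1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_gb  # every drive holds the same copy
    if level == "0+1":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 0+1 needs an even number of drives, at least 4")
        return drives // 2 * size_gb  # half the drives mirror the other half
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_gb  # one drive's worth of space goes to parity
    raise ValueError("unknown RAID level: " + level)


print(usable_capacity("1", 2, 80))    # two 80 GB drives mirrored
print(usable_capacity("0+1", 4, 80))  # four drives, striped and mirrored
print(usable_capacity("5", 3, 80))    # three drives with distributed parity
```

With three drives, RAID 5 gives the same usable space that 0+1 needs four drives for, which is why it tends to win on cost.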
With mirrored drives you will never read from the second drive unless the first fails. This means there is no performance gain, but you will see a performance hit because data is being written twice.
I would just do the nightly backup and skip RAID 1. What you really need is a third drive; then you can step up to RAID 5 and get both performance and redundancy.
RAID 1 is supposed to be readable from both drives. Also, write speeds are not affected (under hardware RAID) because both disks are written to at the same time.
If you really need that sort of data protection, I would go with RAID 1 PLUS a daily backup. That way you have the protection against disk failure, and if you mess up you can restore from your other image.