How Can One Speed Up the Connection to a Hard Drive?

My “friend” runs a small business which went paperless a few years ago. All of his files take up 50 to 100 GB on his computer's hard drive. He runs a daily backup to a portable hard drive using software which identifies and backs up only the files that are new or have changed.

In addition to the daily backup, my “friend” runs a weekly backup in which all of the data is backed up in full to a large external hard drive. This process takes a few hours. The external hard drive has several ports – USB, SATA, USB 3, etc. The backup is normally done over the USB port.

In hopes of speeding up the process, my “friend” went to a local computer store and was sold a SATA card, which he installed in his computer. This sped up the process somewhat, but it still took a few hours.

Is there some way to speed things up a lot?

Not likely. eSATA is about as fast as it gets, and he’s probably limited by the write speed of the hard drive itself.

Yes, he could have less data.

SATA or eSATA is the fastest.

If you’re doing a full dump of one disk to another, you’re going to be constrained by the bottlenecks of:

  • read speed on the source device
  • write speed on the target device
  • throughput speed of the connection between them

Upgrading from USB to eSATA only solves problem #3. As jz78817 already said, more likely you’re constrained by #2 (since hard drives are built to read more quickly than write).

So if you really want to speed this up, try writing to multiple disks at the same time. For example, if your source data has ten sub-directories, A1 through A10, all roughly the same size, you could halve your backup time by copying A1 through A5 to Backup #1 and A6 through A10 to Backup #2. Three disks would go even faster, and so on.

Now depending on how fast the writes are on your target hard drives, you may need to put them on separate controllers (rather than on the same controller card, like what’s already free on your motherboard) to see a big gain. And at some point you’d hit constraint #1, where you are swamping the source disk with too much read activity to serve the multiple concurrent write activities to their capacity, or possibly your computer’s CPU if it’s underpowered or doing something computationally intensive at the same time. But just doubling up the writes would speed things up for sure.
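The split-across-disks idea can be sketched in a few lines of Python (a sketch only; the sub-directory paths and the assumption of two mounted backup drives are hypothetical):

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def parallel_backup(subdirs, targets):
    """Split the source sub-directories round-robin across the target
    drives and copy each group in its own thread, so the targets
    write concurrently instead of one after the other."""
    def copy_group(group, target):
        for src in group:
            shutil.copytree(src, Path(target) / Path(src).name,
                            dirs_exist_ok=True)

    # Round-robin split: targets[0] gets A1, A3, ...; targets[1] gets A2, A4, ...
    groups = [subdirs[i::len(targets)] for i in range(len(targets))]
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = [pool.submit(copy_group, g, t)
                   for g, t in zip(groups, targets)]
        for f in futures:
            f.result()  # re-raise any copy errors from the worker threads
```

As noted, this only pays off if the targets sit on separate controllers; two threads funnelling through one saturated controller gain nothing.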

On the other hand, for a small-business type of operation, I’d just auto-schedule this weekly full backup dump for something like 2 AM on Saturday night and let it do its thing however long it takes; who cares, as long as it’s done by the time I’d want to use the computer again.

For convenience, USB 3 is hard to beat. But, to echo robardin, is the time taken a real problem? Can he not set the backup going on (say) Thursday night so it will be done for Friday morning, then take the drive home on the Friday evening? He does have several drives which he rotates, doesn’t he?

Get a bunch of SSDs. Expensive, but they increase read speed tremendously.

Any reasonably modern hard drive should be able to write ~100 MB/s sequentially, unless the accesses are really small. That should give you a backup of 100 GB in well under half an hour – roughly 17 minutes at that rate.

However, the throughput drops to about 1 MB/s if you’re doing 4 KB accesses at random. I can’t imagine that any reasonable backup software is doing a bunch of 4 KB writes at random. Is he copying a bunch of really tiny files and not using software designed for backups?

Cite for speed estimates.
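As a quick back-of-the-envelope check on those figures (a sketch; it uses decimal units, 1 GB = 1000 MB):

```python
def transfer_minutes(size_gb, throughput_mb_s):
    """Rough transfer time in minutes for a given data size and
    sustained throughput (decimal units: 1 GB = 1000 MB)."""
    return size_gb * 1000 / throughput_mb_s / 60

# 100 GB at ~100 MB/s sequential: roughly 17 minutes.
sequential = transfer_minutes(100, 100)

# The same 100 GB at ~1 MB/s of random 4 KB writes: over a day.
random_io = transfer_minutes(100, 1)
```

So if the weekly dump is taking hours rather than minutes, the drive is almost certainly not writing sequentially at anything like full speed.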

Worth noting that while the physical connection is as fast as it gets, the filesystem might still be slowing him down.

For example, it’s a lot quicker to copy a few large files than thousands of tiny files. You might find archiving the files in some way (.zip, .tar, whatever) makes the process go much quicker.
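A minimal sketch of the archive-first idea in Python (the directory name here is hypothetical):

```python
import tarfile
from pathlib import Path

def archive_tree(source_dir, archive_path):
    """Pack a directory full of small files into a single tar file, so
    the backup copies one large sequential stream instead of paying
    per-file filesystem overhead thousands of times."""
    with tarfile.open(archive_path, "w") as tar:
        tar.add(source_dir, arcname=Path(source_dir).name)
```

You then copy the one resulting file to the backup drive; add compression (mode "w:gz") only if the CPU can keep up with the disk.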

Seconded

Any decent backup software can do scheduling; let it fire up a couple of hours after close, and if it takes a few hours, no big deal. Incrementals should run to a local drive in minutes unless he is dealing in huge files.

No it’s not. There’s a whole bunch of high-bandwidth transfer technologies that take over where eSATA stops: Thunderbolt, 10 Gigabit Ethernet, InfiniBand, Fibre Channel, etc.
I have a portable Thunderbolt RAID array for video editing that does 380 megabytes a second read and write; it’s four SATA disks in RAID 5 mode.

For the OP, the bottleneck will be the single drive in his computer that he’s copying from, so none of the above technologies really make any sense. What he might want to do instead of backing up his computer is buy a small RAID system and store everything on there. That way the data is fully protected against a single disk failure without doing constant backups. RAID is cheap nowadays: you can get a 4-disk USB 3.0/SATA enclosure for $400 or so and $100 each for 1 TB drives, so maybe $1000 gets you 3 TB of protected RAID storage space.

There’s a quirk with older hard drives where Windows will reduce the transfer speed if it detects transfer errors. It’s been a while since I’ve dealt with anything like that… but basically it happens with the older (parallel) ATA hard drives. A dodgy old hard drive can be bumped from the top transfer mode (UDMA 6, 133 MB/s) all the way down to the slowest transfer modes (UDMA 1 at 16 MB/s, or maybe PIO, which goes down to 3 MB/s). Here’s a 2001 vintage article from Microsoft on the topic.

I’m pretty sure this isn’t possible with modern SATA drives, so the OP can ignore this post if the source hard drive is less than ~6 years old.

I think this is the best way. RAID 1 (mirroring) turns the backup into real time instead of once a week, and it won’t affect performance. Ideally, get a hot-swappable enclosure and three drives, and do something like this:
Week 1: Drive 1 and 2
Week 2: Drive 1 and 3
Week 3: Drive 1 and 2
Keeping drive 1 in use and alternating between drives 2 and 3. The reason for this is that if drive 1 gets corrupted, so will the current backup drive in use, and you can just fall back to last week’s backup.

Thank you for the thoughts everyone, my “friend” will appreciate the proposed solutions.

The reason for the secondary backup is that my “friend” is worried about a Stuxnet-like virus which corrupts data for a long time without him even realizing it.

Thus, he is worried that his primary backup (which is done every day and takes only a minute or two) will result in good data being overwritten with corrupted data.

The upshot of this is that he prefers to be physically present while the secondary backup is running so that he can make sure his computer is not doing anything strange, especially to the data from previous secondary backups.

The way the backup is done is with a simple copy and paste from Windows.

Does that make sense?

The limitation of the speed is because the transfer is being done as a copy and paste. The operating system will simply copy the files one after another, reading the disk blocks from wherever they lie on the source drive and appending them one by one to the data blocks on the target. This is slow for a couple of reasons. One, the disk accesses on the source drive will often not be sequential, and may involve a lot of disk-head movement. Head movement (and, worse, waiting for the disk to rotate the next needed data block under the head) can be vastly slower than the disk is capable of under optimal circumstances. Two, the writes go through the ordinary file operations, which are often not set up for quickly creating lots of small files or for simple appending. Using a proper backup utility can make a huge speed difference.
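To illustrate one part of what a proper tool does, here is a sketch (an illustration, not any particular utility's implementation) of streaming a file in large sequential chunks; real backup tools also batch small files and order their reads to minimise head movement:

```python
import shutil

def copy_in_large_chunks(src_path, dst_path, chunk_mb=16):
    """Stream one file in big sequential chunks. A large buffer keeps
    both drives streaming instead of alternating between many small
    reads and small writes."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, length=chunk_mb * 1024 * 1024)
```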

You might think the thinking about data integrity is paranoid. It isn’t. It isn’t even in the ballpark of paranoid enough. Almost any business now hangs on the integrity of its data. Lose that and you may as well shut up shop now.

Slow, degrading damage to data is one of the most dangerous things that can happen. It isn’t just malware, although that happens. Faulty software, inattention, faulty hardware, maliciousness, carelessness: the list goes on.

No backup system should ever, ever copy the backup data over the top of the last backup. At minimum use two separate disks, preferably more, and rotate between them. Keep the older ones off site. If you can, keep the most recent one off site too, at a different site from the other backups. There's no use at all in keeping data and backups together: a fire will wipe out both. Or you can be like a colleague of mine, who had the lot stolen. Computer, backup disks, everything. Sure, the backups were in a cupboard. Didn’t help. Very, very bad.

You need to be very clear about the difference between backups and archives. You need both. Every now and again (say, every month) make a copy of everything and put it away forever. You can do much better than this, but it is a start.

I would seriously look at commercial backup software suitable for a small business. Sadly I can’t offer any advice for Windows, but I’m sure many others can. Some systems will create a pure clone disk as a perfect backup. These are useful because you can drop them straight back into a system and get on with work. (Although your first job will be to create a new backup.)

When I used to supervise systems guys, I used to tell them this: it doesn’t matter what else is going wrong (the email could be stuffed, the printers not working, people unable to log in, high-powered executives on the phone); if you cannot guarantee the integrity of the company’s data as your first priority, you are not doing your job.

Do not compromise the integrity of your data.

Data recovery can cost a lot of money. Many thousands is easy. And that assumes you have the disks you need to recover the data from. It takes time too. Time the business is not able to operate. No-one ever overestimates the world of pain that comes from a data loss.

:eek::eek::eek:

Windows has a built-in backup facility. It’s in Control Panel in Windows 7, and under Accessories, System Tools in Windows XP.

As has been said, this is a valid concern. However, the friend’s solution is not a good one. What you want is some kind of incremental backup system that keeps historical versions of files. In other words, it makes one big backup on the first day, and on each further day it saves only the changes made that day. You can then restore any version of a file, from the very oldest to the newest and anything in between.
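That idea can be sketched with rsync's --link-dest trick: dated snapshot directories where unchanged files are hard-linked rather than re-copied, so every snapshot looks like a full backup but only changed files consume new space. A minimal sketch (directory names and the mtime/size change test are simplifying assumptions):

```python
import os
import shutil
from pathlib import Path

def snapshot(source, backup_root, stamp, previous=None):
    """Create a dated snapshot of `source` under `backup_root/stamp`.
    Files unchanged since the `previous` snapshot (same mtime and
    size) are hard-linked instead of copied."""
    source, dest = Path(source), Path(backup_root) / stamp
    prev = Path(backup_root) / previous if previous else None
    for src in source.rglob("*"):
        rel = src.relative_to(source)
        target = dest / rel
        if src.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        old = prev / rel if prev else None
        if (old and old.exists()
                and old.stat().st_mtime == src.stat().st_mtime
                and old.stat().st_size == src.stat().st_size):
            os.link(old, target)       # unchanged: hard-link, no extra space
        else:
            shutil.copy2(src, target)  # new or modified: real copy
    return dest
```

Restoring any historical version is then just reading the file out of the snapshot directory for that date.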

I also use two backups. One works as I just described and is done off-site. The other is for Windows Restore on a local hard drive. This not only gives me a second way to recover data, but should get my computer back up and running faster than waiting for download/install of the online backup. Both backups run at night when they don’t bother anyone.

One of the major weaknesses your friend’s system has is that it sounds like everything is done on external hard drives. I would strongly recommend something online in addition to the external drive. It’s too easy to leave portable drives behind with your computer and then theft, flood, fire, etc. can destroy both the computer and the backup.

Using the Windows backup utility is like using a sledgehammer to drive a finishing nail. It’s woefully under-featured: no options other than “copy this portion of the hard drive to another drive.” You can’t select parts of the drive to ignore, it doesn’t do incrementals, it takes up way more space than necessary, it has to be run on a schedule rather than monitoring for changes, etc. It’s fine for the average Windows user, but if you want to do anything useful, it sucks.

False.

False.

False.

What does this mean?

True.

It’s not nearly as limited as you make it out to be.

If he is doing this rather than using a software solution, he may like to try something like

Allway Sync

It can be set to do dozens of backup tasks across multiple drives for the type of backups he is doing.

I am also a fan of doing things like having a rotating mirror set. Mount an externally accessible hard drive bay and make the drive part of a RAID-1.

Pop out a drive, insert a blank drive, and allow the RAID to rebuild. You are now holding a bootable bare-metal backup of your system. You can do this with as many drives as you can afford; it’s damn near bulletproof, and will get good backups of things like Exchange archives and SQL databases that many basic backup packages choke on.

The downside is, you have to be careful not to rebuild the RAID from the old drive onto the new. This is usually simple but can be fucked up easily if you are not paying attention.

Well, my “friend” does something like that. Every two weeks, all of the data is copied fresh and is never overwritten.

The portable hard drive is carried in his briefcase and goes home with him every day.