Need opinions/advice on backup software

I need to back up about 100TB of computer files, and the amount will grow.

This data is (mostly) historically important, original, video & audio files, but not critical, medical data – no lives will be lost if the data is. But I would cry a lot.

Ideally, it would be a backup system that would run in the background, and keep a second copy of everything I use/create on a daily basis. It should have minimal impact on my daily work, so it would be fine if files weren’t instantly grabbed for backup – a short (24 hour?) delay would work.

I could dedicate an old computer to this task, but it would have to be very low-key to have minimal impact on my network traffic, internet and intranet alike. I would like to throttle or control the backup speed.

We should have some way of handling/logging errors, so the system doesn’t grind to a halt if one file is unavailable. Restoring files doesn’t have to be possible over the same link – if I had a major crash, I could drive to the 2nd location and copy everything to something portable.
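To make the "log the error and keep going" requirement concrete, here is a minimal Python sketch of the idea. It is not a replacement for a real backup tool, and the function name `backup_tree` and its copied/failed counters are illustrative assumptions. The only point is that each file gets its own try/except, so one unavailable file is logged and skipped instead of stopping the run.

```python
import logging
import shutil
from pathlib import Path


def backup_tree(src: Path, dst: Path, log: logging.Logger) -> tuple[int, int]:
    """Copy every file under src into dst, logging failures instead of stopping.

    Returns (files_copied, files_failed).
    """
    copied = failed = 0
    for path in src.rglob("*"):
        if not path.is_file():
            continue
        target = dst / path.relative_to(src)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied += 1
        except OSError as exc:
            # One bad file must not halt the whole job: record it and move on.
            log.error("skipped %s: %s", path, exc)
            failed += 1
    return copied, failed
```

A caller would typically point the logger at a file and check the failed count at the end of each run.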

Backing up to the cloud is not a good option right now – too much data to be affordable. But backing up over the internet to a 2nd site that I have access to is ideal. That site would have to operate as standalone as possible, and again, not grind to a halt if there was a single error.

FYI: I am currently using multiple NAS servers (Drobo & Synology boxes), locally, for backup. Although the Drobos are advertised as good for backup, and a single drive failure is handled very well, so far I have had 2 Drobos fail at the primary power supply level with no warning, rendering 100% of the data unavailable even if the disks are OK, so something must be done.

I would welcome opinions on this topic, especially if you have experience with this kind of situation. What backup software might serve me best?

Assuming your OS is Windows 10, the simplest thing to do is to set up a mirrored drive. That will protect you if your primary drive fails, but if there is a fire it won’t help you much, since it’s not an offsite solution.

Here’s a procedure for setting up a mirrored drive in Windows 10. I don’t know if it will work for you, but it may be an option.

I use a free, open-source program called FreeFileSync. It can run continuously in the background, on a schedule, or on an on-demand basis. It can do two-way synchronization and mirroring.

I’ve used a number of backup programs, and this is definitely the best one I’ve come across.

Although I haven’t tried it, I doubt that native Windows functions like a mirrored drive will work. We aren’t talking about a couple of measly 2TB drives in a single computer, but ~100TB at a nearby room and/or across town in NAS boxes.

Making an initial data copy is no problem. I can just copy from one source to a destination. It’s the more sophisticated options I need, like delaying data updates, controlling the speed, including or excluding specific directories, drives, etc. Right now if I copy large blocks of files thru Windows at whatever speed is default, it slows down everything, even my mouse cursor, and the traffic is noticeable on the network. I want this to be as painless as possible.
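As a rough illustration of two of those options (delaying updates and excluding directories), here is a small Python sketch. The function `select_files`, its parameters, and the glob-style exclude patterns are all hypothetical, not taken from any of the tools named in this thread. It only shows how a backup job might decide which files to pick up on a given pass.

```python
import fnmatch
import time
from pathlib import Path
from typing import Iterable, Iterator, Optional


def select_files(
    root: Path,
    exclude: Iterable[str] = (),
    min_age_seconds: float = 24 * 3600,
    now: Optional[float] = None,
) -> Iterator[Path]:
    """Yield files under root that are not excluded and have been
    unmodified for at least min_age_seconds (24 hours by default)."""
    now = time.time() if now is None else now
    patterns = list(exclude)
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Match patterns against the path relative to root, e.g. "cache/*".
        rel = path.relative_to(root).as_posix()
        if any(fnmatch.fnmatch(rel, pat) for pat in patterns):
            continue
        if now - path.stat().st_mtime >= min_age_seconds:
            yield path
```

With `exclude=["cache/*"]`, anything under a top-level `cache` directory is skipped, and files touched within the last day wait for a later pass.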

I’ll check out FreeFileSync, thanks.

This sounds like something for Drobo. Though I don’t know anything about them really (I just back up 2TB online, so I haven’t got your level of needs), the reviews I’ve heard are positive, and it seems to fit your needs very closely.

I have used the free version of Resilio for massive synchronization; worked fine.

ETA: you can specify times of day and different speeds for each time block.
ETA2: more recent versions are trying to push some of these features to the “premium” (paid) version, unfortunately. You can also try Syncthing, though its GUI is less polished than Resilio’s.

I have 4 Drobos, (two 8-units and two 5’s) and they work well for local, RAID-type backup. Drobo used to have a good backup program, but they discontinued it some time ago.

Drobo also used to have a great message board, with house techs and users giving advice, but that was shut down, too. I have a sneaking suspicion that the company might not be around much longer, as their innovation and product introductions seem to have slowed down or ceased. For that reason, I would rather use more generic programs and not rely on a single company, which is why I am adding Synology boxes to my equipment mix.

I took a quick glance at FreeFileSync. It might provide some functions I need, but I see no provision for syncing over the internet. This might require a satellite computer monitoring a particular port at a static IP (which I have at the destination location), a task one step up from a local disk copy program. I hope to find something already written rather than doing it myself.

I also don’t see any throttle control at FreeFileSync. My work schedule is not fixed, 9 to 5, where I could schedule batch functions at an off time. I’m just as likely to be working at 3AM as 3PM. It would be much better if the backup operation worked continuously, but at a low, background level.
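For what "continuous, but at a low background level" could look like, here is a minimal Python sketch of a rate-limited copy. The name `throttled_copy` and its parameters are assumptions made for illustration, and real tools may throttle quite differently. It copies in chunks and sleeps whenever it is ahead of the target rate.

```python
import time
from typing import BinaryIO


def throttled_copy(
    src: BinaryIO,
    dst: BinaryIO,
    max_bytes_per_sec: int,
    chunk_size: int = 64 * 1024,
) -> int:
    """Copy src to dst, pausing as needed to stay at or below max_bytes_per_sec.

    Returns the number of bytes copied.
    """
    start = time.monotonic()
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
        # How long the transfer should have taken so far at the target rate.
        expected = total / max_bytes_per_sec
        elapsed = time.monotonic() - start
        if expected > elapsed:
            time.sleep(expected - elapsed)
    return total
```

Opening both files in binary mode and passing, say, `max_bytes_per_sec=2_000_000` would hold the copy to roughly 2 MB/s averaged over the run.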

Whatever software you use - if you’re not backing up to the cloud - make sure that the device that the backup is on can be moved physically offsite at regular intervals. e.g. back up daily to one of two devices, and once a week take it elsewhere (e.g. safety deposit box), and swap them once a week.

If your house burns down - you don’t want to lose your backup as well as the original.

That’s the idea for additional off-site backup, so I don’t have to physically lug around an 8-bay Drobo every few days.

I agree that it seems like Drobo isn’t going to be in business much longer. Last month, I was thinking about buying one, so I checked Amazon, which had none. Then I checked the store at the Drobo website and every single product was sold out.

What are your cost constraints that make cloud backups unaffordable? 100TB is a lot of data to back up over the internet. There’s going to be either some cost, or some risk of data loss from something you rigged yourself. So how much can you pay?

Dropbox has a data throttle as you suggested you need. The unlimited tier AFAIK is $22 per month. If that’s out of range, you could probably cobble something together from various Amazon solutions such as Glacier and Snowball.

Also… what’s the offsite location that’s currently available to you? A home or office across town? Do you feel comfortable setting up network sharing between the 2 premises, and is the network in both locations reliable and controlled by you?

I have an office 10 miles away that is secure. I could set up a computer & storage there. It has a static IP, no caps, and sufficient bandwidth, especially if used mostly off-hours. This should be cheaper than cloud storage if I can build it from spare equipment. Yes, I control both ends, and it’s as reliable as cable internet can be.

The 100TB estimate is the total right now. Once that legacy data is backed up, I will probably add 50GB per month, some of which might replace/update the older stuff.
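A back-of-the-envelope calculation helps here. The sketch below estimates transfer time from data size and link speed. The 100 Mbit/s upload figure is purely an assumption for illustration (no actual link speed is given in this thread), and `transfer_days` is a made-up helper name.

```python
def transfer_days(size_tb: float, link_mbit_per_sec: float, utilization: float = 1.0) -> float:
    """Days needed to move size_tb terabytes (decimal, 1 TB = 1e12 bytes)
    over a link of link_mbit_per_sec, using the given fraction of it."""
    size_bits = size_tb * 1e12 * 8
    seconds = size_bits / (link_mbit_per_sec * 1e6 * utilization)
    return seconds / 86400


# Assumed 100 Mbit/s uplink, fully used around the clock:
print(f"100TB initial copy: {transfer_days(100, 100):.1f} days")            # about 92.6 days
print(f"50GB monthly growth: {transfer_days(0.05, 100) * 24:.1f} hours")    # about 1.1 hours
```

Under that assumption the legacy 100TB would take about three months over the wire, while the monthly additions are trivial. That suggests doing the initial copy locally and carrying it to the second site, then syncing only changes over the internet.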

I thought the current supply chain interruption might be the cause of the lack of sale inventory, but here’s someone else who agrees that Drobo’s days are numbered:

100 TB of Amazon Deep Glacier is more like $100/month right now

Yeah I missed a decimal place on that. Actually it looks like $400/month for 100TB now that I looked at the pricing page. Not great.

I like ZFS and sanoid for large amounts of data, but that requires both ends to be running something that can use ZFS.

I’ve had excellent success with the free and open source UrBackup. It’s a client-server setup. The server side can run on just about anything from a Raspberry Pi to a Windows server. The server needs a static IP (DDNS would work), but I see you have that at the office.

The client initiates the connection, so the client can be anywhere. The client works great on Windows, less well on Linux, and even less well on Macs. If all you have to back up are Windows computers, then it is an excellent choice.

It does two types of backups: whole disk images and file based. That allows for bare metal restores, or just pulling out an accidentally deleted file. For your case, you could do something like image and file backups of your C: drive, and then full file backups of the rest of your data (unless you have 100TB all on C:!).

It’s configurable as to time of day for doing the image and full-file backups, and you can set interval frequency for incremental file backups. You can also set bandwidth limits, but that could also be something that QoS on your router handles. If you can get QoS working, then you should be able to back up at full speed, and have the backup slow down automatically to let other traffic flow.

The other advantage is that you can stick the client on your laptop, your partner’s computers, etc., and centralize the backup of all of them.

One thing to be aware of is Windows complains very loudly when you try to install the client. The server will generate a pre-configured install for each client, so the software is not signed, and will have full access to your files. Once installed, it runs without complaint.

If you are a real DIY hacker, and have enough CPUs, disks, electricity, network, possibly multiple sites for redundancy then you could cluster a bunch of machines and maintain your own mini-S3 compatible storage, to be used with whatever backup software you prefer. Then there is no single server as a point of failure; you can set up your cluster so that each piece of data is replicated 3 times, for example.

To simply sync several Windows machines, Resilio/Syncthing will adequately do the job, IME.

An advantage of UrBackup over syncing is that it is an actual backup, not just a replication of what exists at the moment. It will let you keep historical versions of files going back as far as you have the disk space.

In my experience, when I need to restore files for people it is the version that existed months ago. I’m not recovering from a catastrophic failure, but rather from an accidental deletion or change that happened sometime in the last few months.

Syncing is great when your goal is to work on something at home, and have all of those changes replicated on your work computer. Syncing is very bad when your work computer gets infected with ransomware, and the encrypted files are synced to your home computer…

RAID ≠ backup and sync ≠ backup, but both are extremely useful tools to solve other problems.
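To illustrate the difference between syncing and keeping history, here is a toy Python sketch. It is not how UrBackup works internally; `file_digest`, `store_version`, and the directory layout are hypothetical. Each distinct version of a file is stored under a name derived from its content, so a later change (or a ransomware-encrypted copy) is added alongside the earlier versions instead of replacing them.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file's contents, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            h.update(block)
    return h.hexdigest()


def store_version(src: Path, store: Path) -> Path:
    """Save src into store/<file name>/<content hash>.

    An unchanged file is stored once; a changed file gets a new entry,
    and earlier versions are never overwritten.
    """
    target = store / src.name / file_digest(src)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)
    return target
```

A plain sync would overwrite the single copy at the destination. Here the old version stays retrievable for as long as the disk space lasts.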

What about tape backups? Until a few years ago, we used LTO tapes to back up the servers at my worksite, and once a week I was charged with swapping tapes and leaving the old ones in a locked box for Iron Mountain to pick up for offsite storage. Now we use some sort of NAS with, I think, some sort of backup among sites (though I have no idea how it works).

And according to Wikipedia, LTO-9 tapes can hold 18TB uncompressed or 45TB compressed. So it wouldn’t take a huge number of tapes to back up the OP’s 100TB. Would that work?
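The arithmetic behind "not a huge number of tapes" is simple enough to sketch, using the capacities quoted above. The helper name `tapes_needed` is made up for illustration.

```python
import math


def tapes_needed(data_tb: float, tape_capacity_tb: float) -> int:
    """Whole tapes required to hold data_tb at tape_capacity_tb per tape."""
    return math.ceil(data_tb / tape_capacity_tb)


print(tapes_needed(100, 18))  # 6 tapes at the uncompressed capacity
print(tapes_needed(100, 45))  # 3 tapes at the quoted compressed capacity
```

Since video and audio files are usually already compressed, the uncompressed figure is probably the safer one to plan with here.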

I just signed up for Dropbox’s unlimited storage plan, free to start, $240/yr to continue. When you consider how much work DB will do to maintain the data and access, that might be cheaper than rolling my own. However…

In my first DB test, backing up 50GB over the cloud, this process is taking 90% of my internet bandwidth, with 10 hours yet to go. I haven’t found any throttling setting yet – if you know where that is, please let me know.

In my younger days, I would jump at the chance to hook up a Raspberry Pi and cobble together a system; I’m more reluctant to do that now if something off the shelf is available. We live, learn, and grow old.

I will admit that using a Dropbox account might solve a future problem, and is attractive for that reason. I don’t plan to live forever, and want to pass along my data files to the local historian crowd. I have at least two people who I trust who could take over from me, and DB could be set up with those two having access to all the data right now. That would be an easy way to solve a looming problem.

OTOH, right now I am copying the same 50GB test data to a Synology in the next room, and it will take about 10% of the time the DB process will take. I’m thinking a manual backup process might be the best compromise right now, as long as I make sure to carry one backup box to the alternate location periodically, and swap it with the previous one I left there. Seems a shame to bypass automation, but maybe it’s just not ready for what I want yet.

RE: tape backup

I used tape backup for 15 years in the 1980’s. At the time, it was cheaper per byte than other media. Less so now; all tape units are proprietary, not a good plan for longevity. I have optical drives readily available; disk media, including SSDs, are, at present, readily interfaceable. But I have a box of tapes in my basement that no one can read; file access at best is difficult and slow. I’m pretty soured on tapes right now.