A few more years and it will really be able to drive!
Is it just me, or did desktop hard drives really start to get more reliable about a decade ago? I don’t hear people talking about them failing and taking everything with them as often as I used to.
This old geezer is a Western Digital Caviar Green 1TB SATA. I use it as my daily backup since it is the oldest.
I also have another WD Black that is 6 years old and two Intel solid-state drives (8 and 5 years old) that are still in use.
I still believe that leaving the computer on 24/7 is best for the hardware. I haven’t had a failure in any component (except a case fan) in 10 years.
This is for desktops, at least; I don’t trust laptops. They smash so much equipment into so small a space. Compromises are made. And then it gets banged around.
Even accounting for that, most electronics have always followed a sort of pattern where they either fail early, often the first time they’re ever powered up or right around then, or they last practically forever.
Hard drives have always had a more complicated relationship with reliability, in that there are actual moving parts and some non-trivial amount of vibration as a result. That doesn’t do the electronics any favors.
And anecdotally, the 24/7 idea is backed up by three things I’ve experienced:
One, I used to work for a company whose business was to manufacture burn-in and test equipment for semiconductor manufacturers. At one point in the late 1990s, ALL Intel Pentium chips being produced were run through equipment we manufactured as a final QA step prior to shipping. As a kid right out of college, I asked the engineers what the story was on leaving computers on vs. turning them off: they were unequivocal and unanimous in their answers that leaving them on is the right thing to do. In their view, it’s the thermal and electrical cycling/shock that does components in, not running at a relatively steady temperature and current.
Two - I’ve had a few jobs (years ago) where the companies I worked for ran their own server rooms / data centers. We almost never lost any hardware in normal operation, but almost every time we powered equipment off, we lost at least one hard drive or some other server component.
Three - I’ve had experiences like yours: if I keep the thing on 24/7, things rarely fail, other than the occasional PSU fan bearing or CPU fan bearing.
And yeah, laptops have been dramatically less reliable than PCs in my experience.
I have a computer from 1989 in my basement that I keep as a curiosity. It has an 80MB hard drive and still works.
Regarding reliability, in those days the hard drives were stamped with a list of defective sectors. It was just understood that your brand new OEM drive might ship with bad sectors.
Isn’t this still done? They try to make things to a certain yield, but know that some of it will fail. They block off the sections that are bad so they’re not used and sell it as a lower capacity. Or is this just with multi-core processors?
When I buy a hard drive it doesn’t include a defect report. The one from 1989 has a defect report sticker on the front of the drive. That doesn’t mean there are no defects, but I don’t know what the current practice is.
All hard drives, then and now, have bad blocks/sectors. You don’t know about them because there’s a certain number (hundreds or thousands in modern drives) of spare sectors allocated for replacement. Even floppy disks have this allocation, though of course much smaller. Windows CHKDSK and SCANDISK, which were all most users had in the '80s and '90s, were overly aggressive about marking blocks/sectors as bad so that space wouldn’t be used. This is why we were more aware of bad sectors back then.
All modern hard drives and SSDs have S.M.A.R.T., which continually monitors the status of the drive. Programs like CrystalDiskInfo report S.M.A.R.T. readings and will alert you if all the spare sectors have been used and new bad sectors develop. If even one bad sector is reported, that means the spare pool, hundreds or thousands of sectors (the manufacturers don’t disclose how many), has already been consumed by bad sectors you were never told about. Bottom line: once even one unreallocated bad sector is reported by S.M.A.R.T., your drive is no longer within spec and should be replaced immediately. Once you have even one bad sector, even if the count doesn’t increase, it means that if/when more develop, you don’t have any more spares and you may be writing to or reading from a bad sector that just hasn’t been flagged.
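If anyone wants to do that same check without a GUI, here’s a minimal sketch of what CrystalDiskInfo’s “Caution” logic roughly amounts to, using smartmontools’ smartctl from Python. It assumes smartctl is installed and the drive is /dev/sda, and the “anything above zero is a warning” rule is just my own rule of thumb from above, so treat it as illustrative rather than a finished tool:

```python
# Minimal sketch: flag a drive whose S.M.A.R.T. counters show any reallocated
# or pending sectors -- roughly the condition CrystalDiskInfo marks "Caution".
# Assumes smartmontools is installed and the drive is /dev/sda; adjust to taste.
import subprocess

WATCH = {5: "Reallocated_Sector_Ct",
         197: "Current_Pending_Sector",
         198: "Offline_Uncorrectable"}

def smart_raw_values(device="/dev/sda"):
    """Return {attribute_id: raw_value} parsed from `smartctl -A` output."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False).stdout
    values = {}
    for line in out.splitlines():
        parts = line.split()
        # Attribute rows look like:
        # ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(parts) >= 10 and parts[0].isdigit() and int(parts[0]) in WATCH:
            values[int(parts[0])] = int(parts[9])  # raw value is a plain count for these IDs
    return values

if __name__ == "__main__":
    for attr_id, raw in smart_raw_values().items():
        status = "CAUTION - think about replacing it" if raw > 0 else "OK"
        print(f"{WATCH[attr_id]} (ID {attr_id}): raw={raw} -> {status}")
```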
This is related to the speculation that hard drives with too many bad sectors are sold as smaller-capacity drives. So far it’s strictly speculation and unlikely to be true, at least as the reason for being sold as lower capacity. Bad sectors may be concentrated in certain areas of the drive platter(s), but they can be anywhere on any platter. The drive firmware can be tweaked to skip those areas, but any unnecessary head movement (which is why you have to defrag hard drives) is a performance hit.
There is strong evidence that some (most?) of the drives in externals (which are cheaper than buying a bare drive) are binned. That is, they didn’t pass the full testing to be sold as a full-price internal. They’re still fully functional, just not necessarily up to full spec.
Last month I retired a Toshiba laptop on its 10th birthday. Admittedly it spent the last 8-ish years as a de facto desktop spinning (ref @Hermitian) 24/7/365.
Its 5400 RPM ~350GB HD was doing just fine at least as far as Windows’ error logging was concerned. Ref @Didi44, I hadn’t looked into the SMART logs.
But dayum was that thing slow!! Both laptop & HD were mighty pokey for a 2020 software stack & workload.
Bottom line: Yeah, reliability of 2010-era HDs is (was?) way up vs. predecessor devices. Whether by dint of more spare sectors or simply better fabrication I have no clue.
We won’t know until 2030 whether the same can still be said about 2020 hardware, or whether the commodification led to cheapification & early disposability.
For every report of X hard drive lasting well beyond the warranty, there are Y stories of drives dying prematurely. There are too many variables, particularly heat and vibration, to make any solid connections.
Comparing the past to the present is apples to oranges. In the past people had one or two hard drives. Now, people use a NAS, with or without RAID, with multiple drives, and if you follow backup procedures, you have 1 or 2 backup drives for every main drive.
With the multiple billions of drives in use at this very moment, reports of truly bad drives would pop up in the millions, as with the infamous IBM Deskstars and the Seagate Barracuda. Conversely, reports of long-lasting drives are much rarer because they’re typically replaced by larger, cheaper drives.
I have some IDE drives with files that are over 15 years old sitting in a box. If I powered them up, I could say OMG! These drives have lasted 15 years! Hmm…thinking about it, if I can find my IDE to USB adapter, I should fire them up just to see what’s on them. My own little time capsule!
Bottom line: while the reports of “OMG! my drive is still going” are interesting, they’re purely anecdotal.
As for stories of drives dying without notice, if you use software like CrystalDiskInfo to monitor S.M.A.R.T., it’s likely you’d be warned of conditions leading to failure. If you see a yellow caution indicator, plan to replace the drive immediately. It may last 10 seconds, 10 months or 10 more years, but that’s true even if you have all green.
If you want failure stories, I’ve had well over a dozen drives die in the last decade, the most recent being an 8TB drive that had a handful of bad sectors. I was using the drive for non-critical purposes and it may have died due to a faulty power supply unit, but it died suddenly.
That sounds like a big number of failed drives, but it’s inconsequential considering the dozens of drives I’ve gone through over that same period.
Sorry for the post flood, but this is a topic that I’m very interested in.
If you’re interested in large-scale numbers (though still minuscule compared to someone like Google or YouTube, which don’t publish stats), cloud provider Backblaze publishes quarterly reports on failure rates for its currently ~140,000 drives. https://www.backblaze.com/blog/backblaze-hard-drive-stats-q2-2020/
Wow…super low failure rates! What does it mean for home users and overall drive reliability? Absolutely nothing! The drives they use are primarily enterprise models, which are designed for 24/7 use in a high-heat, high-vibration environment. And since they typically buy drives in lots of hundreds or thousands at a time, if there is a bad run of drives in their bulk purchases, it’s more likely they’ll hit a bunch of them all at once.
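As a side note on how those numbers get calculated: Backblaze reports an annualized failure rate based on drive-days rather than a simple percentage of drives that died, so even one quarter’s data gets scaled to a per-year figure. A quick back-of-envelope version (the numbers below are made up, not from their report) looks like this:

```python
# Illustrative annualized failure rate (AFR) math, in the style Backblaze
# describes: AFR = failures / (drive_days / 365). All numbers are made up.
drives = 10_000          # hypothetical drives of one model in service
days_in_service = 90     # roughly one quarter
failures = 25            # hypothetical failures during that quarter

drive_days = drives * days_in_service          # 900,000 drive-days
afr = failures / (drive_days / 365) * 100      # about 1.0% per year
print(f"AFR is about {afr:.2f}% per year")
```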
In 2016, they bought 48 external 8TB Seagate Archive drives designed for home use. They did it as a test because the drives were significantly cheaper than enterprise drives. The drives quickly began to fail due to heat and vibration, and they were all pulled from use.
I have/had about 8-10 of these same drives from the same period and about half of them have died. Does that mean they’re all bad? No. Just that they outlived their 2-year warranty.
Well, I would say there is a big difference between booting up ones that have been sitting in a box for years vs ones that have been running and in use.
Absolutely, that is why it is in MPSIMS. It just seems like 15 years ago I used to hear about someone’s HD dying and taking their thesis or pictures with them on a yearly basis. I haven’t heard that in years. Maybe everyone got better at backing things up (cloud, etc). Maybe HDs are more reliable. Maybe my data set is just way too small.
In four years half of them died?! That is some pretty bad luck for you or bad drives overall.
“Absolutely, that is why it is in MPSIMS. It just seems like 15 years ago I used to hear about someone’s HD dying and taking their thesis or pictures with them on a yearly basis. I haven’t heard that in years. Maybe everyone got better at backing things up (cloud, etc). Maybe HDs are more reliable. Maybe my data set is just way too small.”
A bit of all three. It’s cheaper to back up to another drive or the cloud for small amounts of data, say <5TB. Drives do have better build quality than before, but there’s a tradeoff because they also hold more data. And yes, there’s confirmation bias. Too small a sample (see my response to half of my drives dying), and people don’t talk about losing data as much because backup solutions are so easy and cheap. I’m a regular at reddit.com/r/datahoarder and questions and complaints about losing data are quickly answered with “Should have backed it up!”.
“In four years half of them died?! That is some pretty bad luck for you or bad drives overall.”
Too small a sample to make any conclusion. I’ve lost count of how many drives have died in the past decade, much less past 35 years. I replace my drives ASAP after the warranty ends and move the ones out of warranty to backup or spare, so for me the drives served their time.
In the next ~3-5 years, I’ll probably replace all my 8TB drives (the smallest I have in active use) with larger ones whether they’ve died or not.
To be clear, your MPSIMS post is interesting and I don’t mean to threadcrap, and I hope it’s not taken that way. Just stating the facts as I know them and giving my opinions.