The big server farms serving The Cloud – what do they use for primary data storage?
Do they merely install thousands of 3.5" hard drives, the kind that work in desktop PCs? Or is there a larger form factor made just for server farms?
Larger drives would have a longer random-access time, so they wouldn't be good for servers. Off the top of my head, the Quantum Bigfoot drives were the last of the larger form-factor drives.
Until recently, disk arrays used SCSI drives. Now they are moving to solid state (SSD). Network Attached Storage (NAS) boxes use 3.5" drives (SCSI) and 2.5" drives (SSD). They are RAID-protected for both data redundancy and improved transfer speed.
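Back-of-the-envelope, the redundancy trade-off is easy to see in a few lines. Here's a rough Python sketch; the drive counts and sizes are made-up examples, not anyone's real array, and it ignores formatting overhead and hot spares:

```python
# Rough sketch of usable capacity vs. redundancy for common RAID levels.
# Drive counts and sizes here are illustrative, not a real configuration.

def raid_usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Approximate usable capacity in TB."""
    if level == "RAID0":   # striping only: full capacity, no redundancy
        return drives * drive_tb
    if level == "RAID1":   # mirroring: half the raw capacity
        return drives * drive_tb / 2
    if level == "RAID5":   # one drive's worth of parity
        return (drives - 1) * drive_tb
    if level == "RAID6":   # two drives' worth of parity
        return (drives - 2) * drive_tb
    raise ValueError(f"unknown RAID level: {level}")

for level in ("RAID0", "RAID1", "RAID5", "RAID6"):
    print(level, raid_usable_tb(level, drives=12, drive_tb=8.0), "TB usable")
```

The striping in RAID 0/5/6 is also where the transfer-speed improvement comes from: reads and writes are spread across all the spindles at once.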
They usually use 2.5" drives because you can fit more in the same physical space. Physical space is one of the main restrictions for a data center.
But 3.5" drives come in up to 20 TB, so that is pretty fucking huge anyway.
Sure is, but that’s pretty new tech. I imagine server farms don’t replace their entire stock very often – maybe gradually, or every few years? So a lot of working stock must be a few years old.
So, to answer my OP, server farms use standard (perhaps high MTBF) disk drives, the same kind I might put in my PC or business server?
Question about 2.5" vs 3.5" drives…given all factors (heat, cubic size, access rate, data capacity) are the smaller drives more efficient for server farms? I realize 2.5" drives and SSD drives make more sense for laptops, where portability is the primary factor, but server farms are less concerned with that than cost/byte and power costs.
SSDs are significantly faster, use less power, and produce less heat (though the heat difference is fairly minor compared to traditional drives). They are getting a lot more common. Very few PCs or laptops ever had SCSI drives, which are much faster than traditional hard drives. SSD is a game changer, though: the prices keep dropping, and eventually standard rotating drives will be gone.
For some real numbers, look at this page. It’s a report from Backblaze, a company that offers cloud backup services, that lists the exact models of drives in use and how well they perform.
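If you want to boil tables like that down to one number yourself, the annualized failure rate such reports quote works out roughly like this. A quick sketch; the sample figures below are invented for illustration, not taken from the report:

```python
# Sketch: annualized failure rate (AFR) as drive-stats reports usually
# compute it: failures per cumulative drive-day, scaled to a year.
# The sample numbers below are invented for illustration.

def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage."""
    return failures / drive_days * 365 * 100

# e.g. 60 failures over 1,200,000 cumulative drive-days is roughly 1.8% AFR
print(f"{annualized_failure_rate(60, 1_200_000):.2f}% AFR")
```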
BTW: there are already some huge and very expensive SSDs of 50, 60, and even 100 TB. These are usually 3.5" or even 5.25" form factors.
The last one I saw was just standard drives in standard rack mounts… racks and racks of them. In a very chilly room.
SCSI drives were significantly faster when used in a multitasking, multi-user environment. In a single user environment, like a typical laptop or desktop PC, you didn’t really see a performance advantage with SCSI, making it difficult to justify SCSI’s higher costs.
This made for a natural divide, with SCSI drives being used for servers and ATA (and later SATA) drives being used for laptops and desktops. High-performance server drives were therefore almost always SCSI, while low-cost budget drives were almost always ATA/SATA. This gave a lot of people the impression that SCSI drives just always outperformed ATA/SATA drives. But if you paid about the same amount of money, you could get roughly the same single-user performance in either SCSI or SATA.
SSD uses significantly less power and generally fits into a smaller form factor for the same amount of storage, but currently the cost per TB for SSD is much greater than the cost per TB for a traditional rotating drive. If things keep going the way they are then SSD does have the potential to replace traditional HDDs in server farms, but they aren’t where they need to be cost-wise for that to happen yet.
Reliability and longevity are also about roughly equal for both types of drives, so there’s no big incentive there either.
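To put the cost-per-TB point above in concrete terms, the comparison buyers actually run is purchase price plus power over the drive's service life, divided by capacity. A toy version in Python; every price and wattage below is a placeholder assumption, not current market data:

```python
# Toy lifetime cost-per-TB comparison: purchase price plus electricity.
# All numbers are placeholder assumptions, not real pricing.

def lifetime_cost_per_tb(price_usd, capacity_tb, watts, years=5, usd_per_kwh=0.12):
    energy_kwh = watts * 24 * 365 * years / 1000
    return (price_usd + energy_kwh * usd_per_kwh) / capacity_tb

hdd = lifetime_cost_per_tb(price_usd=300, capacity_tb=16, watts=7)
ssd = lifetime_cost_per_tb(price_usd=900, capacity_tb=8, watts=3)
print(f"HDD ~ ${hdd:.2f}/TB, SSD ~ ${ssd:.2f}/TB over 5 years")
```

With assumptions like these the HDD still wins comfortably on cost per TB even after the SSD's power savings, which is the point being made.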
The advantages of 2.5" drives are they use less power, you can spin them faster so performance is better per drive, and you can fit more of them in a rack so the array performance is better (more drives = more sequential reads or writes = more performance.)
The primary advantage of 3.5" drives is that they are cheap per GB, but offer way better performance than tape.
It's interesting to hear that. IMHO, SSDs' long-term reliability is mostly a matter of cell wear-out, while HDDs' is about mechanical bearing wear and surface integrity. The former is pretty new tech; the latter is quite mature.
On a related subject…Anyone care to contrast SAS drives with SATA? I recently ordered, by mistake, some SAS drives when I wanted SATA. The different connector gave me a clue, but the drives looked identical and were the same price. Why use SAS over SATA? Which do server farms use (until SSD gets cheaper, of course)?
20 TB is new, but 14 TB came out in 2017.
Never had a SCSI HD, but in the olden days, if you wanted a CD burner it was SCSI or nothing. (I also had a SCSI flatbed scanner.)
I've held SAS SSDs in my own hands, so there is more to SCSI vs ATA than just Winchester drive vs flash drive. (What, though? Supposedly SCSI drives still, at least theoretically, offer higher speed and more features.)
Datacenters typically use SAS because all of the hardware evolved from SCSI systems, which were the standard bus protocol back in the day. The other bits in the arrays, like controllers, are designed around SCSI commands, so it was easier to stick with SAS.
SATA had the speed to compete with SAS, but no one bothered to go to the extra expense of switching to it since there wasn't really a benefit. SAS now has a speed advantage at 12 Gbit/s, but few systems target that speed; people who want that kind of performance aren't on SAS any longer.
The real competitor to SAS is Fibre Channel. FC is faster than SAS and also SCSI compatible. SAS endures though since it’s cheaper and good enough.
I heard a comment recently (around Christmas) from someone who works with a big server storage farm (connected to a supercomputer center) that at this point he preferred the rotating disk drives: that technology is more mature, and the monitoring software gives warnings before a rotating disk fails, so they can replace it before complete failure. The SSDs, by contrast, mostly just failed without warning, and that caused more difficulty (despite drive mirroring and similar protective measures).
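For the curious, the "warnings before failure" part usually comes from SMART data. A minimal sketch of polling it with smartmontools, assuming smartctl is installed, you have the needed privileges, and /dev/sda is the drive of interest:

```python
# Minimal sketch: ask smartmontools for a drive's overall SMART health.
# Assumes smartctl is installed, run with sufficient privileges, and that
# /dev/sda is the device of interest.
import subprocess

result = subprocess.run(
    ["smartctl", "-H", "/dev/sda"],   # -H prints the overall health assessment
    capture_output=True, text=True,
)
print(result.stdout)
# A deteriorating rotating drive often shows attributes like reallocated
# sectors creeping up well before it actually dies, which is the early
# warning being described above.
```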
An interesting perspective that I hadn’t considered before.
Unlike Backblaze, Amazon does not often talk about their storage systems. AFAIK, they may still be using tape for cold storage, as they were five years ago.
Just to clarify, it's not SCSI vs SSD. The three main protocols today are SAS, SATA, and NVMe. Hard drives, whether 2.5" or 3.5", use one of the first two; SSDs can use any of them. SAS and SATA use the same physical connector, but SATA-only controllers can only use SATA drives, while SAS controllers can use both. SSDs also come in the M.2 form factor.
The big advantage of SAS is that a drive can connect to two interfaces at once via a single connector. This allows for redundant controllers, giving you highly available access to the disk if a controller or cable fails, which is a big plus in a data centre.
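If you're ever unsure which transport a drive in front of you is actually using (per the SAS-ordered-by-mistake story above), lsblk will tell you on a Linux box. A quick sketch, assuming util-linux's lsblk is available:

```python
# Quick sketch: list whole disks and their transport (sata, sas, nvme, usb...)
# using lsblk from util-linux. Assumes a Linux host.
import json
import subprocess

out = subprocess.run(
    ["lsblk", "-d", "-J", "-o", "NAME,TRAN,SIZE,MODEL"],  # -d: disks only, -J: JSON output
    capture_output=True, text=True, check=True,
)
for dev in json.loads(out.stdout)["blockdevices"]:
    print(dev["name"], dev.get("tran"), dev["size"], dev.get("model"))
```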