So a couple of years ago I set up a 6TB RAID 1 system which is getting close to full; it's formatted in NTFS.
A month ago I bought a 10TB external HD and I've been copying everything across; this one is formatted in exFAT.
I've noticed that the 10TB drive is almost full and I haven't even finished copying everything from the 6TB drive. Should this be the case? If so, should I just reformat the 10TB drive as NTFS?
Is there any particular advantage to exFAT over NTFS anyway?
Do you have loads of really small files? What cluster size did you use for the exFAT system? Sometimes they are formatted with a cluster size of 256 kB. The 60% slack you are describing does not sound reasonable, though.
ETA: I don't know why exFAT would have any particular advantage in this case over just using XFS, NTFS, ZFS, ext4, etc.
This page says the default cluster size for a 32GB-256TB exFAT partition is 128 KB. In order for that to explain >4TB of wasted space, you'd have to have something like 60 million files (figuring each file wastes roughly half a cluster, ~64 KB, on average).
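Back-of-the-envelope, assuming the usual rule of thumb of about half a cluster of slack per file:

```python
# Rough estimate: how many files does it take to waste >4 TB of slack
# if each file wastes, on average, about half of a 128 KB cluster?
cluster = 128 * 1024              # bytes
avg_slack_per_file = cluster / 2  # ~64 KB
wasted = 4 * 1024**4              # 4 TB of slack to explain
print(wasted / avg_slack_per_file / 1e6)  # ~67 million files
```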
Yes, I have tons of small files. I’m not a hoarder in real life but I am one on the internet, I’ve saved tons of stuff ‘which I’ll read later’ pretty much from first starting to browse the web.
I've checked and the cluster size is set to 1024 KB. Good? Bad? Indifferent?
And yes, to be honest I used exFAT on this drive instead of NTFS pretty much just to try a different format; I figured that following the inevitable technological apocalypse I'd be able to use at least one of the drives.
1024 KB is pretty big, but that might be dictated by the size of the disk. If you have lots of small files there'll be a ton of wasted space; the cluster is the smallest unit of space the file system will allocate. So a 5 KB file will take up an entire 1024 KB cluster.
Each file is stored in one or more whole clusters. If the cluster size is 1024 KB and the file is 1025 KB, it takes up 2 clusters. So the larger the cluster size, the more wasted space.
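If it helps, here's a minimal sketch of that arithmetic in Python; the cluster and file sizes are just examples:

```python
import math

def allocated_bytes(file_size, cluster_size):
    """Space a file actually occupies: whole clusters, rounded up."""
    return math.ceil(file_size / cluster_size) * cluster_size

cluster = 1024 * 1024                 # 1024 KB cluster, as on the OP's drive
for size_kb in (5, 1024, 1025):
    size = size_kb * 1024
    alloc = allocated_bytes(size, cluster)
    print(f"{size_kb:>5} KB file -> {alloc // 1024} KB allocated, "
          f"{(alloc - size) // 1024} KB wasted")
```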
Isn’t that also true of NTFS?
So the issue would also exist on his old drive. Given lots of small files, both drives would be inefficient, though to different degrees depending on any difference in cluster size.
Doesn’t NTFS use a fixed cluster size?
In fact, don’t all file systems?
Nah; there are filesystems that don't use blocks at all, for example. But even among file systems designed for use on block devices, there are various tricks to reduce wasted space due to internal fragmentation, like variable block sizes, block suballocation, block compression, inode inline data (ext4 supports this, and NTFS does something similar with small files resident in the MFT), and similar.
If it’s really going to save multiple TB, maybe it’s worth switching to one of these filesystems with advanced features (ZFS even has its own software RAID built in). I gather the OP really has tens of millions of files? But even just NTFS’s smaller cluster size and use of the MFT may account for the bulk of the difference.
exFAT is also proprietary, AFAIK. At the same time, NTFS can also be read and written on Linux, Mac, etc. One interpretation is that exFAT is just meant to be a relatively simple file system that works on high-capacity SD cards.
The thing is, devices such as flash memory and hard disk drives in use today do have physical blocks or sectors that define a minimum record size that can be read or written. This makes some sort of sense: you can't write, say, a single bit or byte onto the end of a magnetic tape; it will be wrapped up in some sort of error-correcting code and other low-level data. So, in any case, typical filesystems have to deal with 512-byte or 4 KB sectors or whatever, depending on the medium and format. Even the CKD format mentioned before is these days virtualized on top of the physical layout.
What you do have are the tricks mentioned before: if the sector or block size is 512 bytes and you create 1000 files of 60 bytes each, then instead of allocating 512,000 bytes of data blocks plus metadata, a filesystem with inline data or block suballocation would probably save nearly all of that 512 kB. So it seems the OP should NOT use exFAT, and definitely not with 128 kB clusters.
I once designed a filesystem for a NOR flash device. NOR flash has the interesting property that you can overwrite one-bits with zero-bits, at any address granularity. You can overwrite a single bit in the middle of a word anywhere in the flash, fairly freely (I think there was some maximum number of times you could overwrite a bit without erasing, but it was fairly high). NAND flash doesn't work that way – you generally have to program a whole page and erase a whole block at a time. I took advantage of this feature of NOR flash in a number of ways in the filesystem design, one of which was that new data is simply appended to a sort of log of writes. So if you write 57 bytes to a file, a record is allocated containing those 57 bytes plus a header and is written to the end of a table of such records in flash (overwriting a bunch of 1s). This design doesn't really use any notion of allocation blocks.
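Very roughly, something like this; the header layout and names below are made up for illustration, not the actual design:

```python
import struct

class NorLog:
    """Toy model of an append-only record log on NOR flash.
    Header layout and names here are invented for illustration."""

    def __init__(self, size):
        self.flash = bytearray(b"\xff" * size)  # erased NOR flash is all ones
        self.tail = 0                           # next free byte in the log

    def program(self, offset, data):
        # Emulate NOR programming: bits can only go from 1 to 0 (AND clears)
        for i, b in enumerate(data):
            self.flash[offset + i] &= b

    def append(self, file_id, payload):
        # A record is a small header (file id + length) followed by the data
        header = struct.pack("<HH", file_id, len(payload))
        record = header + payload
        self.program(self.tail, record)
        self.tail += len(record)

log = NorLog(4096)
log.append(file_id=1, payload=b"57 bytes of new data for some file...")
```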
NTFS can only be read (not written) on Macs. So the only out-of-the-box filesystem that’s compatible between Mac and Windows is exFAT. The 3rd-party Paragon driver enables NTFS read/write on Mac. Likewise there is a similar one for Windows which enables read/write of the Mac HFS+ filesystem.
In general I’d recommend using NTFS on Windows because it’s a journaled or transactional file system and is more resilient to fragmentation. In case of an abrupt or uncontrolled shutdown there is less chance of filesystem damage with NTFS. On Mac the same situation exists with HFS+ vs exFAT.
The OP's issue is likely caused by the difference in default cluster size between NTFS and exFAT, combined with having many small files. For his 10TB drive the exFAT default cluster size is 128 KB; for NTFS it is 4 KB. So on exFAT each small file could be wasting nearly 128 KB.
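As a rough illustration (the file count and average size here are hypothetical, not the OP's actual numbers):

```python
import math

# Hypothetical numbers, just to show the scale: 10 million files of ~10 KB each
files = 10_000_000
avg_size = 10 * 1024

def total_slack(cluster):
    # each file occupies whole clusters, rounded up; slack is the unused tail
    allocated = math.ceil(avg_size / cluster) * cluster
    return files * (allocated - avg_size)

for label, cluster in [("NTFS 4 KB", 4 * 1024), ("exFAT 128 KB", 128 * 1024)]:
    print(f"{label}: ~{total_slack(cluster) / 1024**4:.2f} TB wasted")
```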
1024 KB, according to Post #4. He didn’t use the default cluster size. So, yeah, it may be worth him re-formatting as NTFS (which can store small files and directories in the Master File Table, anyway). If he really needs read/write interoperability with Mac, you’re right, it won’t work out of the box, but there are the free and paid drivers you mentioned.
Also, if the RAID is connected as a NAS, there is no reason to worry about compatibility since the Windows/Mac/Linux machines can all access the data over the network. I formatted mine as ZFS; I feel like it is more robust than NTFS for large sets of data, as each block is checksummed.
If by “tons of stuff you’ll read later” you mean tiny text files and HTML pages, file compression might make a huge difference for you (40-50%) at minimal performance cost, since you're mostly just hoarding and rarely reading them anyway. If you actually meant millions of images, like porn, file compression won't help (JPEG is already compressed). This should work with RAID too, though I suppose at some conceptual level it would be harder to ignore an error in a compressed file than in a plaintext one; you'd hope that's the sort of thing RAID error checking would catch and fix to begin with.
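To get a feel for it, here's a quick zlib demo (just a stand-in; real ratios depend entirely on what the files actually contain):

```python
import zlib

# Crude stand-in for a hoarded HTML page: repetitive markup plus some text
sample = b"<html><body><p>Saved to read later, honest.</p></body></html>\n" * 200

compressed = zlib.compress(sample, 6)
print(f"original:   {len(sample)} bytes")
print(f"compressed: {len(compressed)} bytes "
      f"({100 * len(compressed) / len(sample):.0f}% of original)")
# JPEGs and other already-compressed formats won't shrink like this.
```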
Well-implemented compression, such as LZ4 on ZFS, can actually increase performance, because the slowest part is reading and writing data on disk. Reading a smaller amount of data and decompressing it (or compressing before writing) is faster than reading the larger uncompressed data, because processors are so much faster than disks.
As a simple space saver, can you simply zip up whole directories? The wasted space is per file, so even if you used no compression, the transformation of many files into one should save an enormous amount.
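Something like this, with stored (uncompressed) entries; the directory and archive names are placeholders:

```python
import zipfile
from pathlib import Path

# Bundle a directory of small files into one archive with no compression,
# so cluster slack is paid once for the archive instead of once per file.
src = Path("saved_pages")  # placeholder directory
with zipfile.ZipFile("saved_pages.zip", "w", zipfile.ZIP_STORED) as zf:
    for path in src.rglob("*"):
        if path.is_file():
            zf.write(path, arcname=path.relative_to(src))
```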
Yeah, while there are circumstances where this may be true, for the OP's archival purposes (write once, read very rarely) it probably won't really matter either way. The performance change should be negligible, whether faster or slower.
Filesystem-level compression with a saner block size is better for this. Zipping up whole directories makes it harder to do things like search indexing, online backups (a lot of providers still don't do delta syncs on files), etc. It also makes that one huge zip file more vulnerable to data corruption, potentially affecting multiple files at once. Though, again, it's hopefully less of an issue with a good RAID array.