  #1  
Old 08-12-2019, 03:22 PM
Atomic Alex is offline
Guest
 
Join Date: Sep 2009
Posts: 2,861

Does exFAT take up significantly more storage space than NTFS on an external HD?


So a couple of years ago I set up a 6TB RAID 1 system, formatted in NTFS, which is now getting close to full.

A month ago I bought a 10TB external HD, formatted in exFAT, and I've been copying everything across to it.

I've noticed that the 10TB drive is almost full and I haven't even finished copying everything from the 6TB drive. Should this be the case? If so, should I just reformat the 10TB drive as NTFS?

Is there any particular advantage to exFAT over NTFS anyway?

Thanks
  #2  
Old 08-12-2019, 03:45 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,479
Do you have loads of really small files? What cluster size did you use for the exFAT system? Sometimes they are formatted with a cluster size of 256 kB. The 60% slack you are describing does not sound reasonable, though.

ETA I don't know why exFAT would have any particular advantage in this case over just using one of XFS, NTFS, ZFS, EXT4, etc

Last edited by DPRK; 08-12-2019 at 03:49 PM.
  #3  
Old 08-12-2019, 03:58 PM
scr4 is offline
Guest
 
Join Date: Aug 1999
Location: Alabama
Posts: 15,954
Quote:
Originally Posted by DPRK View Post
Do you have loads of really small files? What cluster size did you use for the exFAT system? Sometimes they are formatted with a cluster size of 256 kB. The 60% slack you are describing does not sound reasonable, though.
This page says default cluster size for a 32GB-256TB exFAT partition is 128 KB. In order for that to explain >4TB of wasted space, you'd have to have about 60 million files.
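
To check that arithmetic, here's a quick Python sketch, assuming the usual rule of thumb that each file wastes about half a cluster on average:

Code:
# How many files does it take to waste >4 TB with 128 KB clusters,
# if each file wastes about half a cluster (64 KB) on average?
wasted_bytes = 4 * 10**12       # the >4 TB of unexplained usage
avg_slack = 128 * 1024 // 2     # ~64 KB average slack per file
print(f"~{wasted_bytes / avg_slack / 1e6:.0f} million files")  # ~61 million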
  #4  
Old 08-12-2019, 04:06 PM
Atomic Alex is offline
Guest
 
Join Date: Sep 2009
Posts: 2,861
Quote:
Originally Posted by DPRK View Post
Do you have loads of really small files? What cluster size did you use for the exFAT system? Sometimes they are formatted with a cluster size of 256 kB. The 60% slack you are describing does not sound reasonable, though.

ETA I don't know why exFAT would have any particular advantage in this case over just using one of XFS, NTFS, ZFS, EXT4, etc
Yes, I have tons of small files. I'm not a hoarder in real life, but I am one on the internet; I've saved tons of stuff 'which I'll read later' pretty much since I first started browsing the web.

I've checked, and the cluster size is set to 1024KB. Good? Bad? Indifferent?

And yes, to be honest I used exFAT on this drive instead of NTFS pretty much just to try a different format; I figured that following the inevitable technological apocalypse I'd be able to use at least one of the drives.

Quote:
Originally Posted by scr4 View Post
This page says default cluster size for a 32GB-256TB exFAT partition is 128 KB. In order for that to explain >4TB of wasted space, you'd have to have about 60 million files.
See above!
  #5  
Old 08-12-2019, 04:46 PM
jz78817 is offline
Guest
 
Join Date: Aug 2010
Location: Under Oveur & over Unger
Posts: 12,036
1024K is pretty big, but that might be dictated by the size of the disk. If you have lots of small files, there'll be a ton of wasted space; the cluster is the smallest unit addressable by the file system, so a 5 KB file will take up an entire 1024K cluster.
  #6  
Old 08-12-2019, 04:49 PM
scr4 is offline
Guest
 
Join Date: Aug 1999
Location: Alabama
Posts: 15,954
Each file is stored in whole clusters. If the cluster size is 1024 KB and the file is 1025 KB, it takes up 2 clusters. So the larger the cluster size, the more wasted space.
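
To make the rounding concrete, here is a minimal Python sketch of that allocation rule; the cluster sizes are the ones discussed in this thread:

Code:
import math

def allocated_size(file_size, cluster_size):
    # Bytes actually consumed on disk: whole clusters, rounded up.
    return math.ceil(file_size / cluster_size) * cluster_size

KB = 1024
for size in (5 * KB, 1024 * KB, 1025 * KB):
    for cluster in (4 * KB, 128 * KB, 1024 * KB):
        print(f"{size // KB:>5} KB file, {cluster // KB:>4} KB clusters"
              f" -> {allocated_size(size, cluster) // KB} KB on disk")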
  #7  
Old 08-12-2019, 05:06 PM
rbroome is offline
Member
 
Join Date: Jun 2003
Location: Louisiana
Posts: 3,466
Isn't that also true of NTFS?
If so, the issue would also exist on his old drive. Given lots of small files, both drives would be inefficient, though to different degrees depending on any difference in cluster size.
Doesn't NTFS use a fixed cluster size?
In fact, don't all file systems?
  #8  
Old 08-12-2019, 05:08 PM
scr4 is offline
Guest
 
Join Date: Aug 1999
Location: Alabama
Posts: 15,954
Quote:
Originally Posted by rbroome View Post
Isn't that also true of NTFS?
Yes, but they typically use much smaller clusters - default cluster size for a 2-16TB NTFS partition is 4KB.
  #9  
Old 08-12-2019, 09:01 PM
dasmoocher is offline
Guest
 
Join Date: Apr 1999
Posts: 3,407
Doesn't seem relevant in this case, but isn't an advantage of exFAT that it can be read by Windows, macOS, and Linux?
  #10  
Old 08-12-2019, 09:04 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,479
Quote:
Originally Posted by rbroome View Post
Doesn't NTFS use a fixed cluster size?
In fact, don't all file systems?
Nah; there are filesystems that don't use blocks, for example. But even among file systems designed for use on block devices, there are various tricks to reduce space wasted by internal fragmentation: variable block sizes, block suballocation, block compression, inline data in the inode (ext4 and NTFS support this), and similar.

If it's really going to save multiple TB, maybe it's worth switching to one of these filesystems with advanced features (ZFS even has its own software RAID built in). I gather the OP really has tens of millions of files? But even just NTFS's smaller cluster size and use of the MFT may account for the bulk of the difference.
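
For anyone who wants to see that gap on a live system: on POSIX systems, stat() reports both the logical length and the space actually allocated. A tiny sketch (the file name is a placeholder; st_blocks is not available on Windows):

Code:
import os

st = os.stat("some_small_file.txt")        # placeholder path
print("logical:  ", st.st_size, "bytes")   # the file's length
print("allocated:", st.st_blocks * 512, "bytes")  # 512-byte units on disk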
  #11  
Old 08-12-2019, 09:12 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,479
Quote:
Originally Posted by dasmoocher View Post
Doesn't seem relevant in this case, but isn't an advantage of exFAT that it can be read by Windows, macOS, and Linux?
It's also proprietary AFAIK. At the same time, NTFS can also be read and written on Linux, Mac, etc. One interpretation is that exFAT is just meant to be a relatively simple file system that works on high-capacity SD cards.
  #12  
Old 08-12-2019, 09:47 PM
rbroome is offline
Member
 
Join Date: Jun 2003
Location: Louisiana
Posts: 3,466
Quote:
Originally Posted by DPRK View Post
Nah, filesystems that don't use blocks, for example. But even among file systems designed for use on block devices, there are various tricks to reduce wasted space due to internal fragmentation, like variable block sizes, block suballocation, block compression, inode inline data (ext4 and NTFS support this), and similar.

If it's really going to save multiple TB, maybe it's worth switching to one of these filesystems with advanced features (ZFS even has its own software RAID built in). I gather the OP really has tens of millions of files? But even just NTFS's smaller cluster size and use of the MFT may account for the bulk of the difference.
Thanks.
I am not familiar with such systems. Can you give an example of a file system that doesn't use blocks? I would like to learn more.
  #13  
Old 08-12-2019, 10:44 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,479
Quote:
Originally Posted by rbroome View Post
Thanks.
I am not familiar with such systems. Can you give an example of a file system that doesn't use blocks? I would like to learn more.
Count key data (CKD), for example...

The thing is, devices such as flash memory and hard disk drives in use today do have physical blocks or sectors that define a minimum record size that can be read or written. This makes some sort of sense: you can't write, say, a single bit or byte onto the end of a magnetic tape; it will be wrapped up in some sort of error-correcting code and other low-level data. So, in any case, typical filesystems have to deal with 512-byte or 4K sectors or whatever, depending on the medium and format. Even the CKD mentioned before is these days virtualized on top of a physical block layout.

What you do have are the tricks mentioned before: if the sector or block size is 512 bytes and you create 1000 files of 60 bytes each, then instead of allocating 512,000 bytes plus metadata, you would probably save nearly the entire 512 kB. So it seems the OP should NOT use exFAT, and definitely not with 128 kB clusters.
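
If the OP wants to estimate his own slack before reformatting, here's a sketch that walks a tree and compares logical size against what whole-cluster allocation would use, for the cluster sizes mentioned in this thread (the root path is a placeholder):

Code:
import os

def tree_usage(root, cluster_size):
    # Total logical bytes vs. bytes consumed when every file is
    # rounded up to whole clusters.
    logical = allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # skip files we can't stat
            logical += size
            allocated += -(-size // cluster_size) * cluster_size  # ceiling
    return logical, allocated

KB = 1024
# NTFS default, exFAT default, and the OP's chosen cluster size
for cluster in (4 * KB, 128 * KB, 1024 * KB):
    logical, allocated = tree_usage(r"D:\archive", cluster)  # placeholder path
    print(f"{cluster // KB:>4} KB clusters: {logical / 1e9:.1f} GB of data"
          f" -> {allocated / 1e9:.1f} GB allocated")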
  #14  
Old 08-13-2019, 12:35 AM
markn+ is online now
Guest
 
Join Date: Feb 2015
Location: unknown; Speed: exactly 0
Posts: 2,493
Quote:
Originally Posted by rbroome View Post
Thanks.
I am not familiar with such systems. Can you give an example of a file system that doesn't use blocks? I would like to learn more.
I once designed a filesystem for a NOR flash device. NOR flash has the interesting property that you can overwrite one-bits with zero-bits, at any address granularity. You can overwrite a single bit in the middle of a word anywhere in the flash, fairly freely (I think there was some maximum number of times you could overwrite a bit without erasing, but it was fairly high). NAND flash doesn't work that way -- you can only overwrite a block at a time. I took advantage of this feature of NOR flash in a number of ways in the filesystem design, one of which was that new data is simply appended to a sort of log of writes. So if you write 57 bytes to a file, a record is allocated containing those 57 bytes plus a header, and is written to the end of a table of such records in flash (overwriting a bunch of 1's). This design doesn't really use any notion of allocation blocks.
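
A toy model of that append-log idea in Python; the two-byte length header is invented here for illustration, not the actual record format described above:

Code:
# Simulated NOR flash: the erased state is all 1-bits, and a write can
# only clear bits (AND), so appending into erased space is always safe.
FLASH = bytearray(b"\xff" * 4096)
write_ptr = 0  # next free byte in the log; no allocation blocks anywhere

def nor_write(offset, data):
    # Model of NOR programming: bits can only go from 1 to 0.
    for i, b in enumerate(data):
        FLASH[offset + i] &= b

def append_record(payload):
    global write_ptr
    record = len(payload).to_bytes(2, "big") + payload
    nor_write(write_ptr, record)
    write_ptr += len(record)

append_record(b"x" * 57)     # the 57-byte write from the post
append_record(b"more data")
print(write_ptr, "bytes of flash used, zero slack")  # 70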
  #15  
Old 08-13-2019, 04:53 AM
joema is offline
Guest
 
Join Date: Jul 2006
Location: Nashville, TN
Posts: 599
Quote:
Originally Posted by DPRK View Post
It's also proprietary AFAIK. At the same time, NTFS can also be read and written on Linux, Mac, etc. One interpretation is that exFAT is just meant to be a relatively simple file system that works on high-capacity SD cards.
NTFS can only be read (not written) on Macs, so the only out-of-the-box filesystem that's compatible between Mac and Windows is exFAT. The third-party Paragon driver enables NTFS read/write on Mac. Likewise, there is a similar driver for Windows that enables read/write of the Mac HFS+ filesystem.

In general I'd recommend using NTFS on Windows because it's a journaled or transactional file system and is more resilient to fragmentation. In case of an abrupt or uncontrolled shutdown there is less chance of filesystem damage with NTFS. On Mac the same situation exists with HFS+ vs exFAT.

The OP's issue is likely caused by the difference in default cluster size between NTFS and exFAT, combined with having many small files. For his 10TB drive the exFAT default cluster size is 128KB. For NTFS it is 4KB. So on exFAT each small file could be wasting nearly 128KB:

https://support.microsoft.com/en-us/...-fat-and-exfat
  #16  
Old 08-13-2019, 08:18 AM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,479
Quote:
Originally Posted by joema View Post
The OP's issue is likely caused by the difference in default cluster size between NTFS and exFAT, combined with having many small files. For his 10TB drive the exFAT default cluster size is 128KB. For NTFS it is 4KB. So on exFAT each small file could be wasting nearly 128KB
1024 KB, according to Post #4. He didn't use the default cluster size. So, yeah, it may be worth him re-formatting as NTFS (which can store small files and directories in the Master File Table, anyway). If he really needs read/write interoperability with Mac, you're right, it won't work out of the box, but there are the free and paid drivers you mentioned.

Also, if the RAID is connected as NAS, there is no reason to worry about compatibility since the Windows/Mac/Linux machines can all access the data over the network. I formatted mine as ZFS; I feel like it is more robust than NTFS for large sets of data, as each block is checksummed.
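
The per-block checksumming is easy to sketch in miniature. This toy block store keeps a digest with every block and verifies it on each read, so silent corruption surfaces as an error instead of being returned as good data (real ZFS keeps checksums in the parent block pointers rather than beside the data; this just shows the principle):

Code:
import hashlib

def write_block(store, addr, data):
    # Store the block together with its SHA-256 digest.
    store[addr] = (bytes(data), hashlib.sha256(data).digest())

def read_block(store, addr):
    # Verify on read; raise rather than return corrupt data.
    data, digest = store[addr]
    if hashlib.sha256(data).digest() != digest:
        raise IOError(f"checksum mismatch at block {addr}")
    return data

blocks = {}
write_block(blocks, 0, b"important archive data")
assert read_block(blocks, 0) == b"important archive data"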
  #17  
Old 08-16-2019, 09:17 AM
Reply is offline
Member
 
Join Date: Jul 2003
Posts: 8,623
If you switch to NTFS, you can also turn on Windows' native file compression: https://www.howtogeek.com/133264/how...might-want-to/

If by "tons of stuff you'll read later" you mean tiny text files and HTML pages, file compression might make a huge difference for you (40-50%) at minimal performance loss, since you're mostly just hoarding and rarely reading them anyway. If you actually meant millions of images, like porn, file compression won't help (jpeg is already compressed). This should work with RAID too, though I suppose at some conceptual level it would be harder to ignore an error in a compressed file than a plaintext file, though you'd hope this is the sort of thing that RAID error checking would catch and fix to begin with.
  #18  
Old 08-16-2019, 10:19 AM
echoreply is online now
Guest
 
Join Date: Dec 2003
Location: Boulder, CO
Posts: 859
Quote:
Originally Posted by Reply View Post
If by "tons of stuff you'll read later" you mean tiny text files and HTML pages, file compression might make a huge difference for you (40-50%) at minimal performance loss, since you're mostly just hoarding and rarely reading them anyway. If you actually meant millions of images, like porn, file compression won't help (jpeg is already compressed). This should work with RAID too, though I suppose at some conceptual level it would be harder to ignore an error in a compressed file than a plaintext file, though you'd hope this is the sort of thing that RAID error checking would catch and fix to begin with.
Well-implemented compression, such as lz4 on ZFS, can actually increase performance, because the slowest part is reading and writing data to disk. Reading a small amount of data and decompressing it (or compressing it before writing) is faster than reading a large amount of data, because processors are so much faster than disks.
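
A stdlib illustration of the ratio half of that tradeoff, using zlib rather than lz4 (the principle is the same; the sample text is made up):

Code:
import zlib

# Redundant English text, the kind a web-page hoard is full of.
text = b"This is the sort of thing I saved to read later. " * 1000
packed = zlib.compress(text, level=6)
print(f"{len(text)} bytes -> {len(packed)} bytes"
      f" ({100 * len(packed) / len(text):.0f}% of original)")
# Fewer bytes written means less time waiting on the disk.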
  #19  
Old 08-16-2019, 03:34 PM
pmwgreen is offline
Guest
 
Join Date: Aug 2000
Posts: 390
As a simple space saver, can you simply zip up whole directories? The wasted space is per file, so even if you used no compression, the transformation of many files into one should save an enormous amount.
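
A minimal sketch of that suggestion with Python's stdlib; ZIP_STORED skips compression entirely, so any saving comes purely from consolidating many files into one (paths are placeholders):

Code:
import os
import zipfile

def pack_directory(src_dir, archive_path):
    # Roll a whole tree into one archive so tiny files stop
    # paying per-file cluster slack.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                zf.write(full, os.path.relpath(full, src_dir))

pack_directory(r"D:\archive\saved_pages", r"D:\archive\saved_pages.zip")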
  #20  
Old 08-16-2019, 07:05 PM
Reply is offline
Member
 
Join Date: Jul 2003
Posts: 8,623
Quote:
Originally Posted by echoreply View Post
Well-implemented compression, such as lz4 on ZFS, can actually increase performance, because the slowest part is reading and writing data to disk. Reading a small amount of data and decompressing it (or compressing it before writing) is faster than reading a large amount of data, because processors are so much faster than disks.
Yeah, while there are circumstances where this may be true, for the OP's archival purposes (write once, read very rarely) it probably won't really matter either way. The performance change should be negligible, faster or slower.

Quote:
Originally Posted by pmwgreen View Post
As a simple space saver, can you simply zip up whole directories? The wasted space is per file, so even if you used no compression, the transformation of many files into one should save an enormous amount.
Filesystem-level compression with a saner block size is better for this. Zipping up whole directories makes it harder to do things like search indexing, online backups (a lot of providers still don't do delta syncs on files), etc. It also makes that one huge zip file more vulnerable to data corruption, potentially affecting many files at once. Though, again, that's hopefully less of an issue with a good RAID array.
  #21  
Old 08-16-2019, 07:33 PM
Atomic Alex is offline
Guest
 
Join Date: Sep 2009
Posts: 2,861
Quote:
Originally Posted by Reply View Post
If by "tons of stuff you'll read later" you mean tiny text files and HTML pages, file compression might make a huge difference for you (40-50%) at minimal performance loss, since you're mostly just hoarding and rarely reading them anyway. If you actually meant millions of images, like porn, file compression won't help.
Yes, it's mostly text files, PDFs, and web pages. I'm not a guy, and porn isn't really my thing (the visual kind certainly isn't, and the written kind only rarely is).

Thanks for the answers everyone! I've reformatted with NTFS and I'm making the backup again.
  #22  
Old 08-16-2019, 11:17 PM
DPRK is offline
Guest
 
Join Date: May 2016
Posts: 3,479
That is an impressive collection; reading a million or so books or web pages sounds like it would keep anyone occupied for a while.
  #23  
Old 08-17-2019, 02:39 AM
Melbourne is offline
Guest
 
Join Date: Nov 2009
Posts: 5,189
Quote:
Originally Posted by Reply View Post
If you switch to NTFS, you can also turn on Window's native file compression: https://www.howtogeek.com/133264/how...might-want-to/

If by "tons of stuff you'll read later" you mean tiny text files and HTML pages, file compression might make a huge difference for you (40-50%) at minimal performance loss, since you're mostly just hoarding and rarely reading them anyway. If you actually meant millions of images, like porn, file compression won't help (jpeg is already compressed). This should work with RAID too, though I suppose at some conceptual level it would be harder to ignore an error in a compressed file than a plaintext file, though you'd hope this is the sort of thing that RAID error checking would catch and fix to begin with.
OP should convert each web page to a compressed single-file format like MHT or MAFF... but it's probably too late for that.

The native NTFS file compression only compresses individual files. For small files, you don't get much benefit, because each file still occupies at least one cluster.

I wonder if the exFAT system is reserving space at the end of each file?

Last edited by Melbourne; 08-17-2019 at 02:40 AM.
  #24  
Old 08-17-2019, 04:00 AM
Reply is offline
Member
 
Join Date: Jul 2003
Posts: 8,623
Yeah, good point, MHT would've been great.
  #25  
Old 08-17-2019, 08:18 AM
rbroome is offline
Member
 
Join Date: Jun 2003
Location: Louisiana
Posts: 3,466
Nice thread.
I am sure there is a lot to learn about file systems!
Someone should start a thread about the most common choices, the best choices, how file systems work... there are lots of questions!
  #26  
Old 08-17-2019, 03:21 PM
Atomic Alex is offline
Guest
 
Join Date: Sep 2009
Posts: 2,861
Quote:
Originally Posted by DPRK View Post
That is an impressive collection; reading a million or so books or web pages sounds like it would keep anyone occupied for a while.
Oh, I don't kid myself that I'm ever going to read a fraction of it, but I tend to get interested in a subject and devour that for a while before moving on to something else, and I'm never sure what will catch my attention next.

Quote:
Originally Posted by Melbourne View Post
OP should convert each web page to a compressed single-file format like MHT or MAFF... but it's probably too late for that.

The native NTFS file compression only compresses individual files. For small files, you don't get much benefit, because each file still occupies at least one cluster.

I wonder if the exFAT system is reserving space at the end of each file?
Well, I've already started transferring it. I'm not actually short of storage space; it's just the exFAT slack that was causing concern!

On a side note, though, does anyone know why Firefox (yes, I know other browsers are available) doesn't always save web pages correctly? I have to go into the 'Library' tab and download them again before they'll save with all the information.

Thanks again