Should I check "compress files to save space?" on my computer?

In Windows XP, when I right click on my C-drive and go to Properties, there’s a check box for “Compress Drive to Save Space?” How much space exactly would it save, usually, and is it a good idea? (forgive me if this is a stupid question)

Also, how often should I do a “Disk Clean Up”?

No on the compress files, especially at the C: level. You can compress a particular folder if you like (open My Computer, C:, right-click the folder you want to compress, select Properties, click ‘Advanced’, check ‘compress contents to save disk space’), but it does add CPU overhead every time you read or write a compressed file.
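If you’d rather script it than click through the dialogs, Windows also ships a command-line tool, compact.exe, that flips the same NTFS compression attribute. A minimal Python sketch of driving it (modern Python; the folder path is purely a placeholder):

```python
# Sketch: apply NTFS compression to one folder (and its subfolders) by calling
# Windows' built-in compact.exe. The folder path below is just a placeholder.
import subprocess

folder = r"C:\Stuff\OldProjects"  # hypothetical folder you want compressed

# /c = compress, /s:<dir> = apply to the directory and everything under it
result = subprocess.run(
    ["compact", "/c", f"/s:{folder}"],
    capture_output=True, text=True
)
print(result.stdout)
```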

Disk cleanup every few months, IMO. No set rule on that, since it varies depending on system usage.

NOOOO!!! :eek:

Eeek! Thanks for the warning! :wink:

Disk compression is something whose time has long since passed.

In the early 90s, when PCs finally hit it big, disk space was expensive and wasn’t increasing fast enough to keep up with software size (or, to be honest, with Windows’ software size :)). So for a while software disk compression was a viable option.

Microsoft finally bundled their version, DoubleSpace, right with MS-DOS 6 in the Windows 3.1 era. They were promptly sued for having basically stolen it from Stac Electronics’ Stacker, paid a hundred million dollars or so, and re-released it as DriveSpace.

Anyway, software disk compression is completely obsolete today. It will never give you more than a 50% increase (if that) and it will always incur more than a 50% performance hit.

As an example: going from my first PC to my second, the hard drive merely doubled. Every upgrade after that increased it by roughly a factor of ten!

  1990: 80 MB
  1991: 170 MB
  1994: 1.6 GB
  1997: 13 GB
  2000: 130 GB

No is indeed the correct answer. Some of the right reasons have been given so far, and one wrong one.

The two best reasons:

  1. Compressed drives are much harder to restore after a file system glitch. If your power goes off during a file write, it’s a lot easier to lose everything.
  2. The biggest files you have are probably already in a compressed format: movies, MP3s, etc. They won’t shrink any further (a quick sketch below illustrates this).
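To illustrate point 2: data that is already compressed has almost no redundancy left to squeeze out, so running it through another compressor buys you essentially nothing. A rough Python sketch, using zlib as a stand-in for any general-purpose compressor (NTFS uses its own algorithm, but the principle is the same) and random bytes as a stand-in for an MP3 or video payload:

```python
# Rough illustration: text-like data compresses well, data that is already
# compressed (approximated here by random bytes) barely compresses at all.
import os
import zlib

text_like = b"the quick brown fox jumps over the lazy dog " * 10000
already_compressed = os.urandom(len(text_like))  # stand-in for an MP3/JPEG payload

for label, data in [("text-like", text_like), ("already compressed", already_compressed)]:
    packed = zlib.compress(data, 6)
    print(f"{label}: {len(data)} -> {len(packed)} bytes "
          f"({100 * len(packed) / len(data):.0f}% of original)")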

The wrong reason given so far:
CPU overhead. Actually, there is a win here. Transferring data from disk takes far longer than compressing or decompressing it. So if you are like most people and the CPU sits idle while waiting on a file transfer, a nicely compressible file actually moves faster: a file that is 50% smaller takes 50% less time to transfer, and the extra processing time is virtually nothing compared to that. The CPU overhead only becomes a factor if the CPU is already heavily in use by another app during the transfer, which is uncommon under MS OSes.
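For anyone who wants to put rough numbers on that argument, here is a back-of-envelope Python sketch. The 50 MB/s disk rate is just an assumed figure for an IDE drive of the period, not a measurement, and zlib stands in for whatever the file system actually uses:

```python
# Back-of-envelope: does the transfer time saved outweigh the decompression time?
# Assumes a hypothetical 50 MB/s sustained disk transfer rate (a guess, not a spec).
import time
import zlib

DISK_MB_PER_SEC = 50.0  # assumed, not measured

payload = b"fairly compressible log-style data, line after line\n" * 200000
packed = zlib.compress(payload, 6)

start = time.perf_counter()
zlib.decompress(packed)
cpu_seconds = time.perf_counter() - start

saved_mb = (len(payload) - len(packed)) / (1024 * 1024)
transfer_seconds_saved = saved_mb / DISK_MB_PER_SEC

print(f"decompression cost: {cpu_seconds:.3f} s")
print(f"transfer time saved at {DISK_MB_PER_SEC} MB/s: {transfer_seconds_saved:.3f} s")
```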

Gotta take issue with that. Presuming Guin is using an IDE drive, there is always CPU overhead when accessing the drive. Sometimes less, sometimes more, but especially when moving multiple large files, it can be significant. Couple that with the overhead from compressing/decompressing as needed, and you can take a performance hit. (One that can be cured by popping in a $5 zillion terabyte drive, or whatever they are going for these days.)

That’s a drive with a zillion terabytes, which costs only five dollars, not a drive that costs five zillion dollars, and holds a terabyte of data, right? :wink:

It should be noted that there is a difference between “compress old files” in Disk Cleanup and whole-disk compression schemes like DoubleSpace. “Compress old files” only compresses files which have not been used recently, so it is unlikely to cause disk access delays in daily usage. If you are having disk space problems, I say go for it. If not, it is probably not necessary.
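For the curious, here is a rough Python sketch of the idea behind “compress old files”: walk the drive and pick out files whose last-access time is older than some cutoff. The 50-day cutoff and the starting path are placeholders, not the actual Disk Cleanup settings:

```python
# Sketch of the idea behind "compress old files": list files that have not been
# accessed in a while. The cutoff and the starting path are placeholders.
import os
import time

CUTOFF_DAYS = 50
cutoff = time.time() - CUTOFF_DAYS * 86400
root = r"C:\Documents and Settings"  # placeholder starting point

old_files = []
for dirpath, _dirnames, filenames in os.walk(root):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            if os.stat(path).st_atime < cutoff:
                old_files.append(path)
        except OSError:
            pass  # skip files we can't stat

print(f"{len(old_files)} files untouched for {CUTOFF_DAYS}+ days")
```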

Are you sure you keep your arguments consistent?
Yes, there is a certain CPU overhead when reading from the disk, but as you pointed out yourself (“…especially when moving multiple large files…”), it gets reduced by enabling compression. Add to this the significant time wasted waiting for the drive, and you get a net positive effect from compression.
A negative effect is the additional CPU overhead for decompression, but it is fairly small (for writing it is slightly bigger than for reading).

Yes. However, I can’t vouch for other people’s (incorrect) interpretations.

Indeed there is.

Are you familiar at all with NTFS compression? You seem to think it will result in some magical huge reduction in file size. I hate to break it to you, but other than text files and whatnot, your savings will be minimal. Executables and many other file types don’t compress at all. But even if there is zero compression, if the file has the compressed attribute, the de/compression code still runs on it. Try it out: compress a .exe and watch Task Manager; you’ll see a nice CPU blip, even though it doesn’t actually save any space.
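If you want to go a step further than watching Task Manager, Windows can tell you whether the compressed attribute actually saved anything on a given file: GetCompressedFileSize reports the on-disk size, which you can compare to the logical size. A Python/ctypes sketch, with a made-up path:

```python
# Sketch: compare a file's logical size with its on-disk size under NTFS
# compression, using kernel32's GetCompressedFileSizeW. Path is a placeholder.
import ctypes
import os

kernel32 = ctypes.windll.kernel32
kernel32.GetCompressedFileSizeW.restype = ctypes.c_ulong  # DWORD

def on_disk_size(path):
    high = ctypes.c_ulong(0)
    low = kernel32.GetCompressedFileSizeW(path, ctypes.byref(high))
    if low == 0xFFFFFFFF and ctypes.GetLastError() != 0:
        raise ctypes.WinError(ctypes.GetLastError())
    return (high.value << 32) + low

path = r"C:\Program Files\SomeApp\someapp.exe"  # hypothetical compressed file
logical = os.path.getsize(path)
physical = on_disk_size(path)
print(f"logical {logical} bytes, on disk {physical} bytes "
      f"({100.0 * physical / max(logical, 1):.0f}% of logical)")
```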

So under compression, you will still be moving multiple large files (if you are lucky, a bit smaller, but most likely not by much), and you have the added overhead of compressing/decompressing them as needed.

And you will be waiting longer, since IDE transfers take CPU cycles of their own, on top of the cycles spent running the de/compression.

Sure. We aren’t talking a 5-minute performance hit; we are talking a second or less, generally. (Well, a lot longer during the initial compression.) But that performance hit adds up. Why degrade the performance of your box at all? Hell, get a G-mail account if space is such an issue.

Compressing NTFS volumes is an idea whose time has come and gone. There may be a use for enabling compression on a single folder or two, depending on what you plan on putting in there, but even that is doubtful. It just plain ain’t worth it.

I agree. That’s why I don’t do it.

I just got the impression that you used overhead from disk I/O (other than compression overhead) as an argument against compression. Sorry if you didn’t mean it that way.

Short answer: I know. :wink:

Sorry about the snarky tone, kellner. I was doing an offline defrag on a wigged-out Exchange server with a constant stream of “So, any ETA yet?” Yep, farkers, when it’s done. Next time, gimme the weekend downtime…

I just want to respectfully disagree with Brutus’s objection on CPU overhead. The load on a CPU to read/write an uncompressed file is nearly zip. Good grief, I had an IDE card in my old 10 MHz XT. CPUs have gotten a lot faster, while drives have sped up far less. Note in particular how little the speed and width of IDE buses have changed over the years.

Watching a CPU overhead graph while doing file I/O can be quite misleading. E.g., merely doing a task switch can cause a noticeable spike. Any time the usage pattern of disk or memory changes, there can be a sudden surge in paging activity. None of these are caused by compression, nor can they be avoided by turning it off.

Also, normal executables are quite readily compressible. Some packages you download off the net come in an executable/self-extracting/compressed format, but the normal programs themselves, once installed, are usually not compressed.
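Anyone can check this for themselves: run an ordinary installed executable through a general-purpose compressor and look at the ratio. A quick Python sketch (zlib here is only a stand-in for NTFS’s own compressor, and the path is just an example; pick any .exe you have handy):

```python
# Quick compressibility check for an installed executable. Path is an example only.
import zlib

path = r"C:\Windows\notepad.exe"  # any ordinary, non-packed executable

with open(path, "rb") as f:
    data = f.read()

packed = zlib.compress(data, 9)
print(f"{len(data)} bytes -> {len(packed)} bytes "
      f"({100 * len(packed) / len(data):.0f}% of original)")
```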

As for people like the OP who may not know whether to believe me or Brutus, let me add the following:

  1. My research appears in standard undergrad OS textbooks.
  2. I have taught file processing and such more times than I care to think about.
  3. I have published papers on data compression.

I am sure Brutus can likewise provide relevant background info.

I agree. General compression algorithms like ZIP, RAR, or NTFS’s built-in compression can easily compress most executables by at least 2:1, and compression designed for executables (like ASPack or UPX) can do even better. UPX can crunch my 3 MB app down to about 700 KB.

I only use NTFS compression when I’m really low on space and Windows runs Disk Cleanup, which suggests compressing my older files. Those are files that I could probably delete anyway and wouldn’t even miss. In most cases, it’s easier to reclaim space by burning some videos or MP3s to a DVD+R and deleting them from my hard drive.