Locate duplicate files

Is there software available that will search your hard drive and locate duplicate files?

My computer file system is a bit of a mess! :smack:

Yes. What OS are you running?

Sorry. It’s Windows XP with SP2.

Google “Space Hound” for Windows.

I use CloneSpy. Powerful and flexible, and free to use (but donations accepted - and I already donated). I particularly like the various ways you can specify which duplicate to delete (e.g. if duplicate found, delete the one with the longer filename, or delete the one in this folder, etc).

This is good news; I did not realize such a thing existed. I will be using it when I get home.

Can these things detect duplicate files across an entire home network? 'Cause that would be the greatest thing ever.

I have one called Duplicate File Finder from www.funduc.com which is freeware. It worked nicely.

Has anyone used any of these on large corporate-type shares? 500 GB volumes? Across multiple volumes?

Finding the dupes is the easy part. Finding a way to safely and logically delete the duplicates is the hard part. On my own Windows machines I use Double Killer for most file types because it can automate the process of deleting one or more of the dupes. If I screw something up it is my own fault.
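The "finding is the easy part" bit is worth spelling out: tools like these generally group files by size first, then confirm matches by hashing content. A minimal sketch of that idea in Python (this is just an illustration of the technique, not how Double Killer or any of the named tools actually works):

```python
import hashlib
import os
from collections import defaultdict

def find_duplicates(root):
    """Find groups of identical files under `root`.

    Files are first grouped by size (cheap), and only size-collisions
    are hashed with SHA-256 (expensive) to confirm real duplicates.
    Nothing is deleted -- deciding which copy to remove is the hard,
    human part.
    """
    by_size = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # skip unreadable files
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)  # hash in 1 MB chunks
            by_hash[h.hexdigest()].append(path)
    return [group for group in by_hash.values() if len(group) > 1]
```

The size-first pass is what keeps this fast on big trees: most files have a unique size and never get read at all.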

For images, I use Dup Detector because it can compare image files pixel to pixel to find duplicates regardless of file type or size.

On the corporate network where I serve as the server bitch, we were faced with a storage crisis before getting our new server. I couldn’t make any assumptions about which version(s) of a particular file to delete. So I used Treesize Professional, a very user-friendly means of visualizing (and copying) the structure of files on a drive.

While it would’ve been possible to use a command-line solution like dir just to get the file listing, I needed a safe, easy, GUI-based workflow that an intern could run to find large duplicated files on the corporate server (~1.5 million files, ~250 GB).

The output from Treesize was dumped into an Access database that ran a query and generated a report, from which he contacted the owner of the largest files to determine the correct course of action. We didn’t even bother with the small potatoes, unless it was an entire duplicated directory. I think Double Killer or Double Killer Pro could be set up to do the same thing, but again, it is really not the finding that is the problem, it is the fact that there is usually no logical means of determining which file is safe to delete, and this goes double when you are dealing with other people’s files.
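The triage described above — report duplicate groups sorted by wasted space so a human can chase down the owners of the big offenders first — could be sketched like this. (The listing format and field names here are assumptions for illustration, not Treesize's actual export format or the Access query used.)

```python
from collections import defaultdict

def wasted_space_report(listing, min_size=0):
    """Build a duplicate report from a file listing.

    `listing` is an iterable of (path, size_bytes, content_hash) tuples.
    Returns [(wasted_bytes, file_size, [paths]), ...] sorted so the
    biggest reclaimable chunks come first -- skipping the small
    potatoes via `min_size`. Nothing is deleted; the report is meant
    for a human to contact the files' owners.
    """
    groups = defaultdict(list)
    for path, size, digest in listing:
        if size >= min_size:
            groups[(digest, size)].append(path)
    report = []
    for (digest, size), paths in groups.items():
        if len(paths) > 1:
            wasted = size * (len(paths) - 1)  # bytes freed if deduped to one copy
            report.append((wasted, size, sorted(paths)))
    report.sort(reverse=True)  # largest wasted space first
    return report
```

This mirrors the "don't bother with small potatoes" policy: a minimum-size cutoff keeps the report short enough that someone can actually act on it.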

We started with that, moved to Size Explorer Pro, and now have a scripted process that automatically emails custom spreadsheets to all users on a volume, based on minimum total usage.

The problem for us is the number of volumes, and the inability to easily scan for actual duplicates. I’ve tried many of the solutions, but the number and size of the volumes have made scans simply take too long. :frowning:

(40 or so servers, with nearly 80 TB) :eek:

Of course, they don’t actually do anything with the files, and so many people have moved through the company (40K+ employees) that it’s often tough to find who really owns the file, or can make a decision.

Ah, the joys of cheap disk, and lazy users.

-Butler