So what's the latest on Image Comparison software?

I don’t have a particular need for this right now, but I could see where this would be really great software.

Is there any software out there that can reliably compare image files looking for images that might be the same image?

Say for example I have a bunch of photos, but for some reason they contain several duplicates but with different file names and different resolution. Is there a way to find ‘duplicate’ pictures?

DupDetector works well and is free.

GQview does it pretty well for Linux, and it has a Windows version, but I’ve never used that one.

Another good Linux program is imgSeek, although it’s an old program that may no longer be in development, and there’s no Windows version. It finds duplicates pretty well, but it also has a cooler, loosely related feature: the user can draw a small sketch and have the program search its database for any image matching that sketch.

Here is a great example. I wanted to find the photo of me and my daughter walking in a parking lot. I remembered that I was wearing green and she was wearing pink, so I drew a green line with a pink line next to it. Out of 756 images (approximately 100 have duplicates that have been resized), it came up with the 9 results on the lower left (actually, it came up with 5 different images, 4 are resized duplicates).

I don’t know the state of image comparison software in general, but imgSeek is a 3 year old program and it can do that…

I’ve been using Duplicate Finder. It works well. It can also delete the duplicates en masse for you, which is handy. The free version will only delete 50 files at a time, though. I don’t mind using it that way, but if it bugs you, you can always pay for the uncrippled version.

That’s amazing.
Do any of you know the logic behind it, in layman’s terms?
Does it do some type of histogram that it runs a comparison on?

My guess would be that internally, it compares photos in much the same way. It just sort of blurs things together and compares regions for general color similarity. For the draw-and-find bit, it probably has a “transparent” color which means “don’t check that area”, but otherwise still does the blob testing in the areas with a color defined.
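To make that guess concrete, here is a minimal sketch of the idea in plain Python. It is not imgSeek's actual code, only an illustration of "blur into regions, then compare region colours". The names `region_signature` and `signature_distance` are made up for this example, images are assumed to be lists of rows of (R, G, B) tuples, and each image is assumed to be at least `grid` pixels wide and tall.

```python
def region_signature(pixels, grid=4):
    """Average the colour of each cell in a grid x grid layout (a crude blur).

    Assumes the image is at least `grid` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    signature = []
    for gy in range(grid):
        for gx in range(grid):
            # Pixel bounds of this cell; integer maths keeps cells contiguous.
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            signature.append(
                tuple(sum(p[c] for p in cell) / len(cell) for c in range(3))
            )
    return signature


def signature_distance(query, candidate):
    """Mean per-cell colour difference between two signatures.

    A cell set to None in the query plays the role of the "transparent"
    colour: that area is not checked at all.
    """
    total, counted = 0.0, 0
    for q, c in zip(query, candidate):
        if q is None:
            continue
        total += sum(abs(a - b) for a, b in zip(q, c))
        counted += 1
    return total / counted if counted else 0.0
```

Because the signature depends only on average colour per region, a resized copy of a photo lands on (nearly) the same signature as the original. A sketch would be a signature that is mostly None, with a green cell next to a pink cell, and the lowest distances would be the best matches.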

I doubt this is really a significantly complex programming task, unlike voice recognition or face recognition or something.

I have imgSeek on a Windows PC, so it was once available for Windows, although it was maybe six or seven years ago that I downloaded it. It worked well then, as I recall. Maybe you can still compile it for Windows.

It is based on this technique and does more or less what Sage Rat suggests, in a way. It’s not very complex, but it actually has a lot in common with some facial recognition techniques. But working with 2D images that don’t change, it doesn’t need to be nearly as complex.

I dunno, but I browsed through their SourceForge download page and found that version 0.8.5 has a Windows build available for download.

Thanks for the pointer to imgSeek, Fubaya… it’s an excellent piece of software. I haven’t tried it with Windows, but I thought I’d point out that it is available in the software repositories for Ubuntu 8.04 Hardy Heron, which makes installation MUCH easier. Not sure about other distributions though.

How fast is it when you have about 20,000 pics in 500 folders? Which one will handle that reasonably well?

Available for Intrepid Ibex, too.

imgSeek has a lot of features, so I don’t know how it compares to a program that is designed solely for finding duplicates, but it’s fast because it is only comparing the signatures in the database. Building the database takes a couple of minutes with my 3000 photos and would obviously take some time with 20k, but finding duplicates is nearly instant.
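Roughly, the split between the slow build step and the fast lookup step could look like the sketch below. This is not how imgSeek is actually implemented; `build_index`, `find_duplicates`, and `mean_colour` are hypothetical names, the threshold is an arbitrary value, and the average-colour signature is only a stand-in for whatever signature a real program stores.

```python
from itertools import combinations


def build_index(images, signature_fn):
    """Slow step, done once: compute a small signature for every image."""
    return {name: signature_fn(pixels) for name, pixels in images.items()}


def find_duplicates(index, threshold=10.0):
    """Fast step: compare the stored signatures only, never the pixels."""
    pairs = []
    for (name_a, sig_a), (name_b, sig_b) in combinations(index.items(), 2):
        # Mean absolute difference between the two signatures.
        distance = sum(abs(a - b) for a, b in zip(sig_a, sig_b)) / len(sig_a)
        if distance <= threshold:
            pairs.append((name_a, name_b))
    return pairs


def mean_colour(pixels):
    """Hypothetical stand-in signature: the average (R, G, B) of the image."""
    flat = [p for row in pixels for p in row]
    return tuple(sum(p[c] for p in flat) / len(flat) for c in range(3))
```

The point is that each signature is only a handful of numbers, so comparing two of them costs almost nothing next to decoding the photos. Comparing every pair like this still grows quadratically with the number of images, so a program meant for tens of thousands of photos would presumably organise the signatures more cleverly.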

Out of curiosity, has anyone tried to integrate this into a search engine yet, or is it still too unreliable? I was just thinking last night how cool it would be to have something like Google Images where you could actually put in an image and find:

  1. What sites that exact image is hosted on.
  2. Images very similar to it (e.g. unphotoshopped versions of your chosen image).
  3. Whatever the hell the image recognition program throws in there for lulz because the software is in its infancy.

Something like this?

I remember another service like that which seemed better, but I can’t find it right now.

TinEye. There is even a plugin integrating it with Firefox, although it is still in beta. It’s working great for me, anyway.

What do we have for the Mac?

===found it===

This is one I’ve been using for a few years. It’s pretty quick and can handle 100K+ images at once (though the more images it inspects, the slower it gets).