Is there software that compares pictures?

Let’s say you like pictures of cats. You collect them from all over the web (free, of course). You’ve lost track of the pictures you already have.

You find a new picture and think you may already have it, but you don’t want to search through hundreds of pictures to check.

Is there software that could do this? You show the picture to the software and it searches the specified directory for its match.

Is this possible or is it Hollywood fantasy?

Well, iPhoto does that automatically when you add pictures. But I imagine you’re on Windows. I’d guess that most photobook applications will do this, considering how easy it would be to implement.

To clarify: Are you looking for identical files, or equivalent pictures? The former is not too hard, but the latter is an AI problem. Suppose, for instance, someone puts up the same picture, but with a small logo in the corner identifying the website. Are you considering that the same picture? Or if they crop or resize the picture?
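For the "identical files" case, here is a minimal sketch in Python of how it could be done by hashing file contents. The function names (file_digest, find_identical) are made up for illustration, and this only catches byte-for-byte copies: a resized, recompressed, or watermarked copy will hash differently.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large images are never loaded whole into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical(candidate, directory):
    """List files under `directory` whose bytes match `candidate` exactly."""
    target = file_digest(candidate)
    candidate_path = Path(candidate).resolve()
    return sorted(
        p
        for p in Path(directory).rglob("*")
        if p.is_file()
        and p.resolve() != candidate_path  # skip the candidate itself
        and file_digest(p) == target
    )
```

Calling find_identical("new_cat.jpg", "my_cats") (both names hypothetical) would return the paths of any exact copies already in the collection.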

Of course, one might also point out that if you have so many pictures that you’ve lost track of which ones you have, does it really matter if you already have a particular picture? By that point, you’re unlikely to go through your entire collection viewing all of them.

Lots of pictures of…“cats”…downloaded from the Internet. Right.

It’s certainly more difficult than checking for identical files, but I’m not sure I’d call it an AI problem. Checking for image correlation is a pretty simple mathematical function, and a simple threshold would account for lossy recompression, digital tags and whatnot. You could probably make it amazingly quick, too, if you used some sort of Fourier analysis technique to generate a unique ID for each picture.
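To make the correlation-plus-threshold idea concrete, here is a rough sketch in Python. It assumes both images have already been decoded to grayscale and scaled to the same dimensions, represented as lists of rows of 0-255 values. The 0.95 threshold is an arbitrary guess, not a tuned value, and the Fourier-based ID is not attempted here.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-sized grayscale images.

    Each image is a list of rows of pixel intensities (0-255).
    """
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same dimensions")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A perfectly flat image has no variance to correlate against;
        # only call it a match if the pixels are literally identical.
        return 1.0 if xs == ys else 0.0
    return cov / math.sqrt(var_x * var_y)


def same_picture(a, b, threshold=0.95):
    """Treat two images as the same picture if they correlate strongly."""
    return correlation(a, b) >= threshold
```

Small pixel-level changes such as compression noise barely move the score, while an unrelated picture should land well below the threshold.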

As for existing programs that do this, at the moment I can only find ImageMagick, which has a command-line tool that computes a kind of difference function on two images and spits out a third image showing where the originals match. Not really what you want. I’ll have a proper look around, since this seems like such a trivial thing to do that I’d be very surprised if there’s nothing that does it.

It’s not too hard to generate a correlation function between images of the same size and field, and that could probably work well enough for differing corner logos or watermarks. But now suppose that two images are cropped differently: In order to get the appropriate correlation function, you’ve got to overlap them correctly, and even if you do that, it’s ambiguous whether you want to consider those the same image or not (they might be two different close-ups of a large scene, focusing on two different cats, or they might be one cat with different amounts of background). One of the images might be resized, or even resized and cropped. You might have things like captures from adjacent frames of a movie, or shots at the same time from two different cameras next to each other, which are probably going to be close enough that a human wouldn’t want copies of both, but which might not look the same to the computer. And even if you do determine that two pictures are functionally the same (though not identical), how do you decide which one to keep?
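The "overlap them correctly" step for the cropped case can be sketched as a brute-force search in Python: slide the smaller image over the larger one and keep the offset where they differ least. This assumes grayscale images stored as lists of rows, with no resizing between the two, and best_offset is a made-up name. It does nothing about the ambiguity of whether the result should count as the same picture.

```python
def best_offset(big, small):
    """Slide `small` over `big`; return (mean_abs_diff, row, col) of the best fit.

    Both images are lists of rows of grayscale values. A mean difference
    near 0 suggests `small` is a crop of `big` at that position.
    """
    bh, bw = len(big), len(big[0])
    sh, sw = len(small), len(small[0])
    if sh > bh or sw > bw:
        raise ValueError("`small` must fit inside `big`")
    best = None
    for r in range(bh - sh + 1):
        for c in range(bw - sw + 1):
            # Sum of absolute pixel differences at this candidate offset.
            total = sum(
                abs(big[r + i][c + j] - small[i][j])
                for i in range(sh)
                for j in range(sw)
            )
            score = total / (sh * sw)
            if best is None or score < best[0]:
                best = (score, r, c)
    return best
```

The cost grows with the product of both image areas, so a real tool would need something smarter than this exhaustive scan.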

I didn’t think it would be easy. I was just curious if it could be done.

Thanks for the input.

I’ve been using Duplicate Image Finder from Running Man Software (http://www.rmsft.com) for a while now, and it works very well. It will find matches of different file sizes, resolutions, similar but not exact duplicates, etc. It also keeps a database of analyzed images for future comparisons. You can also specify multiple paths to have it search for duplicates. All in all, it’s really a pretty full-featured program. You can download a 14-day trial version and if you like it you can buy the registration key for $29.

SC

There’s also imgSeek, which is available free for Windows and Linux. It’s still in beta, but it works quite well. You can search either by comparing to a file, or by drawing a sketch of the image you’re looking for on screen (the former works better than the latter, it has to be said).

Of course, if you have a spare $100K lying around, you could buy MediaBin, a full-blown enterprise Digital Asset Management system that includes among its many features one called Content-Based Image Recognition (CBIR), a patented technology that allows you to select an image and find the other images in the database that are most visually similar to it. This is done by building a set of mathematical descriptors of the visual content along different axes (distribution of color, orientation of edge detail, etc.) for each image as it’s added to the database and then comparing that to the other images when such a search is performed. It’s based on work originally done for the DOD on target recognition.

I’d include the standard disclaimer about having no connection to the product, except that I’m an employee of the company that produces MediaBin, and spend most of my time traveling around the country implementing the product at major corporations.

Doing this sort of thing is conceptually fairly easy, but the quality of the results is obviously highly dependent on the quality of the algorithms that build the descriptors – ours are the result of a dozen or more years of work by several math Ph.D.s and image scientists.
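For anyone curious what the conceptually easy version looks like, here is a toy sketch in Python of the descriptor approach using only one axis, distribution of color. This is not MediaBin's algorithm or anything like it. The coarse RGB histogram, the histogram-intersection score, and all the function names are assumptions chosen for illustration.

```python
def color_histogram(pixels, bins=4):
    """Normalized RGB histogram; `pixels` is an iterable of (r, g, b) in 0-255."""
    counts = [0] * (bins ** 3)
    n = 0
    for r, g, b in pixels:
        # Map each channel to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        counts[ri * bins * bins + gi * bins + bi] += 1
        n += 1
    if n == 0:
        raise ValueError("image has no pixels")
    return [c / n for c in counts]


def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def most_similar(query, database):
    """Rank (name, descriptor) pairs in `database` by similarity to `query`."""
    return sorted(
        database.items(),
        key=lambda item: similarity(query, item[1]),
        reverse=True,
    )
```

The descriptor is computed once per image when it is added, so a search only compares short lists of numbers rather than full images. A color histogram alone ignores layout entirely, which is one reason the quality of the descriptors matters so much.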

If you have Photoshop, you can use the “Calculations…” item in the “Image” menu to compare two images: open both, select “Calculations…”, select one image as “Source 1” and the other as “Source 2”, select the gray channel for each, select “Difference” in the “Blending” section, and specify “New Document” in the “Result” section. You’ll get a new file that will be black wherever the two images are the same, and varying shades of gray where they’re different. If the images are different sizes, however, you’ll have to resample one of them to match the size of the other first.
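The same "Difference" comparison is easy to sketch outside Photoshop. This Python version assumes two grayscale images of the same size, given as lists of rows of 0-255 values; it does not reproduce Photoshop's implementation. The fraction_changed helper and its tolerance parameter are additions for illustration.

```python
def difference(a, b):
    """Per-pixel absolute difference: 0 (black) where the images agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def fraction_changed(diff, tolerance=0):
    """Fraction of pixels whose difference exceeds `tolerance`."""
    values = [p for row in diff for p in row]
    if not values:
        raise ValueError("difference image has no pixels")
    return sum(1 for p in values if p > tolerance) / len(values)
```

Reducing the difference image to a single number makes it possible to compare many pairs without looking at each result by eye.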

I downloaded this and started fiddling around with it and I must say I’m impressed. I’m definitely going to buy it.

Thanks for the help.

I’ve been using Unique Filer for years, and have always been impressed with it. I haven’t tried any of the others mentioned so far, though, so I couldn’t say how it does comparatively.

After you get all your cats organized, could you please post links to a couple of your favorites? I bet many of us would like to see some nicely organized cat photos. :smiley: