Recovering from a hard drive failure

Executive summary: This is actually a pretty simple question; there’s just a lot of context I felt obliged to include.

The computer I use to store digitized music has presented me with a unique upgrade opportunity, which is another way of saying one of its hard drives is failing.

On the upside, I was already taking steps to retire this device, and I have the replacement hardware ready to go. I also have backups of the hard drive, though they’re a little older than they should be, so I’ll wind up having to replace some lost data the hard way (re-ripping CDs, downloading some stuff, etc.). Last but not least in the win column, I ran daily directory listings of the biggest folder, so I’ll know exactly what’s missing when the time comes.

I ran testdisk and photorec to create an image and to recover files. So far so good.

Here’s where the plot thickens: I now have some 80K files that have photorec’s arbitrary filenames attached to them. I’m trying to figure out the simplest way to compare the photorec output with my restored backup, so I don’t have to listen to every one of those 80K files to identify which ones I can toss. The methods that have occurred to me are using md5sum in bulk or perhaps rsync. Would either of those work?

Thanks in advance.

FWIW, the original device was a PC running Windows XP and the new device will be a Raspberry Pi running (probably) LibreELEC/Kodi.

So, you want to scan for duplicate files in a directory tree under Linux. There are a few relevant utilities; try fdupes, perhaps.

Computing an md5sum for every file, then sorting the results to find matching checksums, will work too.
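In case it helps, here is a rough sketch of that checksum approach in Python. It assumes the restored backup and the photorec output live in two separate directory trees; the function names are made up for illustration, and it only catches files that are byte-for-byte identical, so a recovered file with extra or missing bytes at the end would still show up as "not in backup".

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map each digest to the list of files under root that have that content."""
    index = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            index.setdefault(file_digest(path), []).append(path)
    return index


def split_recovered(backup_root, recovered_root):
    """Split recovered files into (already in backup, not found in backup)."""
    backup_index = index_tree(backup_root)
    already, unique = [], []
    for path in sorted(Path(recovered_root).rglob("*")):
        if path.is_file():
            if file_digest(path) in backup_index:
                already.append(path)
            else:
                unique.append(path)
    return already, unique
```

Calling split_recovered() with the backup directory and the photorec output directory returns two lists: the first holds recovered files that can probably be tossed, the second holds the ones still worth a closer look.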

Maybe dupeGuru

So, PhotoRec has recovered a presumably intact file, but with no directory info you don’t know the file name?

If these files are, say, MP3s, you can examine the structure with some sort of binary editor, or simply give the file an .mp3 extension and see if it plays.

You won’t have the original file name, but the MP3 metadata (the ID3 tags), if the file has any, will tell you what the track is.
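As a rough illustration of reading that metadata without extra software, here is a small Python sketch that pulls the old-style ID3v1 tag, which sits in the last 128 bytes of a file and starts with the marker "TAG". The function name is made up, and this is only a partial check: many MP3s carry only the newer ID3v2 tag at the start of the file, which this does not read, so a proper tagging library would be the better tool for those.

```python
def read_id3v1(path):
    """Return title/artist/album/year from an ID3v1 tag, or None if there is none."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)          # jump to end of file to learn its size
        if handle.tell() < 128:
            return None            # too small to hold an ID3v1 tag
        handle.seek(-128, 2)       # the tag occupies the final 128 bytes
        block = handle.read(128)

    if block[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width, padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),
        "artist": text(block[33:63]),
        "album": text(block[63:93]),
        "year": text(block[93:97]),
    }
```

Running this over the recovered files and renaming or sorting by the returned title and artist would at least narrow down which ones need listening to.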

Thanks for these suggestions, everyone! I had not heard of fdupes or dupeGuru before, and they both seem like good paths to success for me.