After a decade of resistance, I’m finally starting to burn my CD library to my computer with Windows Media Player. After about a dozen or so random choices, covering years from 1962 to 2006, WMP has so far selected the correct artist, album, and cover art for every CD I’ve ripped. (Obviously the earlier years were published on CD starting about 1989, long before Media Player was a gleam in Bill Gates’ eye.)
Almost every published CD has an identification code that Media Player can read, and then it looks it up in an online database. Pretty nifty system, really.
The identification is based on some characteristics of the data on the disc. All that information (title, album art, etc.) isn’t actually on the disc, but there are some things you can look at on the disc that can be combined to make a nearly unique identifier. I don’t know exactly what factors WMP uses for its identification, but one old method is the “CDDB ID”, which basically calculates a number from the number of tracks, where each track starts, and the total playing time of the disc (an explanation of how to calculate it is here). This number is used to look up the info in a big database. This would result in occasional mixups when two different albums ended up with the same track timings or there was otherwise a “hash collision” (a computer science term for two different inputs producing the same number), so I’d guess they throw some more info into the calculation to make collisions less likely these days.
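For the curious, here is a rough Python sketch of the classic CDDB disc ID calculation as I understand it. This is the old CDDB scheme, not necessarily what WMP does. The function names and the sample track offsets are made up for illustration; offsets are in CD frames (75 per second), the way a drive reports them from the table of contents.

```python
FRAMES_PER_SECOND = 75  # audio CDs address data in frames, 75 per second


def digit_sum(n: int) -> int:
    """Add up the decimal digits of n (e.g. 202 -> 2 + 0 + 2 = 4)."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def cddb_disc_id(track_offsets, leadout_offset):
    """Classic CDDB ID from track start offsets and the lead-out offset.

    track_offsets  -- start of each track in frames (hypothetical input)
    leadout_offset -- start of the lead-out (end of the disc) in frames
    """
    start_seconds = [off // FRAMES_PER_SECOND for off in track_offsets]
    # Checksum: digit sums of every track's start time in seconds.
    checksum = sum(digit_sum(s) for s in start_seconds)
    # Playing time: lead-out minus the start of the first track, in seconds.
    total_seconds = leadout_offset // FRAMES_PER_SECOND - start_seconds[0]
    disc_id = ((checksum % 0xFF) << 24) | (total_seconds << 8) | len(track_offsets)
    return f"{disc_id:08x}"


# Two made-up discs whose middle tracks start at different times...
disc_a = cddb_disc_id([150, 15150, 30150], 45150)
disc_b = cddb_disc_id([150, 15225, 30075], 45150)
print(disc_a, disc_b)  # ...yet both print 0c025803: a hash collision
```

Only 32 bits come out the other end, and one byte of that is just the track count, so it is easy to see how two different albums could land on the same ID.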
Huh. Really, just . . . huh. Coincidentally, I’m in the middle of burning the 2-disc you? me? us? by Richard Thompson, and the fact that I accidentally put disc two in first didn’t confuse it at all. Kudos to Microsoft, I guess.
If you think that’s impressive, the Media Center is going to blow you away. I had set it to record a TV show recently but it didn’t get started because of a driver conflict, so I browsed through the program guide to see if there would be a rerun of the episode coming up and when I found it I saw that it had already been automatically set to be recorded in lieu of the one that failed. It’s almost uncanny.
Nitpick: You rip CDs to your computer. Burning is going the other way.
In my experience, WMP gets them right more often than not, but it’s not that uncommon (especially with more obscure discs) for it to either misidentify a disc or not recognize it at all. In cases like this, there’s the option to enter the info manually, in which case I guess it gets sent to the Big Master Database for the next person to find?
I listen to a lot of live material (Dead, Parliament, etc.). I remember being pretty blown away several years ago when WinAmp started recognizing shows, down to date, set number and set list. I added a few, but the database caught up pretty quickly – even recognizing different versions/sources (e.g., audience, sound board) of the same show.
I have well over a thousand hours of material and haven’t seen it ask for information for a looonnnng time. Years. Seems tapers are uploading details along with the tunes, so there’s no wait.
If you’re downloading this music from the internet after other people upload it, it’s probably in a format that includes the metadata, which would explain why WMP knows what it is. That’s a different issue than inserting a CD and having WMP recognize it.
For some material, sure. But this all started back during the birth of Napster when shows would crawl across as a collection of MP3s, WAVs, or other lossless formats. There are MD5s and text files to accompany them, but the discs themselves are made up of a collection of burned WAVs. No metadata as far as I know, and when I burned discs I never added song titles or any other information during the process (I’d created an Access database to track the collection and print labels).
'Course, since I’m mainly going on seeing the CDDB window popping up in WinAmp and assumptions about what information individual WAV files contain, I could easily be mistaken.
In the early days of the internet and MP3, thousands (millions?) of volunteers uploaded artist names, song titles, albums, etc. with the appropriate hashes to create the CDDB. These people thought access to this database would remain free forever. It turns out that wasn’t true: it was later sold and renamed Gracenote. Companies like Apple and Microsoft pay to access Gracenote so that WMP, iTunes, etc. know what the songs are when you put in a CD. Later, a more open version of Gracenote was launched called freedb. I don’t think many commercial players use freedb, as it’s not as high quality as Gracenote.