Way back when MTV made its debut and I was first hearing about this phenomenon of visual accompaniment to the soundtrack of an audio recording, I somehow imagined that what they’d do is really represent the sounds visually. Of course that’s not what MTV, or subsequent music videos, were about.
That doesn’t make it a bad idea.
I’m not fascinated by soundwave representation; I deal with it all the time when editing sound files, but it’s just not evocative. What I’d rather see is…
• Color-code the blasts of color on screen according to what KEY they’re playing in. E.g., C major is blue, D major is red, E major is green, etc., for all 12 major keys, with variants for their associated minors. And additional variants for major 7ths, minor 7ths, 9ths, and other commonly recurring chords.
• When the chord being played doesn’t reconcile to a single major or minor or 7th or 9th or whateverish chord, superimpose the colors.
• Track the melody line in a similar color, as a dot superimposed on the color swatch, based on whichever note is the predominant melody note.
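To make the first two bullets concrete, here is a minimal sketch of the idea. The circle-of-fifths ordering, the specific hues, and the averaging rule are all my own arbitrary assumptions, not any kind of standard (the C-major-is-blue scheme above would just rotate the hues):

```python
import colorsys

# Hypothetical mapping: lay the 12 major keys around the circle of
# fifths and give each an evenly spaced hue.
FIFTHS = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "F"]

def key_color(key, minor=False):
    """Return an (r, g, b) tuple in 0-255 for a major or minor key."""
    hue = FIFTHS.index(key) / 12.0
    value = 0.6 if minor else 1.0  # darken the relative-minor variant
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(255 * c) for c in (r, g, b))

def superimposed(keys):
    """Blend the colors of ambiguous harmonies by averaging RGB."""
    colors = [key_color(k) for k in keys]
    return tuple(round(sum(ch) / len(colors)) for ch in zip(*colors))

print(key_color("C"))           # (255, 0, 0) under this arbitrary scheme
print(superimposed(["C", "G"]))
```

The averaging in `superimposed` is the crudest possible reading of "superimpose the colors"; a real visualizer would probably use alpha blending or layered shapes instead.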
Or something like that. Back to Factual Questions: Does there exist software that can represent audio in this fashion, loosely speaking?
In true SDMB fashion, I’m going to be slightly pedantic about this. There is a difference between the key of a song or section of a song and the chord that can be defined (in simple terms) by the sounds happening moment to moment. It sounds like you don’t want the key but rather dynamic chord/harmonic analysis.
It does?? (Opens VLC media player, fishes around in menus) … can’t find where to invoke such things. All the “video” items I’ve examined seem to pertain to playback of the video portion of movie files.
Coloring by key has several problems. First, you can’t define what key a single sound is: Any given note shows up in 7 out of 12 major keys and 7 out of 12 minor keys. When a song starts off with a single prolonged note, what key do you assign it to? Do you just wait until there are enough other notes to tell, and color it retroactively? Second, it’s fairly typical for an entire song to be in the same key, in which case you’d… just have the screen showing that same color for the entire song?
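The 7-out-of-12 point is easy to check mechanically. A quick pure-Python sketch, treating notes as the usual 12 pitch classes:

```python
# A pitch class occurs in exactly 7 of the 12 major scales, which is
# why one sustained note can't pin down a key by itself.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = {0, 2, 4, 5, 7, 9, 11}  # scale degrees of a major scale, in semitones

def major_keys_containing(note):
    """All major keys whose scale contains the given pitch class."""
    pc = NOTES.index(note)
    return [NOTES[tonic] for tonic in range(12)
            if (pc - tonic) % 12 in MAJOR_STEPS]

print(major_keys_containing("C"))  # 7 candidate keys for a lone C
```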
There’s also the issue that an image has more than just color; it also has two dimensions of position, where sound has zero positional dimensions. But on the other hand, human vision has lousy spectral resolution, with only three channels, while sound has, effectively, a full continuum of spectral resolution (or, for the sounds generally classed as “music”, at least as many channels as there are notes). There are ways to map the sort of information that sound has to the sort of information in an image, as you allude to with wave representations in sound editing, but it doesn’t “look the same” as the music, in the way that you’re trying for.
What comes to mind is a “color organ” like the ones sold in the 70s. There were several versions: some hooked up to your speaker terminals, others used a microphone. I think they were limited to 3 channels. There are kits available today, and I suppose someone with a little tech savvy could boost that capability.
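For what it’s worth, the three-channel idea is simple to reproduce in software. A rough sketch using the Goertzel algorithm to measure energy near three representative frequencies; the band centers here are my own arbitrary picks, not what any particular 70s unit used:

```python
import math

# A software take on the 3-channel color organ: measure the energy near
# three frequencies and use it to drive "lamp" brightness.
def goertzel_power(samples, freq, rate):
    """Approximate signal power at one frequency in a block of samples."""
    w = 2.0 * math.pi * freq / rate
    coeff = 2.0 * math.cos(w)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def color_organ_levels(samples, rate=44100, bands=(100.0, 1000.0, 5000.0)):
    """One brightness level per lamp (bass, mid, treble)."""
    return [goertzel_power(samples, f, rate) for f in bands]

# A pure 1 kHz tone should light the middle lamp the brightest.
rate = 44100
tone = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(1024)]
levels = color_organ_levels(tone, rate)
```

Real units used analog filters driving triacs, of course; this is just the same band-splitting idea in digital form.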
MilkDrop-style visualizations are just supposed to look cool, and run in real time, so it does not matter exactly what it is doing. Incidentally, an arbitrary piece of pop music may not even be in a well-defined key, strictly music-theoretically speaking (also, at some level that has to do with how the song sounds to you and how you interpret the harmony; it’s not all purely mathematical).
Now, I am not sure if the OP is interested in a psychedelic music visualizer or something in a DAW that would enhance the usual amplitude-envelope visualization (a graph of amplitude as a function of time) with some kind of polyphonic pitch detection. There is absolutely published code that grabs an FFT and tracks what “key” best matches the spectrum, or what chord is playing at any given moment [look at some of the MIREX stuff], but I am not sure if any of it is already integrated into a DAW plugin with the desired color coding.
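As a sketch of what that template-matching approach can look like: correlate a 12-bin chroma vector (which a real system would derive from the FFT) against all 12 rotations of a key profile. The weights below are the well-known Krumhansl-Kessler major-key profile; everything else here is simplified, and this is not any specific MIREX entry:

```python
# Template-based key matching: score a chroma vector against each
# rotation of a major-key profile and pick the best tonic.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]

def best_major_key(chroma):
    """Pick the tonic whose rotated profile best matches the chroma."""
    def score(tonic):
        return sum(chroma[(tonic + i) % 12] * MAJOR_PROFILE[i]
                   for i in range(12))
    return NOTE_NAMES[max(range(12), key=score)]

# A bare C-major triad (C, E, G) matches the C-major template best.
chroma = [0.0] * 12
for pc in (0, 4, 7):
    chroma[pc] = 1.0
print(best_major_key(chroma))  # "C"
```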
For a visual representation of the music that I find remarkable and useful, this is my favourite. Considering the work visualised, it is quite something. This isn’t going to happen automatically.
I’m realizing I did not have a specific algorithm in mind. At any given moment the sound at that point could be considered a “chord” which could be analyzed and recognized, but there’s a lot of arbitrary decision involved when someone does that, the more so as you increase the number of “incidental” tones that are considered to have been “added to” the fundamental “chord”.
The actual visualizations in the harmonic coloring videos appear to be taking the color organ idea to the extreme, with channels a half-tone wide that also capture attack/sustain/decay. I don’t believe there is any change in the colorization of a particular note based on key; the magic is in the assignment of colors to each note, so that the overall color palette of a section of music changes with the key. The point of the visualization is to assist in the identification of key changes, not to actually identify the key.
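That “magic assignment” can be illustrated with a toy calculation. If each pitch class gets a hue from its position on the circle of fifths (an assumption on my part, not necessarily what those videos do), the seven notes of any major key land on a contiguous arc of hues, so a key change visibly shifts the palette without recoloring any individual note:

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def fifths_position(note):
    """Position of a pitch class on the circle of fifths (C = 0)."""
    return (NOTES.index(note) * 7) % 12  # 7 semitones = one fifth up

def note_hue(note):
    """Spread hues evenly around the circle of fifths."""
    return fifths_position(note) / 12.0

c_major = ["C", "D", "E", "F", "G", "A", "B"]
print(sorted(fifths_position(n) for n in c_major))
# positions 11, 0, 1, 2, 3, 4, 5: one contiguous arc on the circle
```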
I think that generating such a visualization in real time is certainly doable, although strongly inharmonic instruments could cause problems.
Here’s a visualization of The Great Curve, from Talking Heads’ Remain In Light. Although it is a deep cut by the Heads’ standards, aficionados of the band seem to rate it very highly.
It obviously takes as its starting point the pedestrian audio-track visual model, but it does a great job of explicating the complexity of what is basically just a one-chord jam with layered vocals (and wild guitar solos courtesy of Adrian Belew).
To radically shift gears, I can think of two classical (well, 20th-century) composers, Scriabin and Messiaen, who worked toward a theory of color visualization, though neither codified it into anything like a unified theory.
There’s nothing off the shelf I’m aware of - Melodyne’s DNA editing is closest, but that GUI is based on the standard dynamic waveform display, just using multiple instances spread across a piano roll to indicate the pitch in polyphonic program material.
It’s an interesting idea that happens to dovetail with a musical/sonic feature extraction project I’m working on. So, I played around with the concepts a bit yesterday and came up with a method and rough prototype to extract labeled chords (confidence varies) and beats, so the chord “blobs” could (not are, could) be animated along with the audio track.
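In the same spirit, here is a toy version of what a chord-labeling step can look like: score a chroma frame against binary major/minor triad templates at every root. This sketch is mine rather than the prototype described above, and it leaves out all the hard parts (extracting chroma from audio, beat tracking, confidence scoring):

```python
# Score a 12-bin chroma frame against binary triad templates.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
TEMPLATES = {"maj": (0, 4, 7), "min": (0, 3, 7)}

def label_chord(chroma):
    """Return the best-scoring triad label, e.g. 'Amin'."""
    best_score, best_label = None, None
    for root in range(12):
        for quality, intervals in TEMPLATES.items():
            s = sum(chroma[(root + i) % 12] for i in intervals)
            if best_score is None or s > best_score:
                best_score, best_label = s, f"{NOTE_NAMES[root]}{quality}"
    return best_label

# A frame with energy on A, C, and E should come out as A minor.
frame = [0.0] * 12
for pc in (9, 0, 4):
    frame[pc] = 1.0
print(label_chord(frame))  # "Amin"
```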
Next would be to figure out color-coding for the diatonic behavior, which is where things can get tricky due to the evolving nature of Western Common Practice. After that, melodic extraction and analysis… but those features are beyond the scope of what I’m playing with.
Factual Answer: not currently in existence, but could be done.