How do DJs, remixers, etc. get isolated vocal/instrument tracks?

It seems to me that usually when you hear one of these “dance mix” or otherwise remixed versions of a popular song, created by some DJ or electronic music producer, they will either have some singer redo the vocals or use a sample from the original that you can tell is just a clip of the publicly released track. It’s as if I had loaded the song into some cheap sound editing program on my computer and clipped out a key part of the chorus or something.

Occasionally, however, you’ll hear one that really sounds like it uses vocals or some instrumental track from the original song, but without all the other layers. I remember a couple of years ago I did some brief Googling on the subject and learned that it was impossible to strip out individual tracks from a song with software. So I figured that maybe if you’re a big-name enough DJ you can buy individual layers from the record company for the purpose of making a remix, and maybe that’s how they do it.

Here’s an obscure example, though, of someone I assume is not a big-name DJ. Some guy has created a “chiptune” version of Taylor Swift’s “Love Story” which sounds to me like it uses Taylor Swift’s original vocals. It really sounds like Taylor Swift’s voice, yet I don’t hear any guitars, drums, etc. from the original song. How could he have done this without paying a hefty sum of money to Swift’s record label?

Usually, those remixes are commissioned by the artist/label themselves, so they provide isolated vocal tracks to the remixer for the job. A lot of artists also hold remix contests on the web and provide the vocal tracks for their fans to use. This site compiles remix contests, for instance.

Alternatively, on very rare occasions you can get them through other means. Depeche Mode released remastered versions of all their albums a few years back; each had 5.1 surround versions of the tracks, and on some of those the lead vocals were isolated to the center channel. Fans were ripping just the center channel and using that for remix purposes.
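
For what it’s worth, pulling that center channel is trivial once the 5.1 mix has been extracted to a multichannel file. Here’s a minimal sketch in Python, assuming a six-channel WAV in SMPTE channel order; the filenames and channel order are my assumptions, not anything specific to those Depeche Mode discs:

```python
# Pull the front-center channel out of a 6-channel WAV ripped from a 5.1 release.
# Assumes SMPTE channel order (FL, FR, FC, LFE, BL, BR); verify against your rip.
import soundfile as sf

audio, rate = sf.read("surround_rip.wav")   # shape: (samples, 6)
center = audio[:, 2]                        # front center = index 2 in SMPTE order
sf.write("center_only.wav", center, rate)
```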

Two thoughts on this, and they are complete WAGs; I look forward to hearing the real answers. My first thought is that I’ve always wondered if it’s possible to pick the tracks apart Photoshop-style. Sure, you can, as you mentioned, run the song through some filters and get the vocals out the other end, but you’ll still hear other instruments in the background. Similarly, you can take a picture and spend 10 seconds lopping out the background, but you’re still going to leave a good bit of it around the main subject. Now, with Photoshop, you can tediously go around the edge of your main subject, pixel by pixel, and decide what is background vs. foreground. Can you do that with audio? Is it possible to go bit by bit through the audio track and make a decision like that? I’m thinking not, but maybe?

My other thought is: what happens if you run it through autotune? I assume a song like that has been autotuned, which means, in theory, putting the vocal track through autotune again wouldn’t do anything, right? So if you stripped it down as much as you could, maybe did some other things to bring out the vocals as much as possible, and put the result through autotune, could it conceivably get rid of everything else?

ETA: FTR, I’m guessing autotune would average the vocals with the other sounds and end up producing something really wonky anyway. Like I said, just a WAG.

Producer/Engineer here…

3 main ways to get remix source parts:

  1. Official source

  2. Studio Trickeration

  3. Leaks

As a mix engineer, when you deliver product to the label (or the artist or mastering engineer, depending on the job) you generate a number of mixes for delivery…how many and which type vary by artist and label. The common ones are: master mix, vox up, vox down, radio mix, club mix, and stems mix. Sometimes you will also see drums up/down or no-FX mixes.

Master mix: the main product
Vox up: vocals up 1-2 dB
Vox down: ditto but opposite
Radio mix: sometimes shortened; profanity muted or masked with a 1 kHz tone, QC’d for extra mono compatibility and for quality after radio limiting.
Club mix: boosted bass/kick elements…often w/ addt’l LFE or sub-bass tones, mixed for impact and low-mid frequency response enhancement.
Stems: you often mix to ‘stems’ and are often asked to deliver them as well, especially when passing right off to an ME. Stems are submixes…drums, vox, synths, gtrs, pads, etc…routed together for ease of mixing and the ability to apply FX…especially compression and bus equalization. Sometimes stems are printed to tape to get that nice analog tape compression/saturation and then mixed through an analog console back into the DAW (editing software).

So there’s a lot out there to be leaked or released…and with studio tricks you can often isolate parts easily when you have multiple versions…usually a combination of phase/polarity inversion coupled with filter/EQ limiting and some dynamic processors.
You can also recreate a lot of parts pretty easily…synths, drums, FX, guitars, bass. In a remix, the broad strokes of the part need to be right, but the overdubs and new additions mask things enough that it sounds the same to Joe YouTuber.
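
To make the multiple-versions trick concrete: if you have the full mix and, say, an official instrumental cut from the same master, a polarity flip and a sum leaves only what differs between them, which is usually the vocal. A bare-bones sketch with placeholder filenames; real-world sources generally need sample-accurate alignment and level matching before this works at all:

```python
# Phase/polarity-inversion trick between two versions of the same master:
# whatever differs between them (often the vocal) survives the sum.
# Assumes both files share a sample rate and are sample-aligned and level-matched.
import soundfile as sf

full, rate = sf.read("full_mix.wav")        # the released stereo mix
inst, _ = sf.read("instrumental.wav")       # e.g. official instrumental / TV mix

n = min(len(full), len(inst))
difference = full[:n] - inst[:n]            # flip the polarity of one and sum

sf.write("difference_vocal.wav", difference, rate)
```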

Vocals are different, but are almost always panned dead center. Import the song, split the stereo file to double mono, flip phase on one side. Most panned sounds disappear or fade way back…except for those dead center. Double mono allows a lot of the time-based FX to stay in place…verbs/delays…which contribute a lot to the timbre of the performance as released.

Layer your new parts over, using the existing ghost parts as guides/ref points.
Easy as Pie!
Also, some artists/labels offer stems/tracks for mixing/remixing…Peter Gabriel and his label offer a ton of tunes by him and other artists on the label.

Many artists release acapellas (treated or untreated vocal stems) specifically for remixes, esp. in urban/hip-hop.

Last…sites like ccMixter offer remix sources, both beats/pads and such, plus homemade acapellas and some official releases.

Generating acapellas is legal, and while it’s an art, it’s not some arcane magic…they’re legal to trade and share, as they’re considered legitimate derivative works for the most part; it’s only commercial release without licensing the source that gets you into the hot water of IP rights…

Recording engineer speaking. No, this is not possible. The reason is that sound, unlike a picture, only has one amplitude at any given moment. That you’re able to discern a voice, a piano, and a guitar is one of the amazing features of your brain, but a computer cannot do it (yet). The piano and the voice might use the same frequencies at the same time, so there’s no way to know which is which. For about two years now there has been a version of a program called Melodyne that can actually deal with polyphonic material (such as a single guitar or a piano), which was pretty revolutionary, but even that won’t work on a complete mix. It is, however, not completely unthinkable that with a couple of years of research something like that could become feasible.
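
A toy illustration of that point: a mix is a single stream of amplitude values, so completely different source combinations can add up to the exact same waveform, and nothing in the numbers tells you which split is the “real” one. The values here are arbitrary, just to show the idea:

```python
import numpy as np

t = np.linspace(0, 1, 44100, endpoint=False)

# Two completely different "voice + piano" splits...
voice_a = np.sin(2 * np.pi * 440 * t)
piano_a = 0.5 * np.sin(2 * np.pi * 880 * t)

voice_b = 0.8 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
piano_b = 0.2 * np.sin(2 * np.pi * 440 * t)

# ...that produce the identical mixed waveform:
print(np.allclose(voice_a + piano_a, voice_b + piano_b))   # True
```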

Also, autotune isn’t something you “put a track through” like a meat grinder; it’s a tool that you use to iron out certain imperfections in a given track, much as you would use Photoshop to remove wrinkles from a face, which doesn’t mean you can’t “Photoshop” it again. Autotune only deals with a monophonic, dry signal (meaning only one note at a time) and is utterly lost with anything else.

Begging your pardon, but wouldn’t that achieve the exact opposite, meaning getting rid of the center?

Good catch! That’s how you isolate the panned elements… My mistake!
If you duplicate the stereo track, and each copy carries one side of the stereo image (i.e. one is all L, the other all R)…that’s a little different. Again you’re cancelling the center, but because it’s two halves (one flipped) the cancellation is very different.

So you’ve got an isolated track with no center once you bounce. Combine THAT, flipped, with a regular image and then you’ll have a clean center. You can filter/EQ out some other center elements on one side…such as kick and drums…at least to minimize their cancellation.
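
The no-center bounce itself is simple arithmetic; here’s a minimal sketch, assuming a plain stereo WAV (getting a usable center back out is where the filtering and the Filler trick below come in):

```python
# The "no center" bounce: split to double mono, flip polarity on one side, sum.
# Anything panned dead center is identical in both channels and cancels out.
import soundfile as sf

audio, rate = sf.read("song.wav")           # stereo: shape (samples, 2)
left, right = audio[:, 0], audio[:, 1]

no_center = 0.5 * (left - right)            # center content cancels
sf.write("no_center.wav", no_center, rate)
```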

You can then process those filtered-out sections (sidechained, kind of like a de-esser but with a variable pass/notch/shelf, what have you, into an aux) and hit the resultant audio with some minor sample delay (3-10 ms is a good start) and some harmonic treatment…throw it through something that will alter the waveform…exciter, tube or tape sim (or the real thing), anything to introduce new sample-level info and timing without changing the tone much.

It works pretty well but it takes some practice and tinkering to get solid results with any speed or regularity. I’m slammed with work now, but if this thread is still kicking in a week or so I’ll throw some examples up of the various things you can do.

I realize upon re-reading that that’s not very clear.

1. Create and bounce a No Center track…
2. Take your Source track and add a unity send to an aux.
3. On the aux, filter/EQ/carve out the tones you want to keep…run it through some harmonic processing and microdelay. Bounce that as your Filler (rough sketch of this step below).
4. Match and flip Source and No Center.
5. Mix in your Filler, which will have most of the center tones/timbre you cancelled between the other two. The processing you did will provide just enough difference to not cancel.
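
Here’s a rough Python sketch of just the step-3 Filler processing, assuming a mono bounce of the source send; the band edges, the 5 ms delay, and the drive amount are placeholder starting points, and the soft clip stands in for whatever exciter/tube/tape box you’d actually reach for:

```python
# Step 3: carve the band you want to keep, micro-delay it, and add harmonic
# content so it no longer cancels perfectly against the original.
import numpy as np
import soundfile as sf
from scipy.signal import butter, sosfilt

audio, rate = sf.read("source_send_mono.wav")   # step 2: the unity send, bounced mono

# Carve out the region you want to keep (placeholder edges: ~200 Hz to 6 kHz)
sos = butter(4, [200, 6000], btype="bandpass", fs=rate, output="sos")
filler = sosfilt(sos, audio)

# Micro delay: 3-10 ms is the suggested starting range; 5 ms here
pad = np.zeros(int(0.005 * rate))
filler = np.concatenate([pad, filler])[: len(audio)]

# Light harmonic treatment (stand-in for a tube/tape sim or exciter):
# soft clipping adds new sample-level info without changing the tone much
drive = 2.0
filler = np.tanh(drive * filler) / np.tanh(drive)

sf.write("filler.wav", filler, rate)
```

Then match and flip against the No Center bounce as in step 4 and blend this Filler in to taste.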

It can be finicky…and even a few ms difference can really change things for better or worse.

It’s like recording Haas guitars…but more annoying and subject to a lot more issues out of your control!