Sometimes in the movies or TV the cops have a recording of a call for help or a ransom demand, and they take the recording to a sound engineer to electronically separate the sounds to identify the background, thus pinpointing the location of the call.
I don’t see how this is possible, given that over a phone line there is only one “sound track”.
Other than isolating frequencies, what do they do…or is it just Hollywood?
Well, at the low-tech end you can filter based on frequency. For example, if you have a recording of a shrill crying child over a low, rhythmic mechanical hum, you can use a low-pass filter (one that keeps the low frequencies and blocks the high ones) to eliminate the sound of the child. That may make the remaining sound recognizable to a human listener.
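To make the low-pass idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the 8000 Hz sample rate, the 60 Hz "hum" and 3000 Hz "cry" stand-in tones, the 200 Hz cutoff, and the helper names tone, low_pass and rms. A real recording would be far messier than two pure tones.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed, roughly telephone quality


def tone(freq_hz, seconds, amplitude=1.0, rate=SAMPLE_RATE):
    """Generate a pure sine tone as a list of samples."""
    n = int(seconds * rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def low_pass(samples, cutoff_hz, rate=SAMPLE_RATE, passes=4):
    """Apply a simple one-pole (RC-style) low-pass filter several times.

    Each pass lets frequencies well below cutoff_hz through mostly intact
    and attenuates higher ones; repeating it steepens the roll-off.
    """
    dt = 1.0 / rate
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out = list(samples)
    for _ in range(passes):
        y = 0.0
        for i, x in enumerate(out):
            y += alpha * (x - y)  # move a fraction of the way toward the input
            out[i] = y
    return out


def rms(samples):
    """Root-mean-square level, a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


hum = tone(60, 1.0)    # stand-in for the low mechanical hum
cry = tone(3000, 1.0)  # stand-in for the shrill child
mixed = [h + c for h, c in zip(hum, cry)]

filtered = low_pass(mixed, cutoff_hz=200)

# The filter is linear, so we can check each component separately.
hum_kept = rms(low_pass(hum, 200)) / rms(hum)
cry_kept = rms(low_pass(cry, 200)) / rms(cry)
print(f"hum kept: {hum_kept:.1%}, cry kept: {cry_kept:.4%}")
```

In this toy case most of the hum survives while the high tone is cut to a tiny fraction of its original level, which is the effect described above.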
As another example, consider an ordinary radio. There’s only one “soundtrack” of electromagnetic energy tickling the antenna. Yet somehow the radio can sort out all the noise from outer space, the sun, power lines, and a thousand man-made transmitters of various types to “recognize” the signal of your favorite 101.5FM Easy-Listnin’ Country Classics.
Picking a range of frequencies out of a signal is late 1800s technology. IOW, a well-solved problem.
At a higher tech level, you can do Fourier analysis of the sounds. This produces a map of frequency versus intensity. By comparing this map to maps of other known sounds you could identify what the sound (and its source) is. Or at least rule out a lot of possibilities.
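Here is a rough sketch of that compare-the-maps idea, again in plain Python. The "known sounds" are made-up stand-ins (a hum with harmonics at 60/120/180 Hz and a whistle at 2000/3000 Hz), and the names spectrum, similarity and make_sound are hypothetical helpers, not anything a forensic lab is known to use. It uses a slow textbook DFT so it needs no extra libraries; real tools would use an FFT.

```python
import cmath
import math

RATE = 8000  # Hz; assumed sample rate
N = 400      # samples analysed, so each frequency bin is RATE / N = 20 Hz wide


def spectrum(samples):
    """Magnitude of the discrete Fourier transform for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def similarity(a, b):
    """Cosine similarity between two spectra: 1.0 = same shape, 0.0 = nothing shared."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_sound(freqs, amplitude=1.0):
    """Sum of sine tones at the given frequencies."""
    return [
        amplitude * sum(math.sin(2 * math.pi * f * t / RATE) for f in freqs)
        for t in range(N)
    ]


# Library of known "maps" (hypothetical reference sounds).
known = {
    "machine hum": spectrum(make_sound([60, 120, 180])),
    "whistle": spectrum(make_sound([2000, 3000])),
}

# The unknown background: the hum plus a weaker unrelated 1000 Hz tone.
unknown = [h + e for h, e in zip(make_sound([60, 120, 180]), make_sound([1000], 0.3))]
unknown_map = spectrum(unknown)

scores = {name: similarity(unknown_map, ref) for name, ref in known.items()}
best = max(scores, key=scores.get)
print(scores, "-> best match:", best)
```

The unknown sound scores close to 1 against the hum and close to 0 against the whistle, so at the very least the whistle can be ruled out.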
Modern ultra-tech digital signal processing (DSP) can do amazing things to detect patterns in noisy signals and mathematically “filter” them into their constituent parts. You could take a monaural recording of an orchestra and separate it into tracks for each instrument type. That takes really expensive gear, but it can be done.
With nothing more than a simple PC and a sound editing program, you can isolate in time as well: the gaps between bits of speech often contain nothing but the background sounds. Try working with a sound editor sometime and see what can be done. It takes some skill, but it’s not all Hollywood.
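The isolate-in-time trick can be sketched the same way: chop the recording into short frames, measure how loud each one is, and keep the stretches where nobody is talking. This is only an illustration under assumed numbers (8000 Hz sample rate, 50 ms frames, a 0.3 loudness threshold, and a synthetic "recording" with a quiet hum and two loud bursts standing in for speech); frame_rms and quiet_spans are hypothetical helper names.

```python
import math

RATE = 8000  # Hz; assumed sample rate
FRAME = 400  # samples per frame, i.e. 50 ms


def frame_rms(samples, frame=FRAME):
    """RMS level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        levels.append(math.sqrt(sum(s * s for s in chunk) / frame))
    return levels


def quiet_spans(samples, threshold, frame=FRAME, rate=RATE):
    """Return (start_sec, end_sec) spans where the frame level stays below threshold."""
    spans = []
    start = None
    levels = frame_rms(samples, frame)
    for i, level in enumerate(levels):
        if level < threshold and start is None:
            start = i  # a quiet stretch begins
        elif level >= threshold and start is not None:
            spans.append((start * frame / rate, i * frame / rate))
            start = None
    if start is not None:  # recording ended while still quiet
        spans.append((start * frame / rate, len(levels) * frame / rate))
    return spans


def sample_at(t):
    """Synthetic recording: a faint 100 Hz hum throughout, loud 'speech' in two bursts."""
    s = 0.1 * math.sin(2 * math.pi * 100 * t)
    if 0.5 <= t < 1.0 or 2.0 <= t < 2.5:
        s += math.sin(2 * math.pi * 300 * t)
    return s


recording = [sample_at(i / RATE) for i in range(3 * RATE)]
gaps = quiet_spans(recording, threshold=0.3)
print("background-only spans (seconds):", gaps)
```

The spans it reports are the pieces you would cut out and listen to (or run through the frequency tricks above) to study the background on its own.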