How can a paper diaphragm (i.e. stereo speaker) mimic virtually every sound?

I’m pretty sure that no matter how one plays it, a violin will always only sound like a violin. Same for a drum. And a trumpet, and so on.

So how is it that the paper cone of a stereo speaker can be “played” to sound like all these instruments – and virtually everything else in the audio spectrum, including voices and instruments playing together*? How does it do it? Why doesn’t a paper cone sound only like a paper cone, the way a flute sounds only like a flute?

This question has bugged me all my life.

* Yes, I do realize that speakers have their limits in reproducing sounds, especially extra-high and extra-low ones, but you have to admit that they are pretty miraculous in what they can mimic. I also realize that, these days, the “paper” in speaker cones has probably been phased out in favor of membranes made from more durable plastics and such. But the essential design remains the same regardless of the materials, so my question stands.

The paper diaphragm is designed to reproduce an entire spectrum of frequencies evenly, directly through vibration. Musical instruments are designed to produce a very small (sometimes discrete) subset of frequencies unique to their construction, acoustics, materials, etc.

Take the violin, for example. The string is a fixed length and can only produce standing waves of certain frequencies. The peculiarities of the wooden construction alter those particular frequencies in certain specific ways. The string is not driven at some externally chosen frequency; it vibrates at its own natural frequencies under the action of the bow, so the sound is intrinsic to the violin.

Take the speaker, by contrast. The paper cone is designed to vibrate the air around it. It is directly vibrated at the specific frequency that is being reproduced. Its ability to reproduce frequencies is a fairly flat graph (compared to that of musical instruments) that tapers off at both ends.

Hope this makes sense.

Because musical instruments are essentially resonant systems: they produce their sounds from their natural vibrations and harmonics, and the exact tonal quality is determined by the instrument’s air cavities and other filtering mechanisms, e.g., a guitar’s hollow body. The speaker cone, on the other hand, is specifically designed to have as little natural resonance as possible. It is driven, or forced, to follow the shape of the waveform fed to it.

I think it’s a good question. There has to be more to producing a sound than vibrating a cone to match a frequency.

What exactly are the components of a sound? I know of loudness (there’s probably a technical term) and frequency (roughly 20 Hz–20 kHz for human hearing). But there have to be some other components. A cow bell resonating at 10 kHz sounds different than a flute note at 10 kHz.

So what are these other components and how are they represented by vibrating a paper cone?

And for that matter, how are they given a numerical value that can be transmitted as a digital signal?

The third component is known as timbre, and it is essentially the shape of the sound wave as it repeats. It is traditionally understood to be the sum of all component sinusoidal frequencies as determined by a Fourier analysis of the sound wave, although there is some incredible research from Carnegie Mellon, published just this last month, suggesting that our brains do not perceive sound in this fashion, but as a set of “spikes” that represent a combination of the volume and frequency of various component sounds.
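On the “numerical value” part of the question, here is a minimal sketch, assuming plain PCM with CD-style numbers (44,100 samples per second, 16-bit integers). The `bowed_note` waveform is made up for illustration; the point is only that loudness, pitch and timbre all end up encoded in one list of numbers, namely the height of the wave at each instant.

```python
import math

SAMPLE_RATE = 44100                   # samples per second (CD-style value, assumed)
BIT_DEPTH = 16                        # bits per sample (CD-style value, assumed)
MAX_INT = 2 ** (BIT_DEPTH - 1) - 1    # 32767, the largest sample value

def bowed_note(t):
    """Made-up waveform: a 440 Hz fundamental plus two weaker overtones."""
    return (0.5 * math.sin(2 * math.pi * 440 * t)
            + 0.25 * math.sin(2 * math.pi * 880 * t)
            + 0.125 * math.sin(2 * math.pi * 1320 * t))

def to_pcm(wave_fn, milliseconds):
    """Measure the wave at regular instants and round each height to an integer."""
    n_samples = SAMPLE_RATE * milliseconds // 1000
    samples = []
    for i in range(n_samples):
        height = wave_fn(i / SAMPLE_RATE)
        height = max(-1.0, min(1.0, height))   # clip to the representable range
        samples.append(round(height * MAX_INT))
    return samples

pcm = to_pcm(bowed_note, 10)
print(len(pcm), "samples; first few:", pcm[:5])
```

Feed those integers back out at the same rate, turn them into a voltage, and the cone is pushed along the same shape.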

stuyguy, I just wanna commend you on asking an excellent question. While I think I understand the answer now (thanks, groman!), this qualifies as the kind of question I wish I’d asked. And I’d love to see a Cecil answer on it.


There is no doubt that the science of paper cones has produced some good speakers, but any old paper cone can be made to produce sound just by attaching a winding above a magnetic field and running it to the output of a stereo. Come to think of it, if you attached the same thing to a violin, you could get the violin to sound a lot like a drum kit. Close enough anyway.

The distinctive sound of a violin depends upon the other harmonics that resonate with the note being produced by the strings. If you produce those harmonics with your stereo wires, you’re going to get violin noises. If you produce the drum noise with the stereo, you can attach it to any sounding surface and you will get the distinct impression of drums–with not quite the fidelity of modern paper cone science.

The way I think about it: it may only be paper, but it’s connected to a very powerful electromagnet that forces it to move in any desired pattern. It can move to mimic the vibration of a violin’s sounding board, the vocal cord of an opera singer, a car’s engine, or anything. (To be exact, it’s not mimicking the actual instrument/device, but the pattern of vibration that is caused by those things.)

Am I generalizing people’s answers correctly?

The shape of sound produced by instruments (violin, drum, voice, cowbell, etc.) is constrained by the qualities of the instrument (shape, material, strings, woodwind, etc.). Speakers are designed to be free of those constraints, allowing them to produce a much wider range of sounds?

Why thank you LHoD. And yes, I agree, I’d love for Cecil to tackle this one in a column.

I appreciate all the helpful posts, but could someone explain timbre a little more? I’m sure scotandrsn was well-intentioned when he wrote, “It is traditionally understood to be the sum of all component sinusoidal frequencies as determined by a Fourier analysis of the sound wave,” but I have to confess, it made me think I was reading the script from some vaudeville sketch with Sid Caesar playing a pompous scientist giving a news interview.

As he said, timbre refers to the shape of the wave. For example, a sine wave and a triangle wave can have identical frequency, phase and amplitude, but because of the different shape, your ear can distinguish the two. That’s timbre.
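To put numbers on that, here is a rough sketch in plain Python. It builds one period of a sine wave and one period of a triangle wave with the same frequency and peak height, then measures how strong each harmonic (1x, 2x, 3x the base frequency, and so on) is in each. The sample count and the helper name `harmonic_amplitude` are my own choices for illustration, not anything standard.

```python
import math

N = 1024  # samples in one period (arbitrary choice for this sketch)

def harmonic_amplitude(samples, k):
    """Strength of the k-th harmonic in one period of a waveform (a single DFT bin)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

# Same frequency, same peak height (1.0), different shape.
sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
triangle = [(2 / math.pi) * math.asin(math.sin(2 * math.pi * i / N)) for i in range(N)]

sine_spectrum = [harmonic_amplitude(sine, k) for k in range(1, 8)]
triangle_spectrum = [harmonic_amplitude(triangle, k) for k in range(1, 8)]

for k in range(1, 8):
    print(f"harmonic {k}: sine {sine_spectrum[k - 1]:.4f}   triangle {triangle_spectrum[k - 1]:.4f}")
```

The sine has only the first harmonic. The triangle has the same first harmonic (a bit weaker) plus small amounts of the 3rd, 5th, 7th and so on. That list of extra ingredients is what the Fourier sentence was describing, and it is what your ear picks up as a different timbre.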

How Speakers Work

My question going along with this one is: how does that speaker produce many different sounds at the same time?

I understand how a violin string or the column of air in a flute produces a note along with overtones and so forth, but how can one continuous physical membrane reproduce the sounds of, say, a guitar, bass, drum, and keyboard all at the same time? It seems like it should require a bunch of speakers with a bunch of electromagnets, or alternatively parsing those four sounds (and their overtones etc) so that they aren’t really continuous, but this sure doesn’t seem to be the case?

Even when you’re hearing multiple sounds simultaneously, there’s only one waveform that’s hitting your eardrum. All the speaker diaphragm has to do is reproduce that waveform, and you’ll hear it as multiple sounds.

The same way you can hear them. You’ve only got one eardrum per ear. When you listen to a concert orchestra, the sounds of the various instruments all merge together in the ear, creating a deflection in the eardrum. The magnitude of this deflection at any given instant is proportional to the sum of the amplitudes of all the various sounds reaching your ear at that moment. This results in a complex waveform. This is easier to visualize with the simple case of mixing two sinusoidal waveforms of differing pitches, like this.

The speaker does the same thing, only it vibrates to create the sound, whereas your eardrum vibrates in response to it.
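A minimal sketch of that summing, assuming two made-up tones (200 Hz and 500 Hz) and an arbitrary sample rate. The two waves are added point by point into a single list of numbers, which is all the cone (or the eardrum) ever follows. The helper `amplitude_at` is a hypothetical name; it just checks how much of a given frequency is present in the mix.

```python
import math

SAMPLE_RATE = 8000   # samples per second (arbitrary for this sketch)
N_SAMPLES = 80       # 10 ms, a whole number of cycles for both tones

def tone(freq_hz, amplitude):
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(N_SAMPLES)]

def amplitude_at(signal, freq_hz):
    """How strongly freq_hz is present in the signal (a single DFT bin)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

low = tone(200, 0.6)    # stand-in for, say, a bass note
high = tone(500, 0.3)   # stand-in for a higher instrument

# One waveform: at each instant, just add the two heights together.
mixed = [a + b for a, b in zip(low, high)]

for freq in (200, 300, 500):
    print(freq, "Hz present with amplitude", round(amplitude_at(mixed, freq), 6))
```

Both tones can still be pulled back out of the single mixed waveform, and a frequency that was never put in (300 Hz here) is not there. Nothing is lost by adding them, which is why one membrane is enough.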

Aha! Got it. I’m a very visually cognitive person (partly responsible for the conceptual problem), so your link solved that one for me. Thanks very much!

Your ear is a nonlinear element and also has a lot to do with the sound you hear, as does the brain. I’m not a specialist in sound but I know that if the harmonics of, say, a bassoon are reproduced accurately, you will hear a bassoon, or close to it, even though the speaker might not be able to reproduce the fundamental frequency of the instrument. When you hear the correct combination of harmonics, your hearing apparatus supplies the rest.
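Here is a small sketch of the waveform side of that, with made-up numbers: a 100 Hz “instrument” whose fundamental is missing entirely, leaving only the harmonics at 200, 300 and 400 Hz. The code can’t show what your hearing does with it; it only shows that the combined wave still repeats 100 times a second even though it contains no 100 Hz component.

```python
import math

SAMPLE_RATE = 8000                        # samples per second (arbitrary)
FUNDAMENTAL_HZ = 100                      # the note the instrument is "playing"
PERIOD = SAMPLE_RATE // FUNDAMENTAL_HZ    # 80 samples per 1/100 s

def sample(i):
    """Harmonics 2, 3 and 4 only; the 100 Hz fundamental itself is left out."""
    t = i / SAMPLE_RATE
    return sum(math.sin(2 * math.pi * h * FUNDAMENTAL_HZ * t) / h for h in (2, 3, 4))

signal = [sample(i) for i in range(3 * PERIOD)]

def max_difference_when_shifted(sig, shift):
    """Largest mismatch between the signal and a copy of itself delayed by `shift` samples."""
    return max(abs(sig[i] - sig[i + shift]) for i in range(len(sig) - shift))

def amplitude_at(sig, freq_hz):
    """How strongly freq_hz is present in the signal (a single DFT bin)."""
    n = len(sig)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i, s in enumerate(sig))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i, s in enumerate(sig))
    return 2 * math.hypot(re, im) / n

print("100 Hz content:", round(amplitude_at(signal, 100), 6))
print("mismatch after 1/100 s:", round(max_difference_when_shifted(signal, PERIOD), 6))
print("mismatch after 1/200 s:", round(max_difference_when_shifted(signal, PERIOD // 2), 6))
```

The pattern lines up with itself after 1/100 of a second but not after 1/200, so the 100 Hz rhythm is still in the wave for the ear to latch onto.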

I think Q.E.D. had the most succinct posts on this, but since you asked, I will rephrase: a musical instrument makes sound when you add energy to it and let it vibrate at its natural frequencies. The usual example of this is a kid on a swing. Give him a single push (add a little energy) and he’ll oscillate at some natural frequency.

Speakers produce a wide range of sounds not because they are free of any constraints (although they are designed to minimize them) but because the voice coil forces them to vibrate in certain ways regardless of their natural constraints. Imagine the kid on a swing, but instead of giving him one push, you stand there holding him, pushing him back and forth (constantly adding a lot of energy) at whatever rate you want.
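The swing picture can be tried out numerically. This is only a toy sketch: a damped mass-on-a-spring with a made-up natural frequency of 1 Hz and made-up damping, stepped forward in small time increments. It is not a model of any real speaker. One run gets a single push and is left alone; the other runs are pushed continuously at a rate of our choosing.

```python
import math

STEPS_PER_SECOND = 1000
DT = 1.0 / STEPS_PER_SECOND
OMEGA0 = 2 * math.pi * 1.0   # natural frequency of 1 Hz (made-up value)
DAMPING = 0.2                # damping ratio (made-up value)

def simulate(drive_hz, kick, seconds):
    """Position over time. drive_hz=None means one initial push and nothing after."""
    x, v = 0.0, kick
    positions = []
    for i in range(seconds * STEPS_PER_SECOND):
        t = i * DT
        force = 0.0 if drive_hz is None else math.sin(2 * math.pi * drive_hz * t)
        a = force - 2 * DAMPING * OMEGA0 * v - OMEGA0 ** 2 * x
        v += a * DT   # semi-implicit Euler step: update velocity first,
        x += v * DT   # then position using the new velocity
        positions.append(x)
    return positions

def measured_frequency(positions):
    """Estimate the oscillation rate from the timing of upward zero crossings."""
    crossings = []
    for i in range(1, len(positions)):
        if positions[i - 1] < 0.0 <= positions[i]:
            frac = -positions[i - 1] / (positions[i] - positions[i - 1])
            crossings.append((i - 1 + frac) * DT)
    return (len(crossings) - 1) / (crossings[-1] - crossings[0])

# One push, then hands off: the swing picks its own rate.
free_hz = measured_frequency(simulate(None, 1.0, 10))

# Continuous pushing: skip the first 20 s so the start-up wobble has died away.
skip = 20 * STEPS_PER_SECOND
driven = {f: measured_frequency(simulate(f, 0.0, 40)[skip:]) for f in (0.3, 2.5)}

print("single push:", round(free_hz, 3), "Hz")
for f, measured in driven.items():
    print("pushed at", f, "Hz -> moves at", round(measured, 3), "Hz")
```

Left alone, the thing swings at close to its own 1 Hz no matter what. Pushed continuously, it moves at whatever rate you push it, well below or well above its natural rate, which is the speaker’s situation.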

The design of conventional telephone lines to use little bandwidth is based on this idea; the low frequencies are filtered out for transmission but when the sound comes out that little handset speaker your ear puts them back.

I think it’s more amazing that your ear, with what, a 1/4" eardrum, can convert the incoming sound with such fidelity.