Physical properties of the human voice

What are all the factors that go into a particular human voice? Pitch is the obvious one, but what are the others? Why does a man singing at the top of his range sound very different from a woman singing at the bottom of her range, even if they are singing the same note?

How close are audio and computer technology to being able to analyze a particular person’s voice (say, someone famous like JFK), and recreate/synthesize it from scratch, so that (for example) a completely new, synthesized speech could be played that sounded exactly like JFK? What are the technological/engineering barriers to this?

Besides pitch, the other terms used to describe a sound are loudness and timbre, the tone quality that lets you tell two sounds apart even when their pitch and loudness match.

I don’t know much about the second question, but as mentioned, the first has mostly to do with what’s referred to as timbre. In fact, it kinda cuts right to the definition of what timbre actually is.

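The sound waves produced by a human voice are very complex. Here’s a decent intro I just found to how sound waves are synthesized and what harmonic overtones are. When you sing a sustained note, your vocal folds vibrate periodically, so the overtones are mostly harmonic (whole-number multiples of the fundamental), and the shape of your throat and mouth boosts some of them and damps others. Breathiness and consonants add non-harmonic noise on top of that. So when you sing a note, you perceive the fundamental frequency as the pitch, but many other frequencies are also present in the sound, and their relative strengths are a big part of what gives each voice its own timbre.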
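To make the "fundamental plus overtones" idea concrete, here is a rough sketch in Python (standard library only). It builds two tones on the same 220 Hz fundamental, so they would be heard as the same note, but with different overtone strengths, then measures how strong each overtone is. Everything in it is an assumption made for illustration: the sample rate, the overtone amplitudes, and the function names `synthesize`, `amplitude_at` and `spectrum` are all made up, and the overtones are treated as purely harmonic. A real voice is far messier than this.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)


def synthesize(fundamental_hz, harmonic_amps, duration_s=0.1):
    """Sum sine waves at whole-number multiples of the fundamental.

    harmonic_amps[0] is the amplitude of the fundamental itself,
    harmonic_amps[1] the amplitude at twice that frequency, and so on.
    """
    n = round(SAMPLE_RATE * duration_s)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        samples.append(sum(
            amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
            for k, amp in enumerate(harmonic_amps)
        ))
    return samples


def amplitude_at(samples, freq_hz):
    """Estimate the amplitude of one frequency component (single-bin DFT)."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n


def spectrum(samples, fundamental_hz, count):
    """Amplitudes of the fundamental and the overtones above it."""
    return [amplitude_at(samples, fundamental_hz * (k + 1)) for k in range(count)]


# Two hypothetical "voices" on the same note with different overtone mixes.
tone_a = synthesize(220, [1.0, 0.5, 0.25, 0.125])
tone_b = synthesize(220, [1.0, 0.1, 0.6, 0.05])

spectrum_a = spectrum(tone_a, 220, 4)
spectrum_b = spectrum(tone_b, 220, 4)

print([round(a, 3) for a in spectrum_a])
print([round(b, 3) for b in spectrum_b])
```

Both lists should start with about the same value for the fundamental, which is why the pitch matches, while the later entries differ. That difference in the overtone mix is the part you hear as timbre.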

Not necessarily true. When singing in his upper range, Freddie Mercury could easily have passed for a female singer. I’ve noticed this with others as well.

But in general, what you’re referring to, timbre, is applicable to musical instruments as well. There’s a great deal of overlap in the ranges of a violin and a viola, yet the same notes played on each sound noticeably different.