First electronic voice recording?

I’m reading an article on the B-58 Hustler, and it says that voice warnings were stored on tape. Nowadays we’d have the voice stored digitally. So I got to thinking that in the '50s (when the Hustler was developed – it went into service in 1960) a typical warning would be a tone. Tones can be modulated, so voice would be a possibility. But could they be modulated fast enough? Apparently not, back then. Voice emulation in video games in the '80s seemed a bit crude.

So when was the first ‘electronic’ voice system developed? By ‘electronic’, I mean an electronically-generated voice and not one stored on tape.

Electronic voice recording might mean different things, and would be different from an electronically generated voice.

A voice might be recorded and played back mechanically (in a needle groove), by analog electronics (usually on magnetic tape), or by digital electronics (using analog-to-digital converters and a data storage medium, maybe also tape!). Commercial labs started using digital audio for mastering in the 1970’s, but music was still distributed on analog media for years after that, and solid-state storage has only recently become practical for more than very short clips.

Most “speech synthesizers” even now simply use bits of recorded speech (analog or digital) to piece together spoken messages. This includes most telephone, car, and cockpit voice messages that I’ve heard of to date. Those video game voices in the '80s were mostly recorded human voices, processed to sound “electronic”. (All of R2D2’s “electronic” beeps and whistles were emitted by a human performer!)

Even now, the best true digital text-to-speech synthesizers are somewhat hard to understand, and not really suitable for a cockpit. The best ones I’ve heard to date are the newer synthetic announcers on NOAA weather radio, which were phased in just a few years ago.

The first computer text-to-speech generators that I know of came out in the early 1980’s. The Amiga had one built in. They were pretty bad unless you carefully misspelled everything so it was easier for the program to parse phonetically. (Then they were still pretty bad.) I recall reading about phonetic speech synthesizers in the late '70s, but I think those were mostly lab experiments.

I wouldn’t be surprised to hear someone tried to create an analog speech synthesizer before that. Certainly musicians have long tried to reproduce speech sounds using whatever instruments were at hand, and there were effects devices that musicians used to voice-modulate their instrument’s sound. Those can sound like “electronic speech,” but it’s really another form of speech processing.

I meant a recording of a human voice that is not stored on a movable medium such as tape, vinyl disc, or computer disc; or a synthesized voice created by the device that is not a recording.

Wikipedia’s entry on “Digital” recording.

By “movable” I assume you mean accessed mechanically, as opposed to “portable.”

So for human voice recording, we’re basically talking about a digital recording stored in a memory device, probably an EPROM. (I’ll neglect core memory as being too bulky.) Playing very short messages would have been possible as early as the 1970’s. But memories then had relatively small capacities, and it took a while for compression methods to be developed.
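To put rough numbers on "relatively small capacities," here is a back-of-the-envelope sketch. The 8 kHz, 8-bit, uncompressed format is my assumption for "telephone quality," not something any particular device is known to have used; the chip sizes are the nominal capacities of common EPROM parts of the era.

```python
import math

# Assumed audio format (an illustration, not taken from any real device):
# 8,000 samples per second, one byte per sample, no compression.
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1

# Nominal capacities of some common EPROM parts, in bytes.
EPROM_BYTES = {"2716": 2048, "2732": 4096, "2764": 8192}


def seconds_of_speech(capacity_bytes,
                      sample_rate_hz=SAMPLE_RATE_HZ,
                      bytes_per_sample=BYTES_PER_SAMPLE):
    """Return how many seconds of uncompressed audio fit in capacity_bytes."""
    return capacity_bytes / (sample_rate_hz * bytes_per_sample)


def chips_needed(message_seconds, chip_bytes,
                 sample_rate_hz=SAMPLE_RATE_HZ,
                 bytes_per_sample=BYTES_PER_SAMPLE):
    """Return the number of chips required to hold one message."""
    total_bytes = message_seconds * sample_rate_hz * bytes_per_sample
    return math.ceil(total_bytes / chip_bytes)


if __name__ == "__main__":
    for part, size in EPROM_BYTES.items():
        print(f"{part}: about {seconds_of_speech(size):.3f} s of speech per chip")
    print("Chips of 4096 bytes for a 1-second message:", chips_needed(1.0, 4096))
```

Under those assumptions a single 4 KB chip holds about half a second of speech, so even a one-second warning needs two of them.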

I remember seeing answering machines with crude, limited digital outgoing-message (OGM) storage sometime around 1990. That’s probably about as soon as such a device could be used in a cockpit, more or less. (The military might have early access to technology, but they usually take longer than the private sector to accept that a technology is reliable and effective.)

Synthetic speech might have actually come into use sooner than that, but it might have a rather limited repertoire of phonetic messages, each fine tuned to be as understandable as possible.

Frankly, I don’t understand why you reject mechanically-reproduced voices. The pull-string talking mechanism in a 1960’s G. I. Joe was compact, entirely mechanical, remarkably clear, and probably EMP-proof. The B-58 probably had something more sophisticated, but I don’t see why it would have to be solid state.

The only reason I reject mechanically-reproduced voices is because I was curious about when non-mechanically-reproduced voices became available. It’s a thought that’s occurred to me from time to time; for example, when I see Star Trek (TOS) and they mention ‘memory tapes’.

I worked for HC Electronics as a repair guy during the summer in the late 70’s. They had an all solid state voice synthesizer at that time. It was bigger than a brick, and heavier…

Fair enough. I doubt anyone in the '60s would have believed the amount of solid-state storage that we almost take for granted nowadays. I’ve got a postage-stamp-size 2GB flash drive on my keyring that I often forget is there because it’s so small, in both size and relative capacity!

Even so, tape cartridge memories held on for a long time, and mechanically-accessed storage is still predominant. For the time being, magnetic hard drives are still far faster and larger in capacity than solid-state memories.

Tomorrow, who knows?

Holy cow! Do you remember how it was programmed? And how much it cost?

Are we talking voice synthesis, or digital voice recording and playback?

The 1980 arcade game Crazy Climber had a sampled human voice.

As I recall, it was all EPROM: the thing just had bank after bank of 2732’s, a microcontroller, and a D/A converter. The electronics were really pretty straightforward; it was just limited by the small amount of storage available at the time. This device was the precursor to Hawking’s speech box: it allowed completely mute people to “speak” for themselves. It had a large matrix of buttons that could speak common words, and then all of the phonemes, so that other words could be “assembled.” I believe it cost many thousands of dollars…
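For anyone wondering what "assembling" words from stored clips amounts to, here is a minimal sketch of the general idea. It is not the actual firmware of that box: the clip table, the key names, and the byte values are all made up for illustration, and a real device would stream the result to its D/A converter rather than return it.

```python
# Hypothetical clip table. Each entry stands in for a run of unsigned 8-bit
# samples stored in EPROM. The byte values are placeholders, not real speech.
CLIPS = {
    "YES": bytes([128, 150, 170, 150, 128]),   # a whole-word button
    "H": bytes([128, 132]),                     # phoneme buttons
    "AY": bytes([140, 160, 140]),
}

SILENCE_LEVEL = 128  # mid-scale is "zero" for unsigned 8-bit audio


def assemble(keys, clips=CLIPS, gap_samples=2):
    """Concatenate stored clips in key-press order, with a short silent gap
    between consecutive clips. Returns the sample stream as bytes."""
    gap = bytes([SILENCE_LEVEL]) * gap_samples
    out = bytearray()
    for index, key in enumerate(keys):
        if key not in clips:
            raise KeyError(f"no stored clip for key {key!r}")
        if index > 0:
            out += gap
        out += clips[key]
    return bytes(out)


if __name__ == "__main__":
    # "Spell out" a word from phonemes, as opposed to pressing a word button.
    stream = assemble(["H", "AY"])
    print(len(stream), "samples ready for the D/A converter")
```

The point is that nothing is being synthesized in the strict sense: the output is only as varied as the table of recorded clips, which is why storage was the limiting factor.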

ETA: The Internet is amazing. Here’s some information on the box: http://americanhistory.si.edu/archives/speechsynthesis/ss_votr.htm
http://aac-rerc.psu.edu/pages/news/pdfs/English%20Script%20MBW%20video.txt

A photo: http://www.rehab.research.va.gov/jour/02/39/6/sup/Vandef09.jpg

Also, Atari prototyped a sampled voice for “Atari Baseball” in 1979, but it never made it into the final product:

No answer, but another aviation data point. I’ve been watching airliner crash stuff on YouTube lately. I noted that in the 1985 Delta 191 L-1011 TriStar crash at DFW, the recorder captured one of the plane’s warning systems announcing “Pull up!” just before the crash. Was that announcement on tape or not? That’s 1970’s technology unless it was part of an 80’s era electronic upgrade.

I don’t know the answer to that, but are you sure it was from a Delta crash? The reason I ask is that AFAIK the NTSB doesn’t release cockpit voice recorder recordings.

Hmmm…well, there are all sorts of YouTube videos about this crash and many seem to share the same audio. I guess it could be the same recreation. I know nowadays they don’t release recordings other than transcripts, but I thought maybe in the 80s the rules were different.

I don’t know. I just googled ‘cockpit voice recorder’ and found a site with U.S. crashes.

It was my understanding that the NTSB would not release the recordings, but would release transcripts. Also, I think control tower/center recordings were releasable. But I could be wrong.

I think the OP established that there were cockpit voice annunciators as early as the 1950’s, at least. The question is how soon they became digital recordings or syntheses.

For those who doubt the reliability of a magnetic voice recording for pre-digital cockpit signals, I found a relevant quote. Unfortunately it’s in a book teaser that omits the original source date, but it sounds like an early context (contemporary with Link trainer experiments):

This book sounds like a good source for anyone who wants to pursue the topic.

The quote confirms my suspicion that aerospace engineers would have come up with far more reliable mechanical voice playback systems than the sort of audio tape consumers used. Solid state devices might not have been considered reliable enough until quite some time after they came into laboratory and commercial use.

We’re talking about solid-state storage, and the first programmable ROM was invented in 1956, so it would have to be after that date.

I found a reference suggesting the London Underground’s “mind the gap” warning was coined in 1968 with the limitations of a solid state recorded digital audio warning message in mind. The Wikipedia article is unclear about how long after this it took to actually implement the warning system.

Both an example of a voice recorded digitally in solid-state memory, and an (extremely limited) speech generator: Speak & Spell (toy) - Wikipedia - release date: June 1978.