Bad subtitles

I was watching “A Christmas Carol” (the Patrick Stewart version) on TNT last night with the closed captioning on. According to the captions, the 2nd verse of “God Rest Ye Merry, Gentlemen” is

Which led me to thinking and wondering: where do closed captions come from? In the case of a film, do they stay with that film no matter where it’s shown? Why are some closed captions word perfect, and some are approximations (I’ve seen some that paraphrase/shorten dialog)? How did the above mistake happen – don’t they work from a script?

(Closed captions of live broadcasts, like the local news, are a total trainwreck. It looks like they’re done with speech-to-text software.)

*Jewry.

I have severe hearing problems, to the point where if a program is not captioned I cannot watch it. I have many of the same questions as you. Captioning has come a long way in the last 20 years or so, but is still less than perfect. I think some programs have captioning embedded in the program, and others are done “on the fly”, which is less accurate, but I am not sure.
I have noticed that at the end of programs there is usually a disclaimer about the accuracy of the captioning (“station not responsible for accuracy…”), and CaptionMax is frequently listed as the company responsible for the closed captioning. So I found their web site and sent them an email asking about 1) captioning accuracy and how varied it is, 2) the suckiness of captioning in live shows, especially sporting events, and 3) the fact that DVD releases will caption a movie, but very rarely the deleted scenes, bonus features, or audio commentary. If I get a response I will post it here.

Maybe they thought this word was offensive and were just trying to be “politically correct.”

On the other hand, whoever was responsible may have just been a semiliterate moron. There’s a surfeit of them around these days.

As a regular watcher of “The Amazing Race” and someone who usually has captioning turned on, I have to say that their captioning is pretty bad. It always seems to be about ten seconds behind and often has the wrong words. One of my favorites was a few years back when one of the racers made a comment like “I can smell Phil’s cologne from here”, except the captioning replaced “cologne” with “colon”.

Which makes me wonder how reality shows are captioned. They’re not scripted, but they’re not live so you would think they have enough time to get the captioning right before airing.

I saw the movie Instructions Not Included last year. It’s set both in the U.S. and in Mexico, but most of the dialog is in Spanish, so it’s mostly subtitled in the version shown in the U.S. The subtitles were badly messed up, and it’s clear that the problem was that the producers of the film didn’t bother to look at the subtitles even once after they were created, because if they had they would have immediately informed the people working on the subtitles that they had to be fixed. Apparently there is a computer program designed especially for subtitling so that a subtitler can just type in the subtitles as they watch and listen to the film without having to watch the film twice.

The program had two bugs in it that really distracted from watching the film for English speakers. One was that the program was somehow programmed so that it ignored any occurrences of the letter Q as the subtitler typed it in. So a sentence that translated to “The Queen was quite angry about what happened” would be subtitled as “The ueen was uite angry about what happened”. The second was that any digit of a number would be replaced by a 0. So a sentence that translated to “All 7 of us received $256,584.97 in the payout” would be subtitled as “All 0 of us received $000,000.00 in the payout”.
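
For anyone curious what those two bugs amount to, here is a minimal sketch that reproduces the symptoms. The function name is made up for illustration, and nothing here is known about how the real subtitling program worked; it only mimics the visible result.

```python
import re

def buggy_subtitle(text: str) -> str:
    """Mimic the two symptoms described above (hypothetical helper)."""
    # Symptom 1: every letter Q, upper or lower case, is silently dropped.
    without_q = text.replace("Q", "").replace("q", "")
    # Symptom 2: every digit is replaced by 0.
    return re.sub(r"\d", "0", without_q)

print(buggy_subtitle("The Queen was quite angry about what happened"))
# -> The ueen was uite angry about what happened
print(buggy_subtitle("All 7 of us received $256,584.97 in the payout"))
# -> All 0 of us received $000,000.00 in the payout
```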

Usually with pre-recorded dramas, comedies, films and documentaries you work from a script. Occasionally one isn’t available, but that’s really only with first broadcasts for major shows (where they just don’t send the script out to anyone) or things that weren’t scripted, like reality shows or home improvement shows, not older films.

Songs will usually not be included in the script, just the title of the song, so you look it up online; I guess in this case the person who originally posted the lyrics online misheard and thought hey, he was given gold when he was born, so jewelry makes sense. I’d expect someone being paid to do something, like the closed captioner, to check a bit more thoroughly than someone posting lyrics online for fun, so the problem was proofreading laziness leading to an error.

To answer a couple of other questions: Closed captioning has a technically-defined character limit for space because you don’t want the writing taking up the entire screen; in the UK it’s 37 characters per line, 3 lines total, with two lines by far preferred, and this includes any character added to make different colours show up for different speakers. There is also a reading speed limit for comprehension, because we hear faster than we can read; the limit depends on the broadcast channel, but there is always a limit.

These two together mean that sometimes you just have to change the words or they’d be too long to fit or be readable. Hence “we’re going to” is often changed to “we will” even though there is a slight difference in meaning.
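
The two limits can be sketched as a simple check. The 37-character, 3-line figures are the UK numbers quoted above; the reading-speed value is a placeholder assumption, since the real limit depends on the broadcaster, and this sketch ignores the colour control characters that also count toward the line length.

```python
import textwrap

MAX_CHARS_PER_LINE = 37   # UK figure quoted above
MAX_LINES = 3             # two lines are preferred in practice
WORDS_PER_MINUTE = 180    # assumed placeholder; the real limit varies by channel

def fits_caption(text, duration_seconds):
    """Return (wrapped lines, whether both limits are met) for one caption."""
    lines = textwrap.wrap(text, width=MAX_CHARS_PER_LINE)
    fits_space = len(lines) <= MAX_LINES
    # Words the viewer can be expected to read while the caption is on screen.
    max_words = WORDS_PER_MINUTE * duration_seconds / 60
    fits_speed = len(text.split()) <= max_words
    return lines, fits_space and fits_speed

lines, ok = fits_caption("We will leave before the storm hits.", 3.0)
print(lines, ok)
```

When the check fails, the captioner has to shorten the wording or split the caption, which is where swaps like “we’re going to” for “we will” come from.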

Dialogue, especially when multiple people are talking, is the most likely area where there are going to be these problems, so it’s true that dialogue is more likely to be truncated. Narration, on the other hand, is generally spoken at a speed that’s easy to fit in and is very unlikely to be truncated.

Medical dramas can often be quite hard to make subtitles/closed captions for. The medical terminology is important to get right, but those words are really long. That means everything else has to be truncated more.

Live captioning isn’t speech-to-text (in the UK, anyway) but involves the use of shortcuts on special software plus normal typing, kinda like what a courthouse stenographer uses but without the possibility of going back to check you got it right. Speech-to-text software might work better one day but at the moment it’s not very good at dealing with ambient noise, different accents and multiple people talking at once.

Finally, closed captions don’t stay with the show/film, IME. Captioning a film requires different software to captioning for TV broadcast and different regions require different captioning software too - PAL and NTSC have different frame rates, for example, which means you can use the whole original PAL/NTSC script, but you have to change all the timings, which also means changing some of, or sometimes quite a lot of, the wording, to fit in with the timings. If you simply tried to use an NTSC script on a PAL broadcast it would go out of time within about three minutes.
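
As a rough illustration of why the timings cannot simply be reused, here is a sketch that assumes cues are stored as frame numbers (that storage format is an assumption, as are the function and parameter names). PAL runs at 25 frames per second and NTSC at 30000/1001, about 29.97, so the same frame number lands at a different moment in each. The optional `speed` factor is there for cases where one version also plays slightly faster, such as film sped up from 24 to 25 frames per second for PAL.

```python
from fractions import Fraction

PAL_FPS = Fraction(25)
NTSC_FPS = Fraction(30000, 1001)   # about 29.97 frames per second

def retime_frame(frame, src_fps, dst_fps, speed=Fraction(1)):
    """Map a cue's frame number from one frame rate to another.

    speed > 1 means the destination version plays faster, so the same
    moment in the programme arrives earlier.
    """
    seconds = Fraction(frame) / src_fps        # when the cue fires in the source
    return round(seconds / speed * dst_fps)    # matching frame in the destination

# A cue at NTSC frame 30000 is 1001 seconds in, which is PAL frame 25025.
print(retime_frame(30000, NTSC_FPS, PAL_FPS))
# With a 24 -> 25 fps speed-up applied as well, it moves to frame 24024.
print(retime_frame(30000, NTSC_FPS, PAL_FPS, speed=Fraction(25, 24)))
```

Even after a mechanical conversion like this, each cue's on-screen duration changes, so the wording may still need trimming to stay within the reading-speed limit.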

TV broadcasts also require small changes to account for commercial breaks; DVDs don’t. Occasionally you’ll see where a station has paid too little for their subtitles and they overrun the ad break and just keep on going as if the show were still on.

And if a TV version is even a minute different to the DVD version, then, unless that minute is only at the very end, everything after the difference has to be changed, because otherwise all the subtitles will be a minute out of synch. And all you’ll know is that it’s different, not exactly when, so you actually have to redo the whole thing.
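
If the position of the cut were known, the fix would be a simple shift, as in this sketch (the cue format and the function name are assumptions for illustration). The hard part described above is that nobody tells you where the difference is, so a shift like this is the easy case only.

```python
def shift_cues(cues, cut_at, offset):
    """Shift every cue that starts at or after cut_at by offset seconds.

    cues is a list of (start, end, text) tuples with times in seconds.
    """
    return [
        (start + offset, end + offset, text) if start >= cut_at else (start, end, text)
        for start, end, text in cues
    ]

cues = [(10.0, 12.0, "Hello."), (100.0, 102.0, "Goodbye.")]
# One minute was removed at the 60-second mark, so later cues move 60 s earlier.
print(shift_cues(cues, cut_at=60.0, offset=-60.0))
# -> [(10.0, 12.0, 'Hello.'), (40.0, 42.0, 'Goodbye.')]
```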

The weirdest (most amusing) one I’ve seen recently was while I was watching the movie “Clue” on some minor cable channel. Someone had used find-and-replace to X out all the minor swear words like “hell” and “damn” from the closed captioning. It had the side effect of changing one of the characters’ names to “Mrs. PeaXXXX.”
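
A sketch of how that kind of accident happens, and one way it could be avoided. The word list is a guess for illustration (the channel's actual list and tooling are unknown); the difference is that the second function only matches whole words.

```python
import re

WORDS = ["hell", "damn", "cock"]   # hypothetical list, for illustration only

def censor_naive(text):
    """Replace the words anywhere they appear, even inside other words."""
    for word in WORDS:
        text = re.sub(word, "X" * len(word), text, flags=re.IGNORECASE)
    return text

def censor_whole_words(text):
    """Replace the words only when they stand alone (word boundaries)."""
    pattern = r"\b(" + "|".join(map(re.escape, WORDS)) + r")\b"
    return re.sub(pattern, lambda m: "X" * len(m.group()), text, flags=re.IGNORECASE)

print(censor_naive("Mrs. Peacock said hello."))        # -> Mrs. PeaXXXX said XXXXo.
print(censor_whole_words("Mrs. Peacock said hello."))  # -> Mrs. Peacock said hello.
print(censor_whole_words("What the hell?"))            # -> What the XXXX?
```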

Captioning working very hard.

The movie was probably captioned by Emily Litella. :smiley:

That’s something I love about the SDMB – you post an off-the-wall musing on an arcane subject, and in a little while an expert will wander by and actually offer an informed opinion. Thanks, SciFiSam.