Changing British quotation marks to American automatically

I have a couple of HTML files that were written using British-style quotation marks, as in:

‘When I last saw him, Doug said, “I promise never to drink again,” but he’s been seen drunk since then,’ Charles said.

I’d like to be able to convert this to American style, where the main quote is enclosed in double quotes instead of single quotes, and the quote-within-a-quote is enclosed in single quotes. But I can’t just replace double-quotes with single and vice versa, because there are also apostrophes (as in the “he’s” above) that can’t be replaced.

Is there any software to do this? I know how to easily convert straight quotes and apostrophes to curly quotes and apostrophes (or “smart quotes”) in Microsoft Word, and I have a perl script called SmartyPants that will do the same thing. But I can’t find any simple way to do this …

Basically I need a script, whether a Word macro or a Perl/Python/whatever script, that will tell balanced quotes apart from apostrophes and then replace only the quotes, I guess. Does anybody know of such a utility?

Any help would be greatly appreciated.

There’s a British and American style? Just to confirm that I wasn’t going mad, it took about 5 seconds to find an example in headline stories of the Times using double quotes.

Apologies for the hijack, though, because I can’t help with the actual problem!

Well, in the UK, they actually use both styles, but in British fiction (and especially older British fiction), you’ll see a lot of the kind of use I quoted.

Check out this section of the Wikipedia article on quotation marks, which says:

So I just go with “American style” and “British style,” even though it’s not entirely accurate.

Try changing “space quote” and “quote space”.

You need three steps for each beginning and end quote.

  1. Change [space][double quote] to [space][@@@].
  2. Change [space][single quote] to [space][double quote].
  3. Change [space][@@@] to [space][single quote].

Repeat with the space on the other side of the quote mark (that is, [quote][space]) to catch the closing quotes. I think that should work.
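If you'd rather script it than do it in Word, the same placeholder swap could look roughly like this in Python. This is only a sketch: it assumes the text already uses curly quotes, and the function names and the @@@ placeholder are made up for illustration. Like the Find/Replace version, it only touches quote marks that sit next to a space, so apostrophes inside words are left alone.

```python
# Curly quote characters (assumes the text is already "smart-quoted").
OPEN_SINGLE, CLOSE_SINGLE = "\u2018", "\u2019"
OPEN_DOUBLE, CLOSE_DOUBLE = "\u201c", "\u201d"
PLACEHOLDER = "@@@"  # any string that never occurs in the text


def swap_after_space(text):
    """Swap opening quotes that directly follow a space."""
    text = text.replace(" " + OPEN_DOUBLE, " " + PLACEHOLDER)
    text = text.replace(" " + OPEN_SINGLE, " " + OPEN_DOUBLE)
    return text.replace(" " + PLACEHOLDER, " " + OPEN_SINGLE)


def swap_before_space(text):
    """Swap closing quotes that are directly followed by a space."""
    text = text.replace(CLOSE_DOUBLE + " ", PLACEHOLDER + " ")
    text = text.replace(CLOSE_SINGLE + " ", CLOSE_DOUBLE + " ")
    return text.replace(PLACEHOLDER + " ", CLOSE_SINGLE + " ")


def british_to_american(text):
    return swap_before_space(swap_after_space(text))


sample = "He said, \u2018Doug told me, \u201cI\u2019m done,\u201d and left,\u2019 then sighed."
print(british_to_american(sample))
```

The apostrophe in "I’m" survives because it isn't next to a space, but a plural possessive like "goodness’ sake" would still be turned into a double quote, so the manual check is still needed.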

You may still need to scan for ending apostrophes, as in plural possessives like “for goodness’ sake,” but in Word that’s just a matter of doing a Find for [s][double quote] and scanning for the incorrect ones. You might also have some dropped g’s in -ing words, so there’s a search for [n][double quote].

How long are the documents? Perhaps a final read for any errors would still be quicker than fixing the quotes all by hand.

The documents are several hundred pages long, generally. Mainly manuscripts.

This looks like it could work, without taking too much time; the only time I can see an s" not being an error is if it was a situation like:

‘And then his cars’-- she turned to glare at Bob --‘were found stuffed with corpses.’

So I could search for all instances of s" that are not followed by an em-dash (or a hyphen or whatever). Hmm…
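Outside Word, that search could be sketched as a regular expression with a negative lookahead. This assumes the converted text uses curly quotes, so the thing to look for is an s followed by a closing double quote; the name find_suspects and the context width are just illustrative choices. It only lists candidates for a human to check and doesn't change anything.

```python
import re

# "s" + closing double quote, NOT followed by a hyphen, en dash, or em dash
# (optionally with whitespace before the dash).
SUSPECT = re.compile(r"s\u201d(?!\s*[-\u2013\u2014])")


def find_suspects(text, width=20):
    """Return (offset, surrounding text) for each possible wrongly converted apostrophe."""
    hits = []
    for match in SUSPECT.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        hits.append((match.start(), text[start:end]))
    return hits


converted = "for goodness\u201d sake, and his cars\u201d-- she glared at Bob"
for offset, context in find_suspects(converted):
    print(offset, repr(context))
```

Adding n to the pattern (for example `[sn]` instead of `s`) would also catch the dropped g's mentioned earlier.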

Hmm, but then there are the ones at the very beginning and end of paragraphs, which have no space next to them. In Word you can find those with ^013 in the Find box; replace with ^p in the Replace box.
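In a script, the paragraph edges could be handled with line anchors instead. A rough sketch, assuming curly quotes and one paragraph per line (real HTML would need the tags taken into account first); swap_paragraph_edges is a made-up name:

```python
import re

SWAP_OPEN = {"\u2018": "\u201c", "\u201c": "\u2018"}
SWAP_CLOSE = {"\u2019": "\u201d", "\u201d": "\u2019"}


def swap_paragraph_edges(text):
    """Swap a quote mark at the very start or very end of each line."""
    # Opening quote as the first character of a line.
    text = re.sub(r"^[\u2018\u201c]", lambda m: SWAP_OPEN[m.group()],
                  text, flags=re.MULTILINE)
    # Closing quote as the last character of a line.
    return re.sub(r"[\u2019\u201d]$", lambda m: SWAP_CLOSE[m.group()],
                  text, flags=re.MULTILINE)


print(swap_paragraph_edges("\u2018Hello,\u2019 he said.\n\u2018Goodbye.\u2019"))
```

Because each substitution maps both characters in a single pass, no placeholder is needed here. Quotes in the middle of a line are left for the space-based pass.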