Languages that use "double byte" characters.

Can anyone direct me to a comprehensive list of all of the languages that use double byte characters? Thanks ever so much.

A cursory Googling suggests that of the major languages, only Chinese, Japanese, and Korean require double-byte characters. I’m not 100% prepared to say that’s iron-clad fact, however.

I wondered about some alphabetic languages – like Thai, Tibetan, Burmese, Bengali, Tamil, and Hindi – that use various diacritics and extra letters to modify the basic letter set and essentially yield loads of extra characters. But from what I can determine, none of these types of alphabets requires double-byte character sets.

I know that Arabic-based alphabets are single-byte, as I once lent my computer to a friend to typeset a book in Gulf Arabic. Hebrew is also a single-byte set.

And of course, the plethora of languages with alphabets based on the Roman, Greek, or Cyrillic models are easily accommodated in single-byte schemes.

How many bytes a single character takes depends on the encoding scheme, not on the language or alphabet itself. If you can use one of the extended-ASCII encodings (the ISO-8859-x family), it’s a single byte. If you use a Unicode encoding such as UTF-8, anything outside plain ASCII takes two or more bytes.
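To make that concrete, here is a rough Python sketch that encodes one sample letter per script, first in a legacy encoding and then in UTF-8, and counts the bytes. The sample letters, the pairing of each script with a particular legacy codec, and the names `SAMPLES` and `byte_length` are my own choices for illustration; other legacy encodings exist for most of these scripts.

```python
# Sample character and one commonly used legacy codec per script.
# These pairings are illustrative assumptions, not an exhaustive list.
SAMPLES = {
    "Latin":    ("A", "latin_1"),
    "Greek":    ("α", "iso8859_7"),
    "Cyrillic": ("Я", "iso8859_5"),
    "Hebrew":   ("א", "iso8859_8"),
    "Arabic":   ("ب", "iso8859_6"),
    "Thai":     ("ก", "tis_620"),
    "Japanese": ("あ", "shift_jis"),
    "Chinese":  ("中", "gb2312"),
    "Korean":   ("한", "euc_kr"),
}


def byte_length(text, encoding):
    """Return the number of bytes `text` occupies in `encoding`,
    or None if the encoding cannot represent it."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    for script, (char, legacy) in SAMPLES.items():
        print(f"{script:9} {legacy:10} "
              f"legacy={byte_length(char, legacy)} "
              f"utf-8={byte_length(char, 'utf-8')}")
```

With these samples, the Chinese, Japanese, and Korean characters are the only ones that need two bytes in their legacy encodings, while in UTF-8 only the plain ASCII letter stays at one byte.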

Because of all the different encoding schemes in use, this is a somewhat complex subject. Here’s a tutorial on character encoding schemes.

Hmm, the board software seems to have broken here, as my post was not reflected in the main GQ page.