Languages that use "double byte" characters.

Binarydrone · November 21, 2003, 4:29pm

Can anyone direct me to a comprehensive list of all of the languages that use double byte characters? Thanks ever so much.

bordelond · November 21, 2003, 4:45pm

A cursory Googling suggests that of the major languages, only Chinese, Japanese, and Korean require double-byte characters. I’m not 100% prepared to say that’s iron-clad fact, however.

I wondered about some alphabetic languages – like Thai, Tibetan, Burmese, Bengali, Tamil, and Hindi – that use various diacritics and extra letters to modify the basic letter set and essentially yield loads of extra characters. But from what I can determine, none of these types of alphabets requires double-byte character sets.

I know that Arabic-based alphabets are single-byte, as I once lent my computer to a friend to typeset a book in Gulf Arabic. Hebrew is also a single-byte set.

And of course, the plethora of languages with alphabets based on the Roman, Greek, or Cyrillic models are easily accommodated in single-byte schemes.

dtilque · November 21, 2003, 7:18pm

How many bytes a single character in an alphabet or character set uses depends on the encoding scheme. If you can use one of the extended ASCII sets (ISO-8859-x), it’s a single byte. If you use a Unicode scheme such as UTF-8, anything outside basic ASCII takes two or more bytes.
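To make that concrete, here is a rough Python sketch that encodes the same character under several schemes and counts the bytes. The helper name byte_length and the sample characters are my own picks for illustration; the codec names are the ones that ship with Python's standard library.

```python
# Count how many bytes one character occupies under a given encoding.
# byte_length is a hypothetical helper written for this illustration.

def byte_length(char, encoding):
    """Return the encoded size in bytes, or None if the encoding
    has no representation for the character."""
    try:
        return len(char.encode(encoding))
    except UnicodeEncodeError:
        return None

# (description, character, a legacy encoding designed for that script)
samples = [
    ("Greek alpha", "α", "iso8859_7"),
    ("Hebrew alef", "א", "iso8859_8"),
    ("Thai ko kai", "ก", "tis_620"),
    ("Japanese hiragana a", "あ", "shift_jis"),
    ("Korean han", "한", "euc_kr"),
]

for name, char, legacy in samples:
    print(name,
          "|", legacy, "->", byte_length(char, legacy),
          "| utf-8 ->", byte_length(char, "utf-8"))
```

In this sample the Greek, Hebrew, and Thai letters each take one byte in their legacy code pages and the Japanese and Korean characters take two, while in UTF-8 the same characters take two or three bytes. So "double byte" describes a particular encoding rather than the language itself.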

Because of all the different encoding schemes in use, this is a somewhat complex subject. Here’s a tutorial on character encoding schemes.

dtilque · November 21, 2003, 8:34pm

Hmm, the board software seems to have broken here, as my post was not reflected in the main GQ page.
