How Do The Chinese Alphabetize?

Or for that matter Japanese or Korean or other languages that don’t use an alphabet.

How would they file things? The closest I found when Googling was they put things in order of number of strokes on a symbol? But that sounds really cumbersome at best. I mean if a symbol for a word has 5 strokes there must be hundreds of words that can have 5 strokes.

From what I know…

Orderings in Chinese are indeed a little obscure and probably hard to work out quickly, so ‘cumbersome’ isn’t a bad term. I heard that it took a while to settle the order of the countries for the 2008 Olympics opening ceremony, though that might be exaggerated.

Korean uses an ‘alphabet’ system much more like the Latin/Cyrillic/Greek alphabets than the Chinese logograms, at least in the number and complexity of its letters. Japanese can apparently use either a short alphabet or the kanji derived from Chinese, but ordering is almost never based on the kanji.

One notable point is that in Korean or Japanese, the letter orders in the alphabet are not as firmly defined as they are for English - there are different orders used in South Korea and North Korea, for instance.

You can’t really compare Chinese, Japanese and Korean–they all have very different ways of writing.

I don’t know about Chinese, but Japanese uses the phonetic order of hiragana and katakana, which are syllabaries and have an order the way the Roman alphabet does. So if you want to look up a word in a Japanese dictionary, you’ll index it with the phonetic character in hiragana or katakana, even though you write it with the Chinese character. From what I understand, Chinese logographs as used in Japanese writing (kanji) all have a phonetic equivalent that can be written in hiragana or katakana.

Korean has a phonemic alphabet (and rarely uses Chinese characters today) which also has an order like Roman alphabetical order. It was created by a group of 15th-century linguists, gathered together by King Sejong, who realized that it didn’t make sense to write the sounds of the Korean language with Chinese characters.

The issue of Chinese alphabetization had wide-ranging effects in the old Chinese empire. Basically, each bureaucrat had his own system of filing, which he might share with his eldest son but few others. This created job security for generations, but also entrenched the bureaucracy so it was hard to reform.

These two links may be of help

Why Chinese Is So Damn Hard (The whole article is good, that alphabetization part is half-way through).

The Need for an Alphabetically Arranged General Usage Dictionary of Mandarin Chinese: A Review Article of Some Recent Dictionaries and Current Lexicographical Projects (this is an abstract; the 32-page PDF article is here)

There is no single way of doing it. In his article, Mair lists 14 systems by name, plus countless more.

Japanese characters are normally encountered in the gojūon order. Note that they are read top to bottom, right to left from this page, so the order starts “a i u e o ka ki ku ke ko…” Traditionally, the iroha order was used, but this is less common in modern ordering.
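For anyone who wants to see gojūon ordering in action, here is a rough Python sketch. It assumes words written only in basic hiragana; voiced kana (が, ざ...) and small kana are left out to keep it short, and the names GOJUON, RANK and gojuon_key are made up for this example.

```python
# Basic hiragana in gojūon order (assumption: no voiced or small kana here).
GOJUON = (
    "あいうえお" "かきくけこ" "さしすせそ" "たちつてと" "なにぬねの"
    "はひふへほ" "まみむめも" "やゆよ" "らりるれろ" "わをん"
)
RANK = {kana: i for i, kana in enumerate(GOJUON)}

def gojuon_key(word):
    """Turn a word into a list of gojūon positions, one per kana."""
    return [RANK[kana] for kana in word]

words = ["さくら", "いぬ", "ねこ", "かさ", "あめ"]
sorted_words = sorted(words, key=gojuon_key)
print(sorted_words)  # ['あめ', 'いぬ', 'かさ', 'さくら', 'ねこ']
```

This is the same idea as alphabetizing: compare the first symbol, then the second, and so on, just with a different table of symbols.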

This ordering is based on the kana. Kanji is ordered based on its reading, of which each kanji usually has at least two. In a kanji dictionary, the characters are sorted by radical and stroke count, not by sound or spelling.

To expand on this, Japanese kanji can be sorted a few different ways. The list in the index of O’Neill’s Essential Kanji first sorts them by total number of strokes (this is one reason why knowing the proper way to write each character is important; native speakers get tripped up on this a lot, especially with the prevalence of automatic conversion software). Kanji dictionaries for native speakers may omit this step.

It then sorts them by the type of primary radical (many characters are made up of multiple smaller elements, but one is recognized as the ‘defining’ radical): whether it’s left-half, top-half, surrounding, or “other” (such as when the entire character is one radical).

Within that division, they are then sorted by the number of strokes in the main radical: First left-hand radicals from 1 stroke to x strokes, then top-half radicals from 1 stroke to x strokes, etc. Although there is an official “sequence” for the radicals which is listed in the back of the book, O’Neill doesn’t use it to order radicals with the same number of strokes.

I don’t know how kanji are further sorted beyond this point, but by then it’s usually narrowed down far enough that picking out the one you’re looking for is pretty easy.
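The three-level sort described above (total strokes, then radical position, then strokes in the radical) can be sketched as a tuple sort key in Python. This is only an illustration of the scheme as I described it, not O’Neill’s actual index; the handful of entries were typed in by hand, and POSITION_RANK, KANJI and index_key are names invented for the example.

```python
# Order of radical positions, as described in the post above.
POSITION_RANK = {"left": 0, "top": 1, "surrounding": 2, "other": 3}

# (kanji, total strokes, radical position, strokes in the radical)
# Hand-entered sample data, deliberately listed out of order.
KANJI = [
    ("林", 8, "left", 4),         # radical 木
    ("回", 6, "surrounding", 3),  # radical 囗
    ("安", 6, "top", 3),          # radical 宀
    ("山", 3, "other", 3),        # the whole character is the radical
    ("好", 6, "left", 3),         # radical 女
    ("休", 6, "left", 2),         # radical 亻
]

def index_key(entry):
    """Sort by total strokes, then radical position, then radical strokes."""
    _, total, position, radical_strokes = entry
    return (total, POSITION_RANK[position], radical_strokes)

index_order = "".join(k for k, *_ in sorted(KANJI, key=index_key))
print(index_order)  # 山休好安回林
```

Characters that tie on all three keys stay in their input order here, which matches the point where you would just scan the remaining few by eye.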

This question has already been discussed here before. The short answer is that the traditional way of sorting Chinese characters is by radical (Chinese characters are all derived from a limited number of radicals) and then by number of strokes.
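As a small illustration of radical-then-stroke ordering, here is a Python sketch. It assumes each character is tagged with its Kangxi radical number and the number of strokes beyond the radical; the six entries are hand-entered samples, and ENTRIES and radical_stroke_key are names made up for this example.

```python
# (character, Kangxi radical number, strokes beyond the radical)
# Hand-entered sample data, deliberately listed out of order.
ENTRIES = [
    ("河", 85, 5),  # radical 水
    ("休", 9, 4),   # radical 人
    ("安", 40, 3),  # radical 宀
    ("林", 75, 4),  # radical 木
    ("仁", 9, 2),   # radical 人
    ("好", 38, 3),  # radical 女
]

def radical_stroke_key(entry):
    """Sort by radical first, then by the remaining stroke count."""
    _, radical, extra_strokes = entry
    return (radical, extra_strokes)

radical_order = "".join(
    ch for ch, _, _ in sorted(ENTRIES, key=radical_stroke_key)
)
print(radical_order)  # 仁休好安林河
```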

Another more modern system is to classify them by their pinyin romanization.
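Sorting by pinyin is close to ordinary alphabetizing once each character has a romanization attached. A minimal Python sketch, assuming a hand-made lookup table (PINYIN is a made-up name, not a real library) and assuming that tone number breaks ties between identical syllables, which is one common convention rather than the only one:

```python
# Character -> (pinyin syllable, tone number); hand-entered sample data.
PINYIN = {
    "休": ("xiu", 1),
    "马": ("ma", 3),
    "安": ("an", 1),
    "林": ("lin", 2),
    "妈": ("ma", 1),
    "好": ("hao", 3),
    "河": ("he", 2),
}

# Sort alphabetically by syllable, then by tone for identical syllables.
pinyin_order = "".join(sorted(PINYIN, key=PINYIN.get))
print(pinyin_order)  # 安好河林妈马休
```

Characters with the same syllable and the same tone would still need some further tiebreaker, such as stroke count.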

This came up during the Olympics this summer, with regard to the order countries marched in the parade of nations.