Do all languages have something similar to alphabetical order?

I’m not in the mood just this instant, but one could put a bunch of Chinese words into Excel, sort them, and see what happens.

I suspect they would be sorted in the order of the Unicode values of the characters which make up the words.
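For what it's worth, here is roughly what "sorted by Unicode value" would look like. This is a minimal Python sketch, not a claim about what Excel does internally; the sample words are just ones I picked for illustration.

```python
# Python compares strings code point by code point, so plain sorted()
# is effectively "sort by Unicode value".
words = ["生", "大人", "大", "先"]  # sample words chosen for illustration

by_code_point = sorted(words)

for word in by_code_point:
    # Show the code points that drive the comparison.
    print(word, [f"U+{ord(ch):04X}" for ch in word])
```

The resulting order has nothing to do with pronunciation or stroke count; it just follows wherever the characters happen to sit in the Unicode tables.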

Nope. I suspect the program remembers what you typed to get the character and orders them according to that.

I inputted the following:

  1. 大 (by typing oo)
  2. 大 (by typing dai)
  3. 先 (by typing saki)

Excel correctly sorted them 1, 3, 2.
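One way a 1, 3, 2 result could come about is if the sort key is a stored reading rather than the character itself. A minimal Python sketch of that idea, assuming each entry carries a kana reading next to its text (the tuple layout and the kana spellings are my assumptions for illustration, not something I have confirmed about Excel):

```python
# Each entry: (label from the list above, displayed text, assumed kana reading).
entries = [
    (1, "大", "おお"),  # typed as "oo"
    (2, "大", "だい"),  # typed as "dai"
    (3, "先", "さき"),  # typed as "saki"
]

# Sort on the reading instead of the displayed character.
by_reading = sorted(entries, key=lambda entry: entry[2])
order = [label for label, _text, _reading in by_reading]
print(order)
```

Hiragana code points happen to follow the usual kana order closely enough for this small example, so sorting the readings as plain strings is sufficient here.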

Computer translation is fun:

Google Translate renders this as “Well, everyone depending on your social security number, mother’s Alliance”, which likely means you aren’t very rusty at all, given the sad state of Chinese-to-English translation in modern automated systems.

The larger question is, why would people know their mother’s social security number offhand?

I doubt it. How you type the character isn’t stored by any program I’m aware of, and certainly isn’t something inherent to standard character encodings like UTF-8. If Excel happens to store this information, then it’s going to give you different sorting results from files produced with Excel and those imported from another format (such as CSV).

Okay. So how does Excel correctly differentiate 大 (oo) and 大 (dai) for alphabetization purposes?

Interesting. I just used The Alphabetizer to sort that list (I used C&P) and got 3, 1, 2. I’ve been using that site for a couple of years to sort Korean words in hangeul and it’s never been incorrect (using the current South Korean sorting order).

Excel does the same if you C&P the three characters into it. It’s something in the typing that seems to make the difference.

Are you sure it does? That is, have you constructed a test case which eliminates the possibility that you got the correct result through chance, or through some other dependent condition?

The two characters you’ve posted have exactly the same Unicode point, so if they were copied and pasted from this web page into any spreadsheet program (including Excel), it would have no way of knowing which was which.
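This is easy to check. A small Python sketch (the variable names are mine): once the text has been pasted, both copies of 大 are the same string, so a sort on the text alone can only keep them in their original relative order.

```python
typed_as_oo = "大"   # pasted from the post, originally typed as "oo"
typed_as_dai = "大"  # pasted from the post, originally typed as "dai"

# Same code point, same UTF-8 bytes: nothing records how it was typed.
same_text = typed_as_oo == typed_as_dai
same_bytes = typed_as_oo.encode("utf-8") == typed_as_dai.encode("utf-8")

# Sort the pasted list on the character only. Python's sort is stable,
# so the two identical entries stay in their original order.
pasted = [(1, "大"), (2, "大"), (3, "先")]
pasted_order = [label for label, _ch in sorted(pasted, key=lambda e: e[1])]
print(same_text, same_bytes, pasted_order)
```

That gives 3, 1, 2, because 先 has a lower code point than 大.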

Yep. I’ve tried a number of different things, and it has never failed to produce the expected result. Most notably, I made a more elaborate test using 7 different pronunciations for the character 生. Excel correctly sorted them into the order ikiru, umu, ki, sho, sei, nama, hae.

Indeed :) (that was my point in my last post).

Okay, I get 3, 1, 2 in Excel, too, which is consistent with a plain sort on Unicode values. To be fair, I tried to set the language to Chinese, but apparently I don’t have the proofing tools for Chinese installed, so there’s that. The hanzi were C&P’d from this thread directly.

Of course they would if they were always sorted that way. Which seems likely.

In my HS, there were two kids born on the same date with the same first, middle, and last names. They were sorted by the name of the street they lived on. Suppose one moved?

Steven Wright once pondered: Why is the alphabet in that order? Is it because of that stupid song?

Joking aside, I believe letters doubled as numbers (besides Roman numerals). So then, the numeric order carried over as the alphabetic order.

Here’s the Wikipedia article about how alphabetical order works in various countries:

Just noting that Spanish* recently (as of 2010) changed its sorting rules for ll and ch. They used to be considered one letter each (‘ll’ followed ‘l’ in the alphabet, so ‘ly…’ came before ‘lla…’ in alphabetical order), but are now considered two letters for alphabetical order purposes.

  * That is, an international association of Spanish-language organizations, whose guidance libraries and such presumably accept.
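To make the difference concrete, here is a rough Python sketch of the two orderings. The alphabets, the helper make_key, and the word list are my own illustration, not an official collation table, and the key only handles unaccented lowercase letters.

```python
# Current alphabet: ch and ll are just c+h and l+l.
NEW_ALPHABET = list("abcdefghijklmnñopqrstuvwxyz")

# Older convention: "ch" ranks right after "c", "ll" right after "l".
OLD_ALPHABET = []
for letter in NEW_ALPHABET:
    OLD_ALPHABET.append(letter)
    if letter == "c":
        OLD_ALPHABET.append("ch")
    elif letter == "l":
        OLD_ALPHABET.append("ll")

def make_key(alphabet):
    """Build a sort key that ranks letters (and digraphs) by alphabet position."""
    rank = {letter: i for i, letter in enumerate(alphabet)}

    def key(word):
        word = word.lower()
        ranks, i = [], 0
        while i < len(word):
            pair = word[i:i + 2]
            if len(pair) == 2 and pair in rank:  # digraph counted as one letter
                ranks.append(rank[pair])
                i += 2
            else:                                # ordinary single letter
                ranks.append(rank[word[i]])
                i += 1
        return ranks

    return key

words = ["llama", "luz", "chico", "cuna", "lobo"]
old_order = sorted(words, key=make_key(OLD_ALPHABET))
new_order = sorted(words, key=make_key(NEW_ALPHABET))
print(old_order)
print(new_order)
```

Under the old key, everything starting with plain c or l comes before the ch and ll words; under the new key they interleave the way an English speaker would expect.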

Nope, the Latin alphabet is derived from the Greek, which in its turn has its roots in the Phoenician alphabet and the order was there already from the beginning.

Anyway, sorting is not as straightforward as one might think it is. For instance, how do you sort numbers? 1, 2, 3, 11, 22, 33 … or 1, 11, 2, 22, 3, 33 …? Do you want to omit leading definite and indefinite articles, etc.? I once sat in a working group to decide about a standard for sorting (in Sweden). There were people from various fields (I represented libraries) and in the end we came to the conclusion that it was impossible to have a standardised way of sorting as we all had our different traditions and needs.
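Both of those choices are easy to show in a few lines. A minimal Python sketch; the article list and the book titles are made up for illustration, and a real catalogue would need per-language rules.

```python
numbers = [1, 2, 3, 11, 22, 33]

# Sorted as numbers: 1, 2, 3, 11, 22, 33
numeric_order = sorted(numbers)

# Sorted as text, character by character: 1, 11, 2, 22, 3, 33
text_order = sorted(str(n) for n in numbers)

# Hypothetical list of leading articles to ignore (English only).
ARTICLES = ("the ", "a ", "an ")

def title_key(title):
    """Sort key that skips one leading article, case-insensitively."""
    lowered = title.lower()
    for article in ARTICLES:
        if lowered.startswith(article):
            return lowered[len(article):]
    return lowered

titles = ["The Hobbit", "A Tale of Two Cities", "Emma", "An Ideal Husband"]
title_order = sorted(titles, key=title_key)
print(numeric_order, text_order, title_order)
```

Neither ordering is wrong; they just encode different traditions, which is pretty much the conclusion the working group reached.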

I’ve forgotten, but they definitely order them by number of strokes. It may go by stroke order within groups. They write the strokes in a certain order and this may determine “alphabetical” order.

Or they can use Pinyin.
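The two approaches can give different orders for the same characters. A tiny Python sketch, assuming a hand-entered lookup table (the table and its four entries are mine; a real program would need a proper dictionary, and tones are ignored here):

```python
# Hand-entered illustration data: character -> (stroke count, toneless Pinyin).
HANZI_INFO = {
    "人": (2, "ren"),
    "大": (3, "da"),
    "生": (5, "sheng"),
    "先": (6, "xian"),
}

chars = ["先", "生", "大", "人"]

# Fewest strokes first.
by_strokes = sorted(chars, key=lambda ch: HANZI_INFO[ch][0])

# Alphabetical by Pinyin spelling.
by_pinyin = sorted(chars, key=lambda ch: HANZI_INFO[ch][1])

print(by_strokes)
print(by_pinyin)
```

Here 人 comes first by stroke count but second by Pinyin, so which "alphabetical order" you get depends on which scheme the dictionary or program picked.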