Forgive me if this seems dopey, but it has been over a quarter century since I’ve been to school (college wasn’t school; it was a kegger!).
What is the rule when making a list alphabetically, when one word contains a hyphen?
Why does Wal-Mart come before Walgreens in the business white pages of the telephone book?
My last name is McCxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
and it always gets screwed up on alphabetizing
Sometimes it’s alphabetized as two words as in Mc Cxxxxxxxxxx
and sometimes it’s alphabetized as one word… McCxxxxxxx
Where that becomes a problem is when searching on a computer by last name. Since you never know how it was keyed in, I frequently just have to search by Mc and my first name.
Prior to computerization, the general rule was punctuation followed by lower case letters followed by upper case letters. (It was that convention that contributed to how the ASCII assignments were made for the various characters.) Subsequent to computerization, the lists follow the ASCII (or, on IBM or Burroughs mainframes, EBCDIC) patterns. Interestingly, the rules for numbers were not consistent, as demonstrated by the fact that ASCII places numbers before letters while EBCDIC places numbers last of all. Find textbooks from thirty years ago and you will see numbers placed at either the beginning or end of the index in different books.
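For anyone who wants to see the ASCII pattern rather than take it on faith, here is a minimal Python sketch. The sample strings are made up for illustration. Python's plain sorted() compares code points, which for these characters is the same as ASCII order. Note that in ASCII itself the hyphen comes before the digits, the digits before upper case, and upper case before lower case.

```python
# Plain sorted() compares strings code point by code point. For these
# characters that is ASCII order: '-' (45), digits (48-57),
# upper case (65-90), lower case (97-122).
samples = ["walgreens", "Walgreens", "Wal-Mart", "3M", "apple"]
ascii_order = sorted(samples)
print(ascii_order)

# The code points that drive the order above.
for ch in "-0Aa":
    print(repr(ch), ord(ch))
```

The hyphen's low code point is one reason a computer-sorted list puts Wal-Mart ahead of Walgreens.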
There are no “rules” for alphabetization. There are only styles.
Styles come in all shapes and sizes. Some call for strict letter order across entire entries. Some separate out entries with breaks - whether a space or a hyphen or other not-letter character - from those that are longer.
So whoever is putting Wal-Mart before Walgreens is treating the hyphen as a break, so that “Wal” on its own sorts ahead of the longer “Walgreens.”
As long as you apply a style consistently it doesn’t matter which one you use.
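The two styles described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names letter_by_letter and word_by_word are made up, the name list is a sample, and anything that is not a letter counts as a break.

```python
import re

def letter_by_letter(name):
    # Strict letter order: ignore case and drop every non-letter.
    return re.sub(r"[^a-z]", "", name.lower())

def word_by_word(name):
    # Break the entry at spaces, hyphens and other non-letters, then
    # compare word by word, so a short first word sorts ahead of a
    # longer word that merely begins with it.
    return [w for w in re.split(r"[^a-z]+", name.lower()) if w]

names = ["Walgreens", "Wal-Mart", "Wallace", "Walden"]
strict = sorted(names, key=letter_by_letter)
by_word = sorted(names, key=word_by_word)
print(strict)   # Wal-Mart lands after Wallace
print(by_word)  # Wal-Mart lands first
```

Both results are "alphabetical"; they just follow different styles.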
Yeah, computerization has made the whole thing a lot more complicated.
Before, I was always taught that Mc and Mac were both alphabetized as if they were “Mac.”
So, alphabetical order would be:
Macadam
McDonald
Macpherson
Marsden
Now, of course, a computer would order them:
Macadam
Macpherson
Marsden
McDonald
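Both orderings of those four names can be reproduced with a small Python sketch. The helper name mc_as_mac is my own, and it only handles a leading "Mc"; it is not meant as a complete name-filing rule.

```python
names = ["Marsden", "McDonald", "Macpherson", "Macadam"]

def mc_as_mac(name):
    # File a leading "Mc" as if it were spelled "Mac", ignoring case.
    lowered = name.lower()
    if lowered.startswith("mc"):
        lowered = "mac" + lowered[2:]
    return lowered

taught_order = sorted(names, key=mc_as_mac)  # the "Mc as Mac" style
computer_order = sorted(names)               # plain character-code order
print(taught_order)
print(computer_order)
```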
With numerical order, it’s got even more complicated within computing. For example, on my old computer, running Windows ME, files with the following numbers would have been ordered in the folder thus:
01.jpg
011.jpg
1.jpg
12.jpg
23.jpg
4.jpg
But Windows XP’s file arranging system attempts to sort numerically-named files in a more intuitive manner, so on my current computer these files are ordered:
And you’ll notice that I never said “the rule was.” I said “I was taught.”
You yourself have said that “As long as you apply a style consistently it doesn’t matter which one you use.” Well, the best way to be consistent is to stick with a single style unless given a good reason to do otherwise. This is probably why my school teachers figured that they would teach us one style, so at least we would all be consistent in school. That’s not bad teaching, it’s smart teaching.
To clarify: Does this mean that the order was abc…xyzABC…XYZ, or that it was aAbBcC…xXyYzZ?
I’ve also seen M[sup]c[/sup] treated as a single character, which falls in between M and N. Occasionally, one even sees names starting in “Mac” treated as if it were this special character.
And to follow on to mhendo, OSX 10.4 uses the “new” system of sorting numbers, too. It’s nice when foo10 follows foo9 instead of being between foo1 and foo2, but it’s annoying when you get bar01 bar1 bar02 bar2 bar03 bar3.
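A rough Python sketch of the "new" numeric-aware sorting, for comparison with plain character-by-character sorting. The function name natural_key is my own, and this is just one common way to do it; I am not claiming it is the exact algorithm Windows XP or OS X uses, and their handling of ties like 01 versus 1 may differ.

```python
import re

def natural_key(name):
    # Split into alternating text and digit runs. With a capturing
    # group, odd positions are always the digit runs, which are
    # compared as integers; text runs are compared case-insensitively.
    parts = re.split(r"(\d+)", name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

files = ["01.jpg", "011.jpg", "1.jpg", "12.jpg", "23.jpg", "4.jpg"]
old_style = sorted(files)                   # character by character
new_style = sorted(files, key=natural_key)  # digit runs as numbers

bars = ["bar01", "bar02", "bar03", "bar1", "bar2", "bar3"]
bars_new = sorted(bars, key=natural_key)
print(old_style)
print(new_style)
print(bars_new)
```

Because 01 and 1 are the same number, they tie, and Python's stable sort leaves tied names in their incoming order. That is how you end up with the interleaved bar01 bar1 bar02 bar2 pattern.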
benAvram
binAbdul
Binabally
Subject to Exapno’s caveat regarding styles vs rules, you can see how they tended to work by looking at the ASCII chart. As noted, M[sup]c[/sup] and M[sup]ac[/sup] were often given separate treatment that did not conform to a consistent pattern across all usages. Similarly, O’Donnell would tend to have appeared before Odalisque (the apostrophe trumping the capital D), but it really did depend on the style of the publisher. (For example, benAvram and binAbdul were liable to be treated as Benavram or BenAvram and Binabdul or BinAbdul by publishers unfamiliar with Middle Eastern nomenclature.)
Actually, many of those will sort numbers as if they were written out. That is, the number 1 will be sorted between ond and onf[sup]1[/sup] even if written as a digit. Dictionaries still do this. For example, M-W Collegiate Dictionary has the entry 12-step immediately after twelve-month and 24/7 immediately after twenty-fourmo. Note that most dictionaries also ignore punctuation and internal spaces in their alphabetizations.
[sup]1[/sup] M-W doesn’t have any words starting with either ‘ond’ or ‘onf’. The words immediately before and after the ‘one-’ section are oncoming and ongoing, which look like they should be antonyms, but aren’t.
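The dictionary style described above (numbers filed as if spelled out, punctuation and spaces ignored) can be sketched in Python. The SPELLED table is a hypothetical hand-written lookup covering only the digits in this example; a real implementation would need a proper number-to-words routine, and the function name dictionary_key is my own.

```python
import re

# Hypothetical lookup covering only the numbers used below; any other
# number would raise KeyError.
SPELLED = {"12": "twelve", "24": "twentyfour", "7": "seven"}

def dictionary_key(entry):
    # Replace each run of digits with its spelled-out form, then drop
    # everything that is not a letter (hyphens, slashes, spaces).
    spelled = re.sub(r"\d+", lambda m: SPELLED[m.group()], entry.lower())
    return re.sub(r"[^a-z]", "", spelled)

entries = ["24/7", "twelve-month", "twenty-fourmo", "12-step"]
ordered = sorted(entries, key=dictionary_key)
print(ordered)
```

With these four entries the sketch puts 12-step right after twelve-month and 24/7 right after twenty-fourmo, matching the M-W examples.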
Back about 25 years ago, part of my job was filing cards into library catalogues, so I can remember what the rules were then, and I’ve seen them change as computers took over the job.
Basically, it has always been filing word by word. (Though most dictionaries file letter by letter, so that it doesn’t matter if you have “black bird”, “black-bird” or “blackbird”). However, back in those card filing days, the library had a rule that hyphens were treated as spaces if each part was a word, and ignored if one of the parts wasn’t a word. Of course, that’s a rule that you can’t give to computers.
We used to file “Mc” as “Mac” – that’s another rule that’s gone.
Numbers were filed as if written in full in the language of the title*. So you had to be a bit of a language expert: the movie title “8 1/2” was filed as “Otto e mezzo”, because, of course, it was in Italian. I remember having a Serbian title starting with a number, and having to work out what the number was in Serbian … computers have taken all that fun away from us.
And then there is the problem of abbreviations like “U.S.A.”: do you file that as if it is “Usa” or as if it is “U S A”?
This gets even more fun when you begin to consider non-US character sets.
The new international standard for encoding and storing alphabetical (& syllabic) characters is something called Unicode, which includes details on such things as how to alphabetize Thai.
The technical term for alphabetizing is “collation”, and here’s more than you’d ever want to know. http://www.unicode.org/reports/tr10/ . The first couple of pages’ worth of text give a good non-technical introduction to the problem of internationally standardizing something that’s far from standard. Even with a common alphabet, the Germans collate differently from the Swedes.
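To make the German versus Swedish point concrete, here is a toy Python sketch. It is not the Unicode Collation Algorithm from the linked report, just two hand-rolled sort keys under simplifying assumptions: the German key follows the dictionary-style convention of filing ä, ö, ü with a, o, u, and the Swedish key treats å, ä, ö as separate letters after z. The names german_key and swedish_key are my own, and the Swedish key only handles the letters in its alphabet string.

```python
import unicodedata

SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})

def normalized(word):
    # Use precomposed characters so "ä" is one code point, then lower-case.
    return unicodedata.normalize("NFC", word).lower()

def german_key(word):
    # Dictionary-style German: umlauts file with their base vowels.
    return normalized(word).translate(GERMAN_FOLD)

def swedish_key(word):
    # Swedish: å, ä, ö are letters in their own right, after z.
    return [SWEDISH_ALPHABET.index(ch) for ch in normalized(word)]

words = ["Zebra", "Apfel", "Äpfel", "Öl", "Ofen"]
german_order = sorted(words, key=german_key)
swedish_order = sorted(words, key=swedish_key)
print(german_order)
print(swedish_order)
```

Same five words, same alphabet on the keyboard, two different "correct" orders.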
You teach the concept of alphabetizing schemes; you teach examples of different types of alphabetizing schemes; you assign lessons that involve using one of those examples of alphabetizing schemes.
What you do not do is say, here’s the way to alphabetize*.
To be fair, I have no way of knowing whether mhendo’s teacher ever did anything that wrong or whether that’s just what got retained out of that class. What teachers teach and what pupils hear are often two wildly disparate things.
*Except to say that “you will use this method in this class and not the others.”