(1) Any machine-generated index stored electronically has no need for cross references. Users enter a word or name, and the software returns every instance of it, or perhaps jumps to the first (most likely?) target. In the case of green beans, you won’t be able to look up “beans, green” because the index generator will never have encountered that phrase. You could still enter “beans, green”, but you would get a list of hits for “beans” and a list of hits for “green”.
(2) A thorough hand-generated index is an order of magnitude better than a machine index. Just look at the index in almost any software user guide. Cross references? What are those? We humans can grasp that green beans might also need a “beans, green” entry.
With that said I prefer an electronic “index” because it’s a lot faster than a manual lookup.
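To make point (1) concrete, here is a minimal sketch of a word-level index of the kind a machine might build. Everything in it is an assumption for illustration: the page contents are invented, and `build_word_index` and `lookup` are hypothetical helpers, not the behavior of any particular product.

```python
import re
from collections import defaultdict

def build_word_index(pages):
    """Map each lowercase word to the sorted list of pages it appears on."""
    index = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(page_number)
    return {word: sorted(numbers) for word, numbers in index.items()}

def lookup(index, query):
    """Look up each word of the query separately, as a naive search box would."""
    words = re.findall(r"[a-z]+", query.lower())
    return {word: index.get(word, []) for word in words}

# Hypothetical page contents, invented for illustration.
pages = {
    12: "The village green hosts a market.",
    50: "Green beans need a trellis.",
    54: "Dried beans keep for years.",
}
index = build_word_index(pages)
result = lookup(index, "beans, green")
print(result)
```

The query comes back as one list of pages for "beans" and one for "green". Nothing in the index knows that the two words together name a single thing.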
= = = = =
A long time ago I assisted the publications department for six months. What did I do? Hand generated indexes for the manuals. Tedious work but at least I could use a search capability to find the use instances for the index entries. Best of all was that I decided what would go in the index, and I decided a lot of things needed to be in there.
I published my PhD dissertation as a book last year and had to prepare the index manually. I literally went through the offprints of the whole book, jotted down a few indexable keywords per page together with the page number, and then compiled the index on that basis. The instructions from the publisher were indeed as indicated by the OP, to list (to use the example) “Green Beans” as a sub-entry to the entry “Beans” and give the page number there; a cross-reference from “Green Beans” to “Beans, Green” was permissible, but the “Green Beans” entry should have only that reference, not the page number directly, even if this would have used up less space than the reference. It’s very dull work, I can tell you, and eats up the better part of what could otherwise have been a nice weekend.
As for the reasons to do it that way, I suppose it’s not so much anything practical but rather adherence to an abstract ideal of an edifice of logic. Index entries would be listed by keywords, and if some keyword was, logically, a subcategory of another keyword then it should be listed as such rather than as an entry in its own right, to keep the logical relations between the entries intact. That was, at least, how I justified this procedure to myself as I was doing it.
What do you mean by ‘manually’? I agree that, as it stands, you want to decide yourself what goes into the index rather than let some AI do it, but the computer still sorts the entries, calculates the page numbers, compiles the cross-references, etc. Last time I used xindy.
I did it entirely manually, actually. I read the offprints and wrote down a few keywords for each page. Sometimes I had to go back to the notes from previous pages to ensure that I wouldn’t create a new keyword for a page if an existing keyword already on my list would do the job just as well (to stay in the example, if I already had “Beans, Green” on p. 25 and then, on p. 178, stumbled across “Phaseolus vulgaris”, I wouldn’t create a new entry for “Phaseolus vulgaris” but rather write it down as an additional hit for “Beans, Green”). After this stage, I had a long list of keywords, each followed by the pages on which that keyword appeared. I would then sort the keywords alphabetically and consolidate the page hits (in the sense that a keyword appearing on pages 15, 16, 17 and 67 would be listed with the hits “15-17; 67”). I’m sure (and I was sure back then) that it could be done automatically, but I figured it would be faster to do it manually than to familiarise myself with software that would do it automatically but that I had never used before.
Actually, the easiest way to prepare an index is while writing. As I was writing a book, whenever I came upon a word that needed to be indexed I added \index{word} or \indexas{wordy}, depending on whether the word was to appear in the text as is or not. In the latter case the word had to be typed in addition. All the rest was done automatically. I don’t recall if there was a way to cross-index, though.
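For what it’s worth, the sorting and consolidating step described above is small enough to sketch in a few lines. This is only an illustration under assumptions: the function names and the sample notes are made up, and it says nothing about how xindy or any other indexing tool does the job.

```python
def consolidate_pages(pages):
    """Collapse page numbers into runs, e.g. [15, 16, 17, 67] -> "15-17; 67"."""
    runs = []
    for page in sorted(set(pages)):
        if runs and page == runs[-1][1] + 1:
            runs[-1][1] = page          # extend the current run
        else:
            runs.append([page, page])   # start a new run
    return "; ".join(
        str(start) if start == end else f"{start}-{end}" for start, end in runs
    )

def compile_index(notes):
    """Turn {keyword: [pages]} notes into alphabetically sorted index lines."""
    return [
        f"{keyword}: {consolidate_pages(pages)}"
        for keyword, pages in sorted(notes.items(), key=lambda item: item[0].lower())
    ]

# Hypothetical notes, as they might look after reading the offprints.
notes = {
    "Trellis": [67, 15, 16, 17],
    "Beans, Green": [25, 178],
}
for line in compile_index(notes):
    print(line)
```

Note that the sketch also writes two adjacent pages as a range ("15-16"); a publisher’s style sheet might want "15, 16" instead. The hard part, deciding that “Phaseolus vulgaris” belongs under “Beans, Green”, still happens in the notes, before any code runs.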
There are indexes and indexes. If the index consists entirely of proper names, then tagging them as one writes is sensible.
If the index contains concepts, groups, generalities, or isms, then it’s a lot harder to do, especially if you don’t know whether they will rise to the level of being worthy of inclusion in the end.
“\index{word|see{otherword}} and \index{word|seealso{otherword}}”
is all well and good but continues to make my case that machine generated indices don’t generate cross references. The example means the author (a human being of course) made a conscious decision to introduce a cross reference.
The example makes it easier for software to generate a better index, but the author does the work to make it happen.
When I was generating indices I more or less followed what Schnitte did. One thing I learned was how adept we humans are at remembering when cross references were needed.
An index is more than a list of nouns with their page numbers; it’s a way of finding a hierarchy of information.
If I look up ‘Green Beans’ [or worse, use CTRL + F] and it takes me to page 50, then all it tells me is that this is where the words appear in the body of the text.
If I look up Green Beans and it takes me to ‘See Beans, Green’, then it takes me to the headword ‘Beans’, which then can direct me to the history of beans, who invented beans [Antonia Fazole 1631-1703], bean recipes, other coloured beans, people named after Beans etc.
Green beans are part of the group ‘Beans’ and the subcategories in the index under that head-word relate to green beans even though they may not mention green beans specifically. Otherwise I would go to ‘Green Beans’ on p 50 hoping to find who invented them and be none the wiser, although there could be an entire chapter on Legume Inventors starting on p 54.
My wife just finished her Diabetes book, and the publisher and editor told her that indexing tools basically suck, in being unable to determine what entries are valuable and which aren’t. Maybe it wouldn’t matter for a purely electronic book, but hers had a dead tree edition also.
I think the sub-entry problem is the main reason, plus the “see” link teaches the reader how the indexing is done (there are books about this) so the next time they go straight to the proper entry.
I call green beans “string beans.” Without a “see also” reference, I might assume there wasn’t an entry for green beans in the book. Cross references provide additional access to information, especially for words with regional variants: shopping cart, buggy, trolley, basket, or wagon.
Word processing programs can’t index concepts. They create a list of words that actually appear in a manuscript. As an example, I worked with an anthology of oral histories of the Japanese American experience prior to World War II. Not one interviewee mentioned the words prejudice or racism; however, their experiences provided multiple illustrations. It’s another reason why indexing is best done (IMHO) by hand.
Word processing programs can do a better job of indexing than that. At least, the better WP apps can. As you type in the manuscript, you have to manually tag which words you want to have in the index (I think you can tag whole phrases too). The app takes it from there, collecting a list of all the words you tagged and building the index, and filling in the right page numbers.
What @Senegoid said. In case it was not clear, you can \index whatever you want. The word or phrase need not actually appear in the text. For example, \index{Napoleon!rise to power} will result in an index entry like
Napoleon
    rise to power, 123
regardless of whether ‘Napoleon’ or ‘rise to power’ literally appears on page 123.
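To illustrate how tagged keys like the one above turn into nested entries, here is a rough Python sketch. It only mimics the `!` convention for sub-entries; `render_index` and the sample tags are hypothetical, and this is not how makeindex or xindy are actually implemented.

```python
from collections import defaultdict

def render_index(tags):
    """Render (key, page) tags as a two-level index; '!' splits entry and sub-entry."""
    entries = defaultdict(lambda: defaultdict(set))
    for key, page in tags:
        main, _, sub = key.partition("!")
        entries[main][sub].add(page)   # sub is "" for a plain top-level entry

    lines = []
    for main in sorted(entries, key=str.lower):
        subs = entries[main]
        own_pages = ", ".join(str(p) for p in sorted(subs.get("", ())))
        lines.append(f"{main}, {own_pages}" if own_pages else main)
        for sub in sorted(name for name in subs if name):
            pages = ", ".join(str(p) for p in sorted(subs[sub]))
            lines.append(f"    {sub}, {pages}")
    return lines

# Hypothetical tags: each pair is the key an author typed and the page it fell on.
tags = [
    ("Napoleon!rise to power", 123),
    ("Napoleon!exile", 300),
    ("Beans", 50),
]
for line in render_index(tags):
    print(line)
```

The software only groups and sorts what it is handed. Every key in `tags` is still a decision the author made while writing.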
What is not consistently working (yet) is an engine that “understands” concepts like “Napoleon’s treatment by the British on Saint Helena” and can find all relevant pages in the book, or that can automatically decide for itself what words, concepts, and cross-references are relevant and generate an optimal index.
Yes, that’s what I attempted to explain by saying that word processing programs can’t index. What I should have said was that they aren’t able to “automatically” decide on their own. More precise language, it is!
My first job, at fifteen, was as a page in the local library. I was hired by Miss Milner, the last of the old-fashioned librarians. I have to assume her first name was “Miss” because I never found out any other, and it never occurred to me at any time that there might be one.
I agree, and I’d like to add that part of my reason for doing the index manually was a certain level of distrust towards too much automation in my thesis for precisely the reasons you mentioned. Another instance was source referencing in the footnotes. Word includes an option whereby you enter bibliographical details for the sources you used into a dialog box, indicating which type of source it was (a monograph, a book chapter, a journal paper, a court case etc.). In your footnote, you would then select the source you want to cite and indicate the precise page number. The idea is that you can select the desired citation style from a drop-down menu, and Word would automatically generate the footnotes as well as the bibliography. I experimented with this feature, but ultimately came to distrust it, and decided that I would type the footnotes and the bibliography manually as I was going along writing. More work, but certainly more control over the output. It was the same type of distrust for the automatic feature that was part of why I did the index manually.