How did archivists cross reference info before databases?

sweeteviljesus · July 31, 2015, 5:17pm

I was reading The Day of the Jackal, which describes the French Sûreté Nationale and its records department. The book said that although the files were not yet computerized, the archivists prided themselves on their ability to return answers to general queries in 10 minutes. What did they do exactly?

(I was in college before I found out that librarians did anything more than check out books.)

Thanks,
Rob

Zsofia · July 31, 2015, 5:19pm

I don’t know about that kind of record specifically, but there used to be a LOT more hand indexing and that sort of thing. (Or see also the way newspapers used to keep their morgues as clipping files that were then microfilmed; you’d put a clipping in both ACCIDENTS - BUSES and ACCIDENTS - CARS if a car hit a bus.)

TriPolar · July 31, 2015, 5:28pm

Filing cabinets. Lots and lots of filing cabinets. And lots and lots of people who filed the papers and retrieved them from the filing cabinets. And lots and lots of typewriters and that miraculous invention carbon paper.

Those were the databases. Rarely as well indexed as we now expect from computer databases, but still based on the same basic principles.

PastTense · July 31, 2015, 6:52pm

Are you acquainted with the library card catalog? Read this Wikipedia entry:

So you have a document and describe it with a half dozen indexing terms. You prepare a 3" x 5" catalog card for each of these half dozen indexing terms and then physically file these cards in the catalog. Then a searcher looking for a particular term will see all the cards filed under that term.

Note the big advantage of the computerized system: with the traditional system you can only search for the limited few terms for which a catalog card was made. With the computerized system you can do a search for a much larger number of tags.
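As a rough illustration of the card scheme described above, here is a minimal Python sketch. It treats each catalog card as one (indexing term, document) pair, which is essentially an inverted index. The document titles, the terms, and the function names `file_cards` and `look_up` are all made up for the example.

```python
from collections import defaultdict


def file_cards(documents):
    """Make one 'card' per (indexing term, document) pair and file it under the term."""
    drawers = defaultdict(list)
    for title, terms in documents.items():
        for term in terms:
            drawers[term.lower()].append(title)
    # Cards in a drawer were kept in order, so sort the titles under each term.
    return {term: sorted(titles) for term, titles in drawers.items()}


def look_up(catalog, term):
    """Return every title filed under a term, or nothing if no card was ever made."""
    return catalog.get(term.lower(), [])


# Hypothetical documents, each described by a handful of indexing terms.
documents = {
    "Bus Collides with Car on Main Street": ["accidents - buses", "accidents - cars"],
    "City Adds New Bus Routes": ["buses", "public transit"],
}

catalog = file_cards(documents)
print(look_up(catalog, "Accidents - Cars"))
print(look_up(catalog, "collision"))  # no card was made for this term, so nothing is found
```

The second lookup shows the limitation mentioned above: a term that nobody made a card for simply cannot be searched.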

PatrickLondon · July 31, 2015, 6:56pm

Card indexes pointing to files of source documents; at the simplest, alphabetical indexes with a separate card for each likely search category. In an early form of mechanisation, Hollerith punched cards were used to put multiple sort categories onto one record. Later, manual mechanical “analysis” of cards was replaced by automatic readers linked to computers. I can remember punching data onto cards to be read into a dial-up phone line for entry to a remote computer.

bump · July 31, 2015, 7:29pm

I imagine it was a combination of excellently organized file cabinets and indexes and very knowledgeable, well-trained archivists, so that depending on the request, they would have a good idea of where to look in the index, and also where that was in the actual records storage.

Finagle · July 31, 2015, 7:47pm

It’s probably the same problem we’re faced with when asking Google questions: if you know the right keywords, you can find anything. But if you don’t know the questions to ask, then you end up with either the wrong references or too much material. A good archivist would be familiar with the indexing system and would know the right questions to ask.

CalMeacham · July 31, 2015, 8:11pm

I love Day of the Jackal, but I have no idea how they were cross-indexing. File cards is the best guess.
I grew up in the pre-computer age, so I have plenty of experience using various indices. What I say here isn’t relevant to finding a hired assassin holed up in a French pension, but it does show how the indexing works.

They used to publish listings of articles and abstracts in big hardbound volumes – *Physics Abstracts, Chemical Abstracts, Biological Abstracts, The Cumulated Index Medicus* – and I’ve used all of these. But the biggest and best of all was the Science Citation Index. This listed journal articles from a huge number of publications, and every quarter of the year they issued a new set of paperbound volumes that not only had the listings of the articles, but also indexed them by subject, title, and authors. Once you found a useful keyword, you could look for others listed under that keyword. Or once you found a good set of authors, you could look up their other works. Or those of their co-authors. There was also an index of who had cited which articles from the past, so you could follow chains of references forward and backward.
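To make the forward and backward chaining concrete, here is a small Python sketch. It is only an illustration: the papers are invented placeholders, and `build_citation_index` and `follow_forward` are hypothetical names, not anything taken from the Science Citation Index itself. Each paper's own reference list gives the backward links; inverting those lists gives the "who cited this" index used to go forward.

```python
from collections import defaultdict

# Hypothetical papers: each one maps to the earlier papers it cites (backward links).
references = {
    "Paper C (1975)": ["Paper A (1960)", "Paper B (1968)"],
    "Paper B (1968)": ["Paper A (1960)"],
    "Paper D (1980)": ["Paper C (1975)"],
}


def build_citation_index(references):
    """Invert the reference lists: for each cited paper, list the papers that cite it."""
    cited_by = defaultdict(list)
    for paper, cited in references.items():
        for earlier in cited:
            cited_by[earlier].append(paper)
    return {paper: sorted(citers) for paper, citers in cited_by.items()}


def follow_forward(cited_by, start):
    """Follow 'cited by' links breadth-first to collect all later work that builds on start."""
    found, queue, seen = [], [start], {start}
    while queue:
        current = queue.pop(0)
        for later in cited_by.get(current, []):
            if later not in seen:
                seen.add(later)
                found.append(later)
                queue.append(later)
    return found


cited_by = build_citation_index(references)
print(references["Paper C (1975)"])                # backward: what Paper C cited
print(follow_forward(cited_by, "Paper A (1960)"))  # forward: everything downstream of Paper A
```

With the printed volumes, each hop in that loop was a separate trip to the shelf.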

Every year, they’d publish the results from all quarters smooshed together in a yearly collection, hardbound. Every decade they’d publish a set of hardbound volumes doing this for the previous ten years.
All of these indices took up a lot of space. The first floors of most college libraries were filled with card catalogs and shelves filled with these annual or decadely indices of abstracts. And, of course, the journals themselves were in hardbound volumes, usually in the basement or tucked away in the back. Or, possibly, on microfilm or microfiche.
For non-technical subjects, there were newspaper indices (the New York Times published its own every year), the Reader’s Guide to Periodical Literature, and the 19th Century Guide to Periodical Literature.

There were some sets of abstracts for the various humanities, but even in the computer age these were slower to develop. When I researched Medusa, I had little to draw on. I was reduced to going into the stacks of bound volumes and physically looking through issues of American Journal of Archaeology and the like.

TriPolar · July 31, 2015, 8:44pm

CalMeacham:

They used to publish listings of articles and abstracts in big hardbound volumes – *Physics Abstracts, Chemical Abstracts, Biological Abstracts, The Cumulated Index Medicus* – and I’ve used all of these. But the biggest and best of all was the Science Citation Index. This listed journal articles from a huge number of publications, and every quarter of the year they issued a new set of paperbound volumes that not only had the listings of the articles, but also indexed them by subject, title, and authors. Once you found a useful keyword, you could look for others listed under that keyword. Or once you found a good set of authors, you could look up their other works. Or those of their co-authors. There was also an index of who had cited which articles from the past, so you could follow chains of references forward and backward.

The publication couldn’t have been the valuable resource it was without computers. Eugene Garfield realized that the rapidly growing number of citable papers couldn’t be comprehensively indexed without them. Current Contents would have been limited to a few areas of study in order to be published on a weekly basis. Garfield did what he could to promote online access to the databases in the 60s but ran into resistance from an academic world that didn’t want to pay for the service. Others felt that he had turned basic research into a citation game, with more value placed on the number of citations that followed than on the quality of the content. He was a library scientist, and I don’t think he cared at all about such criticism. He knew there was no practical means of making such judgments without providing access to the material in the first place, and his efforts revolutionized the field.

SirRay · July 31, 2015, 8:51pm

One of my undergraduate jobs in college work-study was filing and pulling those library catalog cards (this was the mid-1980s) - as mentioned upstream, normally there were a number of references on the cards for filing under different categories. In retrospect those catalogs were indeed database indices, although we narrow-minded EE/CS majors thought of “database” only in terms of dBase and the like.

I think popular culture back in the day sort of realized just how “unindexed” different sources of info were* (as opposed to nowadays, where everything is expected to be data-mined). My favorite example is the 1975 “Rockford Files” episode “Just by Accident”, where the insurance scam is accident claims filed by people issued fraudulent driver’s licenses under the names of children who died in infancy - when Jim Rockford asks the bureaucrat in charge of the records office (lined with many volumes of data) if the death records were cross-checked with the licenses, she just scoffs and states how much effort that would be.

*Well, at least popular culture that understood that no real-world computer would self-destruct when asked a confusing or contradictory question, but instead would just display an error message.
BTW, just looking up the name of the Rockford Files episode took me about 4 seconds - I can’t imagine how long it would have taken me in the mid-1980s…

Dewey_Finn · July 31, 2015, 8:58pm

CalMeacham:

They used to publish listings of articles and abstracts in big hardbound volumes – *Physics Abstracts, Chemical Abstracts, Biological Abstracts, The Cumulated Index Medicus* – and I’ve used all of these. But the biggest and best of all was the Science Citation Index. This listed journal articles from a huge number of publications, and every quarter of the year they issued a new set of paperbound volumes that not only had the listings of the articles, but also indexed them by subject, title, and authors. Once you found a useful keyword, you could look for others listed under that keyword. Or once you found a good set of authors, you could look up their other works. Or those of their co-authors. There was also an index of who had cited which articles from the past, so you could follow chains of references forward and backward.

That reminded me of the “Readers’ Guide to Periodical Literature”, which I used when searching for articles in mass market magazines. This was available in the reference section of the public library. And when I was working for an engineering company and needed to find manufacturers of various products, I used the Thomas Register of American Manufacturers: fourteen big volumes. For optical stuff, I used the Laurin Spectra Buyers’ Guide. Really, everything then involved a lot of paper.

slash2k · July 31, 2015, 9:35pm

Actually, I think in many cases they were BETTER indexed than what we find in computer databases, simply because the librarians and indexers of yore knew that their work would be the only way to find it again, whereas in modern databases there is often the expectation that “it’s in the computer, people will find it” so there’s less emphasis on (read, often no attention paid to) controlled vocabulary and the like.

For example, walk into any library in the 1970s and look at the card catalog for materials about the war that began in August 1914: if they used the Library of Congress Subject Headings, all of the books will be listed under “World War, 1914-1918.” Now try Google: should you look under “World War I” or “First World War” or “Great War”? To be thorough, you need to check all three.

American Indians or Native Americans or Amerinds or First Nations? The American Civil War or the War of the Rebellion or the War Between the States (or the Late Unpleasantness)? Pomerania, Pommern, Pomorze? William the Conqueror, William the First, William I, William of Normandy, William the Bastard? Supercollider, particle accelerator, large hadron collider?

Now a good computerized database or index can far exceed the paper card catalog, but ‘throw it in the computer’ isn’t quite the same thing.
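Here is a rough Python sketch of what a controlled vocabulary buys you, under some loud assumptions: the only real heading is “World War, 1914-1918” from the example above, while the “see” table, the book titles, and the function names `authorized_heading` and `subject_search` are hypothetical. The idea is that every variant name is redirected to one authorized heading before anything is looked up.

```python
# Hypothetical "see" references: variant names point to the one authorized heading.
SEE_REFERENCES = {
    "world war i": "World War, 1914-1918",
    "first world war": "World War, 1914-1918",
    "great war": "World War, 1914-1918",
}

# Hypothetical catalog: books are filed only under the authorized heading.
CATALOG = {
    "World War, 1914-1918": ["Hypothetical Title A", "Hypothetical Title B"],
}


def authorized_heading(query):
    """Follow a 'see' reference if one exists; otherwise assume the query is already a heading."""
    key = " ".join(query.lower().split())  # normalize case and stray whitespace
    return SEE_REFERENCES.get(key, query)


def subject_search(query):
    """Look up books under the authorized form of whatever name the searcher used."""
    return CATALOG.get(authorized_heading(query), [])


for name in ["World War I", "First World War", "Great War", "World War, 1914-1918"]:
    print(name, "->", subject_search(name))
```

A plain keyword search skips the `authorized_heading` step, which is why each variant has to be tried separately.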

bob_2 · July 31, 2015, 9:50pm

We all know how ‘surfing’ works - you look something up, see an interesting link, click that and so on - an initial query about card indexes could end up in a totally different field like the Amazon Rain Forest.

Encyclopaedias worked exactly the same way. The good old Britannica had the same system of cross links, so that you would be able to see any reference that might be in a different volume and so on. The same principle but just a hell of a lot slower and more laborious.

Senegoid · July 31, 2015, 9:53pm

There was also a scheme that involved edge-notched cards. (See also the Wiki page.)

My junior high school started using these to sort students into classes during my second or third year there. It was a disaster until they got it really figured out.

There was a row of holes near the edge of the card, and selected holes could be punched out to make a notch in the edge of the card (see photos in the first link above). For example, you could punch out one hole for all the male students and another hole for all the female students.

Then, suppose you wanted to select all the male students apart from the females (to assign to a gym class, health ed class, or shop class). Gather the whole deck of students together and joggle to line them all up. Stick a knitting needle through the “male” hole. Then lift the needle up and shake the deck a little, and all the “male” cards fall out. First link has some photos of this in action.
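The needle trick can be sketched in a few lines of Python. This is only a toy model: the student names, the “band” category, and the names `Card` and `needle_sort` are invented for the example. A card notched at a position has nothing for the needle to hold, so it drops; every other card stays on the needle.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    name: str
    # Hole positions that have been clipped open to the card's edge.
    notched: set = field(default_factory=set)


def needle_sort(deck, position):
    """Run a needle through one hole position and lift the deck.

    Returns (dropped, lifted): cards notched at that position fall out,
    and the rest stay hanging on the needle.
    """
    dropped = [card for card in deck if position in card.notched]
    lifted = [card for card in deck if position not in card.notched]
    return dropped, lifted


# Hypothetical deck of student cards.
deck = [
    Card("Alex", {"male", "band"}),
    Card("Blake", {"female"}),
    Card("Casey", {"male"}),
    Card("Dana", {"female", "band"}),
]

males, _ = needle_sort(deck, "male")
print([card.name for card in males])

# A second pass on the cards that dropped narrows the selection further.
males_in_band, _ = needle_sort(males, "band")
print([card.name for card in males_in_band])
```

Each pass of the needle is one filter, so combining categories just means running the cards that dropped through again.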

ETA: One or both links above say this was invented in 1896, but got really popular in the 1950’s and 1960’s.

TriPolar · July 31, 2015, 10:06pm

Then as now, unindexed information makes a poor database. Librarians couldn’t tell you where to find information that had not been indexed. At least with computers there’s a chance of finding the information in your lifetime. There’s no less emphasis on controlled vocabulary now; there is just a lot more unindexed information because there is so much more information accessible. In no way, shape, or form was data more accessible in the past.

slash2k · July 31, 2015, 10:50pm

Actually, I’ve had this debate as a professional: my own management tells me that there’s no need to worry about indexing and controlled vocabulary because “it’s on the computer!” An institution that once employed indexers to prepare paper guides doesn’t anymore because “it’s on the computer and we don’t need those people.”

This is a hot topic within the library/archival community, and yes, there is far less emphasis on controlled vocabulary today than when I went to library school going on thirty years ago. For example, we have vastly better access to full-text newspapers now, but we have less access to newspaper indexes, because indexing is seen as an unnecessary frill in an era when it’s all on the computer.

For another example, see the main library catalog at Washburn University and several partner institutions here in Topeka ( http://topekalibraries.info ). The main search page is keyword-based: try a search for ‘Amerind’ and you’ll retrieve three books. It doesn’t even suggest that a more useful search would be ‘Indians of North America’ (nearly 5000 books, hundreds of manuscripts, etc.); you’re just supposed to know to do that.

By contrast, if you go to ‘classic mode’ and type in the subject Amerind, you’ll get a note to see Indians, Indians of North America, Indians of South America, etc. However, ‘classic mode’ and ‘advanced search’ aren’t what they emphasize in the basic intro to the library sessions; keyword searching is, and keyword is the opposite of controlled vocabulary.

chappachula · August 1, 2015, 5:39pm

This system is mentioned as being used by Britain’s MI5 in the book “Spycatcher”*. It’s a “kiss-and-tell” exposé written by a former officer of MI5 (Britain’s domestic security service). The author says that the system worked so well for so many years that there was a lot of resistance within the organization to replacing it with a scary, new, and untrusted computer system.

*The book was banned from sale in England back in the 1980’s because it “revealed state secrets” (i.e. it was embarrassing to the politicians of the day).

P-man · August 1, 2015, 5:57pm

Wow, this was a trip down memory lane. Call me old fashioned, but if I were a scientist I’d trade a few thousand hits on the commercial search engines for a couple of lines in Science Citation Index. I’ll have to say, though, that Google and its ilk have made a lot of progress. They can make the life of a researcher easier, but you have to know how to use them.
