How to organize several thousand e-books?

I love e-books. Even though I will never read more than a small fraction of them, something about having this massive wealth of knowledge at my fingertips just appeals to me. I have been collecting e-books since before there was a WWW, when all there was were simple text files. I now have several gigabytes of e-books on subjects ranging from computer programming to lesbian group dynamics.

Problem is that I never really organized them too well. I half-assed did it once, but have since added several thousand new books that are just sitting in a general folder.

So how should I organize them? How should I label the individual files? What folders should I put them in and how should I label them?

It is probably a job that will take me all weekend.

2 contradictory thoughts:

Idea 1) If you can search on contents, why organize them at all?

Humans build elaborate folder trees to “organize” their stuff mostly because they can’t search on content and so they must be able to search on name and/or location. Breaking things into categories matches well with the human associative remembering system. And the tree method of categorization gives a lot of pruning per layer, so for example with 10 folders with 10 subfolders with 10 files you can find one file in 1000 by scanning only three lists of 10 items each, choosing the best at each scan. Hence folder=category trees are ideal for large collections of items that each fit into a single pigeon-hole.
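To put numbers on the pruning argument, here is a rough sketch. The function name `scan_cost` is made up for illustration, and it assumes every level of the tree has the same number of entries, which real folder trees rarely do.

```python
def scan_cost(total_items: int, fanout: int) -> tuple[int, int]:
    """Return (levels, entries_read) for a balanced tree.

    Assumes each folder holds `fanout` entries and that you read one
    whole listing per level before picking the best match.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 1
    while fanout ** levels < total_items:
        levels += 1
    return levels, levels * fanout


# 1000 files, 10 entries per folder: 3 levels, 30 entries read in total,
# versus up to 1000 entries in one flat folder.
levels, entries_read = scan_cost(1000, 10)
```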

But they’re notso-hotso for things which have complex content which straddles borders or inherently fits multiple categories. Like books.

Given content scan (Windows search, Google desktop search, etc.), just let HAL handle it. Certainly giving each book a meaningful title helps the scan too.

For file system performance reasons, I’d limit the size of any single content folder to around 1000 entries, and simply start another content folder alongside the first when it fills up. Put all of the content folders in a single container folder, and search from the container folder downwards.

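A minimal sketch of that bucketing, in Python. The folder naming (`books-001`, `books-002`, ...) and the function names are my own assumptions, not anything standard; the code only plans the moves and leaves the actual moving to you.

```python
from pathlib import Path

MAX_PER_FOLDER = 1000  # the rough cap suggested above; tune to taste


def bucket_for(index: int, container: Path, cap: int = MAX_PER_FOLDER) -> Path:
    """Folder for the index-th book (0-based): books-001, books-002, ..."""
    return container / f"books-{index // cap + 1:03d}"


def plan_moves(files, container, cap=MAX_PER_FOLDER):
    """Map each file to its destination path without touching the disk."""
    container = Path(container)
    return {
        f: bucket_for(i, container, cap) / Path(f).name
        for i, f in enumerate(sorted(files))
    }
```

Once the plan looks right, something like `shutil.move` per entry would do the actual work; search then starts from the container folder.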
Idea 2) Use either the Dewey decimal system or the Library of Congress. Folks who’re expert at cataloging knowledge and organizing it into categories have spent a lot of time refining those systems. Any home-brew system you’d come up with will become a mess if your interests are wide-ranging or evolving over time.

Given that, I’d make one folder per top-level category, say “Dewey 100 - *official category name*” (whatever it is), “Dewey 200 - …”, etc. Then I’d move each book where it belongs based on its 1st Dewey digit.

Any “Dewey x00” folder that had more than about 20 books would get a set of subfolders “Dewey x10”, “Dewey x20”, etc., and I’d move all the books into the right subfolder based on their first 2 Dewey digits. If I only had 20 books in, say, Dewey 300, then I’d leave that alone; no reason to expand that into Dewey 310, 320, etc.

This gives finer-grained organization where needed, but doesn’t burden you with busywork hyper-organizing your sub-collection of 3 books on dog grooming. It also means your collection slowly gets more organized as you go along, no matter how long it takes. I.e., I’d move all the books based on first digit before I began working down into 2nd digit on any of them.

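The split-only-when-crowded rule could be sketched like this. The names `dewey_folder` and `place_all` are hypothetical, the threshold of 20 comes from the description above, and the official category names are left out of the folder names since I don’t have them memorized.

```python
from collections import Counter

SPLIT_THRESHOLD = 20  # "more than about 20 books" triggers subfolders


def dewey_folder(number: str, counts: Counter, threshold: int = SPLIT_THRESHOLD) -> str:
    """Relative folder for a book, given its Dewey number as a string.

    `counts` maps a first digit (e.g. "6") to how many books share it.
    """
    digits = number.split(".")[0].zfill(3)  # "641.5" -> "641", "5.1" -> "005"
    top = f"Dewey {digits[0]}00"
    if counts[digits[0]] > threshold:
        # Crowded class: go one level down, by the first two digits.
        return f"{top}/Dewey {digits[:2]}0"
    return top


def place_all(numbers):
    """Count books per first digit, then choose a folder for each number."""
    counts = Counter(n.split(".")[0].zfill(3)[0] for n in numbers)
    return {n: dewey_folder(n, counts) for n in numbers}
```

Re-running `place_all` as the collection grows pushes only the crowded classes down a level and leaves the small ones alone.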
For books which fit more than one category, at least in your opinion, you can supplement the official cataloging decision by adding shortcuts from other places. E.g., a book about Einstein and his discoveries that the Powers that Be have cataloged as biography but you think of as Physics: put the book under Biography and add a shortcut to it under Physics.
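If you’d rather script the shortcuts than create them by hand, a symbolic link does the same job. This is only a sketch: `add_shortcut` is a made-up helper, and on Windows creating symlinks may need extra privileges, in which case ordinary .lnk shortcuts are the fallback.

```python
from pathlib import Path


def add_shortcut(book: Path, other_category: Path) -> Path:
    """Create a symlink to `book` inside `other_category` and return it."""
    other_category.mkdir(parents=True, exist_ok=True)
    link = other_category / book.name
    if not link.exists():
        # Point at the absolute path so the link survives being moved.
        link.symlink_to(book.resolve())
    return link
```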

Even if you’re not an expert on Dewey cataloging now, you will be when you’re done and you’ll be able to rapidly get to the 650 series when you’re looking for books about whatever the 650 series is about.

This is even easier if the books you have already have Dewey or LOC numbers assigned by the publisher. Then you don’t have to make the tough decisions on where to put them.

This way you can also look the books up in online catalogs from any number of public libraries just for the purpose of seeing the Dewey/LOC number. This’ll make it easy for you to catalog them at first, and also easy for you to retrieve them later, using the public catalog as if it were a catalog of your private stash, just polluted with a lot of extra entries you don’t own.

Why not use the Dewey Decimal System?

Have 10 folders for the 10 main classes, as many subfolders as you need for the minor classes.

The file name could be the title of the book, with the writer’s name in the author attribute.

Or you could divide them into 2 CDs: fiction and non-fiction.

Erm, this is all legal stuff, right? Out-of-copyright books, books distributed freely by the author, or copies bought and paid for.

The best way would be to use some sort of relational database. If your books are all of a consistent format and style, you could possibly write some sort of Amazon trawler that would pick up all the metadata and store it in a DB. Then you could use relational queries such as “Find me all books by an author whose last name starts with S, published after 1971 and classified in the Dewey Decimal system between 290 and 293” and it would return all matches.
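For what it’s worth, that example query is only a few lines with SQLite, which ships with Python. The table layout and the names `open_catalog` and `find_books` are assumptions for illustration; getting the metadata into the table is the hard part and isn’t shown.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a catalog with one hypothetical `books` table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS books (
               title TEXT, author_last TEXT, author_first TEXT,
               year INTEGER, dewey REAL, filename TEXT)"""
    )
    return conn


def find_books(conn, initial, after_year, dewey_lo, dewey_hi):
    """Books whose author's last name starts with `initial`, published
    after `after_year`, with a Dewey number in [dewey_lo, dewey_hi]."""
    return conn.execute(
        """SELECT title, author_last, year, dewey FROM books
           WHERE author_last LIKE ? AND year > ?
             AND dewey BETWEEN ? AND ?
           ORDER BY author_last, year""",
        (initial + "%", after_year, dewey_lo, dewey_hi),
    ).fetchall()
```

With a populated table, `find_books(conn, "S", 1971, 290, 293)` is the query from the paragraph above.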

Ideally, if ebooks take off in a major way, something like the CDDB will exist for ebooks which will do all the grunt work for you.

Heh, this is the kind of thing I do for fun. I have about 180 GB of scanned comic books, and have spent way too much time organizing them. In this case, I’d label the individual files with the author, title, and possibly LOC code (I don’t like Dewey). So you could have a file such as “Dower, John W. - War Without Mercy (D767.9 .D69 1986).lit”. How you structure your directory tree would probably be a personal thing. I’d probably put the above book in “Non-Fiction\History\World\World War II” with possibly another sub-directory depending on how many WW2 e-books I had. For fiction, I’d probably sort by genre and then also make sub-directories for particular series or authors, such as “Fiction\Science Fiction\Alternate History\Turtledove\The Great War” or “Fiction\Fantasy\Terry Pratchett\Discworld”.

Come to think of it, including the year of publication in the filename might be useful if you want to keep files by a certain author in chronological order. So “Pratchett, Terry - 2002 - Night Watch.lit” would be listed after “Pratchett, Terry - 1989 - Sourcery.lit” in the directory.
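A small sketch of that naming scheme, assuming an “Author - Year - Title” layout. `book_filename` is a made-up helper, and it doesn’t strip characters that are illegal in filenames (colons, slashes), which you’d want to handle for real titles.

```python
def book_filename(last, first, title, year=None, ext="lit"):
    """Build 'Last, First - Year - Title.ext'; the year is optional."""
    parts = [f"{last}, {first}"]
    if year is not None:
        parts.append(str(year))
    parts.append(title)
    return " - ".join(parts) + f".{ext}"


names = [
    book_filename("Pratchett", "Terry", "Night Watch", 2002),
    book_filename("Pratchett", "Terry", "Sourcery", 1989),
]
# A plain alphabetical sort now puts the 1989 book ahead of the 2002 one.
names.sort()
```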

Dang, you people who put all of your metadata in the filename are what makes organizing MP3s such a pain in the ass…

Serious suggestion: maybe iTunes or a program like iTunes? The truth is, I stopped caring about all of your mangled filenames once I stopped trying to organize my own stuff and left it to iTunes. I know it’s just been upgraded to work with video; maybe it’ll work with your eBooks.

That list in the link is not as up to date as it might be. You can download a better summary from the official DDC site.

Actually it’s all you people that don’t…

ID3.