Amazon's book recommendation engine is dumb -- any better ones?

Feels like she’s been in the editing phase for years now. :)

OK, let’s consider a hypothetical other book to compare to. Suppose that there’s also a book titled Foo, that only 1% of the general public liked, but that 10% of the people who liked Bellwether also liked. Clearly, then, there’s a significant connection between liking Bellwether and liking Foo, right? And I’m guessing that you think that Foo should be recommended instead of Slice of Pi.

Here’s the thing, though: If I’m a person that liked Bellwether, then there’s a 25% chance that I’ll like Slice of Pi, and a 10% chance that I’ll like Foo. Foo may be more similar to Bellwether, but that doesn’t change the fact that I’m more likely to like Slice of Pi than I am to like Foo, and that it’s therefore a better recommendation.

Put another way, suppose that there were yet another book, Bar, that’s loved by everyone. If I were the one person on the planet who hadn’t read it yet, surely you’d recommend that I pick it up, right?

This is becoming an interesting exercise in probability.

It looks to me as though Chronos wants to read book S (Slice of Pi) if P(S|B) is large, while Gadarene wants to read book S if P(S|B) > P(S).

That is, Chronos wants to read Slice of Pi if there is a large chance that he will like it given that he liked Bellwether. Gadarene wants to read Slice of Pi if the fact that he/she liked Bellwether makes it more likely that he/she will like Slice of Pi than it would have been otherwise.
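To make the two criteria concrete, here is a rough Python sketch that ranks the thread's example books both ways. Foo's and Bar's figures and Slice of Pi's 25% come from the posts above; Slice of Pi's share among the general public is not stated in this part of the thread, so the 0.25 baseline below is purely an assumption for illustration. The function names are made up as well.

```python
# title: (P(like | liked Bellwether), P(like) among the general public)
books = {
    "Slice of Pi": (0.25, 0.25),  # baseline 0.25 is ASSUMED, not from the thread
    "Foo": (0.10, 0.01),          # 10% of Bellwether fans, 1% of everyone
    "Bar": (1.00, 1.00),          # loved by everyone
}


def rank_by_conditional(books):
    """Chronos's criterion: highest P(S|B) first."""
    return sorted(books, key=lambda title: books[title][0], reverse=True)


def rank_by_lift(books):
    """Gadarene's criterion: highest P(S|B) / P(S) first."""
    return sorted(
        books,
        key=lambda title: books[title][0] / books[title][1],
        reverse=True,
    )


def passes_gadarene(books):
    """Titles for which liking Bellwether raises the odds: P(S|B) > P(S)."""
    return [title for title, (cond, base) in books.items() if cond > base]


print(rank_by_conditional(books))
print(rank_by_lift(books))
print(passes_gadarene(books))
```

Under these numbers the first ranking puts Bar on top and Foo last, while the second puts Foo on top, and Bar never passes the P(S|B) > P(S) test because liking Bellwether tells you nothing extra about a book everyone loves.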

The problem with this assumption is that, the majority of the time, people buy books that they have not yet read (they buy them in order to read them).

If you want to know what people who liked this book also like (as opposed to knowing what people who bought this book also bought), I suppose you could look through the customer reviews, find the ones that rated the book highly for reasons similar to why you liked the book, and check that reviewer’s other reviews to see what other books they rated highly.

Perhaps this is obvious, but Amazon also has another way to generate recommendations - by rating books recommended based on your purchases. By rating a lot of books and putting books on my wishlist (both their recommendations and other stuff I run across that I want to read), I’ve got this doing a pretty good job of recommending stuff to me that it at least makes sense to look at.

It does an especially decent job keeping up with new books by authors I like and new books about my professional interests. The professional thing is probably because it’s a fairly small, consistent number of people even paying attention to those books.

It does have the tedious habits of recommending every edition, recommending books I own, and making really whacked out recommendations every time I order a baby shower gift (this can be turned off by telling it not to use an item to make recommendations).

My library had a service for a while (maybe they still do) where you could send them a list of 10 books you liked and they would have a trained human librarian make 10 recommendations. I got a reasonable list from that, too.

It keeps offering me Twilight. What a load of crap.

It’s a good recommendation if it gets people to buy it, and it apparently works well enough for Amazon to rely on the system to turn a profit. Remember, Amazon doesn’t care about how well their recommendations work for any individual customer; they only care about how well their recommendation system works on average. As long as people are making purchases based off their recommendations, they’re happy.

You also have to consider that Amazon has millions of items and millions of customers, which means that they need something fast in order to be able to make recommendations at all. There’s a definite trade-off between how sophisticated your recommendations can be and how much time you need to compute them. It’s possible that they could do something more sophisticated, but they have to be clever about implementing it (see Dynamo for an idea of how clever) and it’s not clear that it’ll have a great impact on their bottom line.

I would try the Google Groups archive of Usenet. In its day Usenet had a lot of book groups, and though it’s not used much anymore, you can find a lot of older books you probably haven’t read yet. And since they’re old, they are probably at your library.

I’m surprised that nobody has chimed in to recommend that you do some work to let Amazon know:

a. What you already own (as noted, this can be a pain, as you have to tell it you own the hardback, the paperback, the tenth anniversary edition, the large print version, etc)

b. What you like (star rankings)

c. What you aren’t interested in

I have spent a number of hours rating the stuff I own (telling Amazon both that I already own it and what star ranking I give it) and checking off ‘not interested’. I have also gone through the genres I generally like and ranked maybe a couple hundred books/movies that I’ve read or seen but don’t own.

That invested time has resulted in generally excellent recommendations. Amazon has quite a lot of ways of helping to improve its recommendations if you look into it. The downside is that it takes time and by the end Amazon will know quite a lot about you, which is a bit scary.

OK, I’m on my recommendations now, and can anybody tell me what a “shopping list” is as opposed to a wish list? A shopping cart will add up items right now for purchase. A wish list is stuff I think I want, but am putting on the back burner for future purchases or for somebody to buy me, maybe, as a gift. A shopping list?

What would be helpful is a third selection box, after “I own it” and “Not interested,” that says “Interested.” It would bookmark and compile those items I’ve checked into a separate list that I can come back to later and maybe narrow down. That may be the purpose of a “shopping list,” but the way it is set up now, I have to click back through several pages to return to the recommendations page. Too much trouble.

LibraryThing has a number of different ways to get suggestions, and even has an Unsuggest feature. The more books you enter from your library, the better your results are, because it compares your library to similar libraries.

I use my wish list that way. If it shows up on recommendations and I think I want it eventually, I move it to the wish list. My wish list is full of no end of goodness now!

The shopping list automatically includes items you buy routinely. I don’t usually shop Amazon for that kind of stuff, so I ignore the shopping list.

Supposedly this sort of thing is very hard to engineer and there’s even been reward money offered to anyone who can do it right, so you might be stuck.

I will point out though that there are other ways to get recommendation ideas from Amazon:

  1. The keywords and category sections. Clicking on the keyword or category that is best associated with the quality you like about the book will often lead you to other books with similar qualities.

  2. Users’ book lists. These are often more helpful since they are specifically created by users, and have useful titles like “25 weird animated DVDs” or “subversive kid’s fiction” etc. If your book is on somebody’s list, you might like some of the other stuff on the list.

  3. Reader reviews. Often reader reviews will compare the book favorably, unfavorably, or neutrally to other books. If you really love a particular book and are desperate to find similar things you can ask for suggestions there or here.

I would argue that creating an algorithm to suggest books based purely on a list of books you’ve bought or rated is inherently troubled. It’s kind of like asking your wife what you want for dinner. Well, you tend to like pizza, she says (this is like suggesting an author you’ve already read). Or you and my friend Frannie both like pizza and she recently found this cool new Spanish restaurant (like suggesting books popular with other buyers of this book).

People’s book tastes are just too eclectic to make suggestions in a broad-based way. You may love hard sci-fi, but are also buying parenting books because you just had kids, and gardening books because you recently got into that, but when you go to the beach you always bring a trashy bodice ripper. There’s no way for a computer to relate these things together into some kind of generalized personal preference psychological profile. It’s much better for you to just go looking specifically for sci-fi books if that’s what you’re in the mood for, which Amazon and other sources are perfectly good at, or for sci-fi books in a similar style to a particular author, which Google and wiki have lots of examples of or which can be sussed out using Amazon keywords or the like. Simply asking for books you might like based on all the books you’ve previously liked 1) is too unspecific a query to have a useful answer and 2) prevents truly useful suggestions that are outside your normal range of reading.

You’re thinking of the Netflix Prize. Technically they’re trying to solve a slightly different problem (estimating a user’s ratings of a movie rather than trying to compute a list of recommendations), but you’re basically right.

(For the more technical minded here, you would think that being able to estimate whether a user will rate movie A higher than movie B would allow you to put together a ranked list of recommendations, but it turns out that computing the list which minimizes the disagreement with those estimates is NP-hard. Different algorithms are needed.)
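For anyone who wants to poke at this, here is a toy Python sketch of the problem. The pairwise estimates are made-up numbers, and every function name is hypothetical. It compares an exhaustive search, which tries every ordering and so blows up factorially, against a simple "sum up the pairwise wins" heuristic that I'm swapping in for illustration. It is not what Netflix, Amazon, or any Prize entrant actually does, and the heuristic is not guaranteed to find the best ordering in general.

```python
from itertools import permutations

# prefers[(a, b)] is the estimated probability that the user rates a above b.
# These values are invented for the example.
prefers = {
    ("A", "B"): 0.7,
    ("B", "C"): 0.6,
    ("A", "C"): 0.8,
}


def p(a, b):
    """Estimated probability that a is rated above b, in either key order."""
    if (a, b) in prefers:
        return prefers[(a, b)]
    return 1.0 - prefers[(b, a)]


def disagreement(order):
    """Total estimated probability that a later item beats an earlier one."""
    return sum(
        p(order[j], order[i])
        for i in range(len(order))
        for j in range(i + 1, len(order))
    )


def best_order_brute_force(items):
    """Exact answer by trying all n! orderings; only feasible for tiny n."""
    return min(permutations(items), key=disagreement)


def order_by_wins(items):
    """Cheap heuristic: sort by summed pairwise win probability."""
    score = {a: sum(p(a, b) for b in items if b != a) for a in items}
    return sorted(items, key=score.get, reverse=True)


items = ["A", "B", "C"]
print(best_order_brute_force(items))
print(order_by_wins(items))
```

On this tiny, consistent example both approaches agree. The trouble starts when the estimates contain cycles (A over B, B over C, C over A) and the catalogue has millions of items rather than three.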

It’s not necessarily impossible, but as I mentioned above, Amazon is constrained by their computational resources. Better algorithms and better implementations are needed.

Of course, you can tell Amazon that “I already own this book” and “I’m not interested in this book” via various links on those pages. While I don’t think it improves things, at least you won’t see books you have/aren’t interested in any longer.

Yes, actually, you will. Often, Amazon will recommend a book that I BOUGHT FROM AMAZON. Sometimes I will buy copies of books that I’ve read for friends, but not very often. Also, as someone noted upthread, Amazon regards every edition as a separate book, rather than listing it in one entry with choices of hardback, paperback, trade edition, and large print in HB and PB. If I don’t want a book in paperback, I’m very unlikely to want it in hardback.

I have to admit, I use Amazon recommendations in a way they probably didn’t intend. I notice which authors and books they keep showing me, and if I see these authors at a used book store, then I will pick up one or two. Sometimes this works out very well (I found out that I LOVE John Scalzi, for instance) and sometimes I don’t like the author, but at least I’m not out as much money. So, Amazon might or might not sell me some books by John Scalzi. At least their recommendation engine worked well that time.

One of the things that annoys me is when I rate a book as “I hate it” and Amazon recommends another book, giving as the reason that I rated the first. You would think a very bad rating would mean I don’t want others like it, but Amazon’s engine apparently just sees that I rated it.

I’ve found Literature Map to be quite helpful. You enter the name of an author you like and it returns a spatial abstract with the author you named in the center, and other authors arranged around the center with similarity based on proximity…it’s too hard to explain, just check it out!

That Literature Map is a fun toy. Bookmarked. Thanks!

Pretty interesting. I wonder how they choose proximity. I checked Jim Butcher and I have to say I would not have put Tanya Huff right up against him, and then have Simon R. Green and Glen Cook so far away. But it still looks like a great source. Thanks.