Most bizarre Amazon recommendation ever

[This is mundane and pointless and I must share it…on the other hand, it involves books. So here it goes.]

I just received an email recommendation from Amazon.com. Here is the essence, verbatim:

Jon Stewart, as in The Daily Show’s Jon Stewart.

If you see a connection between Jon Stewart and Kierkegaard, let me know.

My guess: lots of college kids like Jon Stewart and buy his books. Lots of college kids have to buy books about Kierkegaard for required reading. Amazon’s program simply notes the correlation and isn’t programmed for causation, so out the e-mail goes.

There is another Jon Stewart (at least I assume it’s a different guy), who wrote Kierkegaard’s Relation to Hegel Reconsidered.

OK, that’s better.

So you know, Amazon’s keyword strings for authors just do simple matching based on the fields for:
Author
Editor
Illustrator
Translator
Foreword Author
and a couple of other values

There are no unique IDs in the system anywhere that will keep two guys named “Jason Miller” from having their books show up as ‘other titles by same author’ even if they write in different genres.
Also, a search by author name based on a different middle initial will frequently miss the alternate form of said author’s name.
Amazon has some neat software with the occasional ingenious feature, but author matching isn’t a strong suit.
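
To make the failure mode concrete, here is a minimal Python sketch of matching on the author string alone. It is not Amazon's actual code; the catalog rows and the function name are made up for illustration, using titles mentioned in this thread.

```python
from collections import defaultdict

# Hypothetical catalog rows: (title, author name exactly as the publisher typed it).
CATALOG = [
    ("America", "Jon Stewart"),
    ("Kierkegaard's Relation to Hegel Reconsidered", "Jon Stewart"),
    ("Some Thriller", "Jason Miller"),
    ("Some Cookbook", "Jason A. Miller"),
]

def other_titles_by_name(catalog, title):
    """Return the other titles whose author string equals that of `title`.

    The author string is the only key, so two different people with the
    same name are merged, and one person listed under two spellings is split.
    """
    by_name = defaultdict(list)
    for t, author in catalog:
        by_name[author].append(t)
    author = next(a for t, a in catalog if t == title)
    return [t for t in by_name[author] if t != title]

print(other_titles_by_name(CATALOG, "America"))
print(other_titles_by_name(CATALOG, "Some Thriller"))
```

The first call lumps the two Jon Stewarts together; the second finds nothing for "Jason Miller" because the other row carries a middle initial, which is the search problem described above.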

Now it all makes sense. Although it would be most fun if they were the same Jon Stewart.

The basic idea behind Amazon’s recommendation algorithms is that, if you’ve bought something, you’re more likely to buy similar items than you are to buy dissimilar items. In the case of things that they’ve had for a while, they can rate similarity by the number of people who’ve bought both items (more or less), but for new items, they have to go by other clues like the name of the author/band/etc. As pointed out, they’re not really very good at distinguishing between two authors with the same name–I can’t fault them for that, as it’s an instance of a genuinely hard class of problems–so you get occasional missteps like the one in the OP.

The other case where they tend to stumble is in making recommendations based on things that only a few people have bought. I can’t find it any more, but at some point I saw a listing for some kind of meat where the “people who bought this also bought” list was mostly enemas and lingerie. I assume there were only one or two buyers at that point.
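
A rough Python sketch of the "people who bought both" idea, and of why tiny samples go wrong. This is only an illustration under my own assumptions: the order data, the function names, and the `min_buyers` cutoff are all hypothetical, not anything Amazon has documented.

```python
from collections import Counter
from itertools import combinations

def co_purchase_counts(orders):
    """Count, for each unordered pair of items, how many buyers bought both."""
    counts = Counter()
    for items in orders:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts

def also_bought(counts, item, min_buyers=1):
    """Items bought together with `item` by at least `min_buyers` buyers,
    most frequently co-purchased first."""
    related = Counter()
    for (a, b), n in counts.items():
        if n < min_buyers:
            continue
        if a == item:
            related[b] = n
        elif b == item:
            related[a] = n
    return [other for other, _ in related.most_common()]

# Hypothetical purchase histories, one list per buyer.
orders = [
    ["novel", "sequel"],
    ["novel", "sequel"],
    ["novel", "sequel", "bookmark"],
    ["ham", "lingerie"],  # a single buyer drives the whole "also bought" list
]
counts = co_purchase_counts(orders)
print(also_bought(counts, "ham"))                # one buyer is enough here
print(also_bought(counts, "ham", min_buyers=2))  # a cutoff suppresses it
```

With no cutoff, one buyer's basket becomes the recommendation list for a rarely bought item; requiring a minimum number of co-buyers trades that noise for having nothing to show on new or obscure items.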

I used to get a lot of recommendations about Douglas Adams’s new releases.

Only, of course, he’s been dead for 8 years. So the new releases were science texts by some other Douglas Adams.

Wouldn’t be *that* hard to fix. All they need to do is normalize their database and create an Authors dimension table, with a unique identifier other than the author’s name.

There’s a precedent for that, as they bind each title to an ISBN/UPC or [in the case of items with neither] an ASIN.
Of course, one of the problems with their data integrity is that they allow every single 3rd-party Amazon Marketplace Seller out there [who wants to pony up $40/month for a Pro Merchant subscription] to add items to their catalog in a bulk upload using a standardized upload format.
They’d have to break their compliance with industry standard formats in order to implement such a change.
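
For what it's worth, the Authors-table idea could look something like the sketch below, using Python's built-in sqlite3 module. The table and column names are my own invention and say nothing about Amazon's real schema; the titles are the ones mentioned in this thread.

```python
import sqlite3

def build_catalog():
    """Create an in-memory catalog where titles point at an author ID, not a name."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (
            author_id    INTEGER PRIMARY KEY,  -- surrogate key, independent of the name
            display_name TEXT NOT NULL         -- names may repeat across rows
        );
        CREATE TABLE titles (
            title_id  INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            author_id INTEGER NOT NULL REFERENCES authors(author_id)
        );
    """)
    # Two distinct people who happen to share a display name.
    conn.executemany("INSERT INTO authors VALUES (?, ?)",
                     [(1, "Jon Stewart"), (2, "Jon Stewart")])
    conn.executemany("INSERT INTO titles (title, author_id) VALUES (?, ?)", [
        ("America", 1),
        ("Kierkegaard's Relation to Hegel Reconsidered", 2),
        ("Kierkegaard and his German Contemporaries", 2),
    ])
    return conn

def other_titles_by_author(conn, title):
    """Other titles sharing the same author_id as `title`, sorted by title."""
    rows = conn.execute("""
        SELECT t2.title
        FROM titles AS t1
        JOIN titles AS t2
          ON t2.author_id = t1.author_id AND t2.title_id != t1.title_id
        WHERE t1.title = ?
        ORDER BY t2.title
    """, (title,)).fetchall()
    return [row[0] for row in rows]

conn = build_catalog()
print(other_titles_by_author(conn, "America"))
print(other_titles_by_author(conn, "Kierkegaard's Relation to Hegel Reconsidered"))
```

The schema is the easy part. The hard part is that someone still has to assign the right author_id to every row arriving in a bulk upload that only carries a name.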

If you have three publishers reporting the authors of their books, and each one has a listing for one of Jon Stewart, Jon A. Stewart, and Jonathan Stewart, how do you decide which of those map to the same ID?
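
To illustrate why this is hard: a simple name rule can at best shortlist records that might be the same person. The Python sketch below uses a heuristic I made up for this post (same surname, one first name a prefix of the other, no conflicting middle initials); it is not a known Amazon method, and it assumes plain "First [Middle] Last" name strings.

```python
def parse_name(name):
    """Split 'Jon A. Stewart' into (first, middle initial or '', last), lowercased."""
    parts = name.replace(".", "").lower().split()
    first, last = parts[0], parts[-1]
    middle = parts[1][0] if len(parts) > 2 else ""
    return first, middle, last

def might_be_same(a, b):
    """True if two name strings are compatible, i.e. could refer to one person.

    This only flags candidates for review; it cannot prove identity.
    """
    first_a, mid_a, last_a = parse_name(a)
    first_b, mid_b, last_b = parse_name(b)
    if last_a != last_b:
        return False
    # 'jon' vs 'jonathan': accept when one first name is a prefix of the other.
    if not (first_a.startswith(first_b) or first_b.startswith(first_a)):
        return False
    # Two different middle initials rule the match out; a missing one does not.
    return not (mid_a and mid_b and mid_a != mid_b)

for other in ["Jon A. Stewart", "Jonathan Stewart", "Jon B. Stewart"]:
    print(other, might_be_same("Jon A. Stewart", other))
```

Note that the rule also says "Jon Stewart" is compatible with "Jon Stewart", which is true of the comedian and the Kierkegaard scholar alike, so the name by itself can never settle the question.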

It might require some manual intervention. Call the publisher, and say “is this the guy who wrote America? or the guy who wrote Kierkegaard and his German Contemporaries?” If neither, create a new entry in the Authors table.

The issue with manual intervention is that you have to pay someone to do it. If that would cost more than the value of what they provide, it’s not worth it. Here it’s not clear that Amazon is really losing much from conflating authors–sure, they send out a few odd recommendations, and customers have to search precisely in some cases, but that’s about it.

In any event, you might want to read this.

I’d assume it’s no worse than the connection between Kierkegaard and his German contemporaries. Kierkegaard was Danish.

No no, the most bizarre recommendation was:
Singin’ in The Rain was recommended because you recently purchased Paint Shop Pro 8. Because all the kids who like to use photo manipulation software are just crazy about musicals from the 50s.

Well, one can be the German contemporary of a Danish guy, no? It’s “contemporary”, not “compatriot”.

Amazon.com has a lot of fun weirdness. One time I stumbled across a 50-pack of condoms for sale, with “Buy it used for less!” shown next to the link to the “Used and New” button.

Heh.
You aren’t technically allowed to do it, but one time I saw someone trying to sell a ‘Gently Used’ baby car seat.
Like that isn’t the nastiest item ever sold on Amazon…