# Finding the Scorpion & Bayesian inference — what's the real story?

Howdy! I’m reading The Wisdom of Crowds, and the story of John Craven finding the lost submarine, the Scorpion, sounded like it involved a technique that could be useful for community/land-use planning. (The author relayed the story as given in Blind Man’s Bluff.) Basically, the story goes that Craven had a group of experts, “including mathematicians, submarine specialists, and salvage men,” and they “bet on why the submarine ran into trouble, on its speed as it headed to the ocean bottom, on the steepness of its descent, and so forth.” Then he “took all the guesses, and used a formula called Bayes’s theorem to estimate the *Scorpion*’s final location.” It turned out that the sub was “220 yards from where Craven’s group had said it would be.” Wow!

Unfortunately, it seems that Wikipedia tells quite a different story. According to Wikipedia, Craven consulted experienced submarine captains to create a probability grid and then systematically searched the grid, using Bayes’s theorem to update the probabilities of all the squares in the grid as each square searched turned out to be unsuccessful. Still a pretty amazing feat, but one of considerably different character.
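For anyone who wants to see what that grid update looks like in practice, here is a minimal sketch in Python. It is not a reconstruction of Craven's actual calculation: the four-cell grid, the probabilities, and the detection probability `p_detect` are all invented for illustration, and the function name is hypothetical.

```python
def update_after_failed_search(prior, searched, p_detect):
    """Return posterior cell probabilities after one cell is searched and nothing is found.

    prior    -- dict mapping cell name -> prior probability the wreck is in that cell
    searched -- name of the cell that was just searched without success
    p_detect -- ASSUMED chance of spotting the wreck if it really is in the searched cell
    """
    # Overall chance of coming up empty: everything except "it was there and we saw it".
    p_miss = 1.0 - prior[searched] * p_detect
    posterior = {}
    for cell, p in prior.items():
        if cell == searched:
            # The wreck could still be here, but only if the search overlooked it.
            posterior[cell] = p * (1.0 - p_detect) / p_miss
        else:
            # Every other cell becomes relatively more likely.
            posterior[cell] = p / p_miss
    return posterior


# Hypothetical grid: made-up numbers, not data from the Scorpion search.
grid = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
after = update_after_failed_search(grid, "A", p_detect=0.8)
next_cell = max(after, key=after.get)  # the cell now most worth searching
```

In this toy case, searching the most likely square "A" and finding nothing drops it below "B", so "B" would be searched next. The probabilities still sum to 1 after each update.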

Can, and if so, will, anyone give me the Straight Dope on the search for the Scorpion? The first method hints at a way to come up with a guess that may be spot on; the second gives a way to search efficiently. The first could be useful to me; the second, not so useful.

I think they are describing the same methodology. I wasn’t sure how Bayesian reasoning would fit into the first scenario until I read the Wikipedia link you provided. With no description of how Bayes was applied, it didn’t make any sense. Following a couple of links, I came across a description of Bayesian search theory. It sounds like Craven’s description only really covers steps 1-4, which don’t really use Bayes’ at all; Bayes comes in when you’re updating the probabilities with new evidence (at least, in this case). While I know nothing about the Scorpion event itself, my best guess is that it’s simply an incomplete account on Craven’s part.

At any rate, to touch on what you seem to be looking for, a year or so ago DARPA was going to put up a website that allowed “experts” in various fields to bet on certain events (e.g., a terrorist attack, a civil war, etc.) and win money if they were right. The idea is that any one person’s guess (however knowledgeable they are) has lots of room for error. It’s a way of harnessing economic theory (free markets “know” more than any single investor); as the sample size of knowledgeable guesses grows, the error drops. There was quite an uproar about it and they shut it down fairly quickly. IIRC, there was at least one thread here on the topic (can’t seem to find it, but here’s a link to a BusinessWeek article talking about it).

I also remember seeing a segment on TV where someone had a jar of jelly beans and randomly asked people to guess how many were in the jar. Guesses varied from a couple hundred to, IIRC, 20,000 (!). At the end, the guesses were averaged and jelly beans counted, and the average guess was within 100 of the true number. Nifty.

I’m curious as to what you see as being the application in “community/land-use planning”. Care to give some details?

Right, like the Iowa Electronic Markets (IEM), they would set up something akin to a stock market or parimutuel betting system on events; however, the DARPA events were much more prosaic than the press suggests. IIRC, it would be more like Egyptian GDP than terrorist events. I’ve emailed an economist who is heavy into that stuff and, while he has ideas on how to apply it to community/land-use planning, he isn’t giving me any info because the political & legal barriers make it not worth getting into. C’est la guerre, I guess.

Well, the author of The Wisdom of Crowds made it appear as though Craven had asked these varied experts to either bet on specific elements that would determine where the submarine ended up, or bet on its final location (the book really isn’t too clear on that), and then crunched the bets through Bayes’s rule to get the group’s subjective probability & best guess. He makes this sound like a good way to get the most accurate answer with a fair degree of reliability.

(I assume that it would be a process similar to flipping a coin and plugging the results into Bayes’s rule to judge whether it is a fair coin.)

If that’s an accurate description, then I think it would be feasible to poll some planners, real estate economists (& similar), and similar folk, asking them some basic questions about what would be best for the long-term health of the community. Perhaps they could be things like the percentage of land that should be zoned agricultural, best lot sizes for districts, best amount of commercial zoning, building density, and so on.

Because The Wisdom of Crowds makes it sound as though sampling randomly is not important for this exercise, I could just send out a bunch of letters and do the math on what comes back. This then would be one piece of evidence that our planning commission could use in revising the master plan.

Because the book makes it sound as though one would be using Bayes’s rule to come up with a best guess from diverse experts, it sounds like it could be utilized to get a good guess that can be reported. However, the Wikipedia article gives a completely different picture. The Wikipedia article doesn’t say that polling the experts is going to give one a spot-on guess right off the bat; instead it gives one a way to conduct a search optimally after collecting all the various guesses. I don’t see any analog between the search process and what I am interested in.

Again, I think the author (sorry, I thought Craven was the author; I didn’t get that he was the person who conducted the search for the Scorpion) either doesn’t get Bayes’ or does a poor job of explaining it.

Not being an expert in probabilistic reasoning, I could be wrong…I’d really appreciate it if someone could inform me as to how Bayes’ would be applied if not through Bayesian search (or establishing a set of prior probabilities for combining).

I guess I don’t understand your question. Presumably one would start w/ a prior guess for some variable and then use the experts’ guesses as data that would be used to adjust one’s prior guess. Hence one is using Bayes’s rule to aggregate expert guesses; as reported in The Wisdom of Crowds, which in turn is relying on Blind Man’s Bluff, this method identified a spot that was nearly on top of the lost submarine. That appears to be completely different from creating a grid of squares, assigning probabilities to each square, and then updating each square’s probability (using Bayes’s rule) as a systematic search finds naught in a growing number of search areas. The first description is a way of aggregating expert guesses to get an accurate estimate; the second description is a method of searching and not a method of guessing. A method of guessing would be useful to me; a method of searching is not, as far as I can see, practical.

I don’t have any references handy so I can’t meaningfully discuss the math. And I haven’t had a chance to get to the library to check out Blind Man’s Bluff.

Well, my question is: just where does Bayes’ play a role in adjusting the guesses? Bayes’ is a method of combining prior probabilities. To wit:

P(X|Y) = P(Y|X) * P(X) / P(Y)

That is, the probability of X given Y is equal to the probability of Y given X times the probability of X, all divided by the probability of Y. This can be strung along for as many pieces of evidence as you wish to consider: if the pieces of evidence Y, …, Z are conditionally independent given X, then P(X|Y,…,Z) is proportional to P(X) * P(Y|X) * … * P(Z|X).
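To make the "stringing along" concrete, here is a small sketch that applies the rule once per piece of evidence, feeding each posterior back in as the next prior. It assumes the pieces of evidence are conditionally independent given X, and every number in it is invented purely for illustration.

```python
def bayes_update(prior, p_e_given_x, p_e_given_not_x):
    """Return P(X|E) given P(X), P(E|X) and P(E|not X)."""
    # Total probability of seeing the evidence at all (the P(Y) in the formula).
    p_e = p_e_given_x * prior + p_e_given_not_x * (1.0 - prior)
    return p_e_given_x * prior / p_e


# Hypothetical starting belief and two hypothetical pieces of evidence,
# each given as (P(E|X), P(E|not X)).
p = 0.5
evidence = [(0.8, 0.4), (0.6, 0.3)]
for lik_x, lik_not_x in evidence:
    p = bayes_update(p, lik_x, lik_not_x)  # yesterday's posterior is today's prior
```

Starting from 0.5, the first piece of evidence moves the belief to 2/3 and the second to 0.8. Note that each step needs a likelihood for the evidence, which is exactly the ingredient that is hard to come by when the "evidence" is an expert's guess.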

But, as far as I can tell, therein lies the problem with a simple application of Bayes’: either you’re just effectively averaging guesses (where X is the policy and Y,…,Z are the experts’ guesses) or you have to come up with particular criteria (where X is the policy and Y is one criterion that is assigned a prior, Z is another criterion, etc., from which you can establish the prior probabilities of each criterion based on the experts’ guesses and then combine them). Bayes’ gives you nothing in the first case and in the second you need to do all the work in establishing factors that influence the policy.

Again, as far as I can tell, there isn’t any application of Bayes’ here, as there’s no outline of priors for the evidence in the one case and no updating of evidence (which is what the Bayesian search does) in the other. Also again, I’m no expert; if I’m wrong, I hope someone will educate me.

Unfortunately I have to work on assumptions to try to answer, and they’re assumptions that a possibly incorrect description is accurate.

My guess is that it would be like using the rule to determine whether a coin is fair. I assume a prior that the coin is fair. I use, say, P[H]=0.5 as my prior probability. Then I start flipping the coin and, doing some math that I don’t recall how to do off the top of my head, I use those results to adjust my prior belief. This can be done, I just don’t recall how. It is discussed here; unfortunately, my copy is not handy.
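One standard way to do that math (not necessarily the one in the reference mentioned above) is to put a Beta distribution on the coin's heads probability. A rough sketch follows; the Beta(10, 10) prior and the flip counts are arbitrary choices made up for the example.

```python
def beta_update(alpha, beta, heads, tails):
    """Update a Beta(alpha, beta) belief about P(heads) with observed flips."""
    # Conjugacy: observed heads add to alpha, observed tails add to beta.
    return alpha + heads, beta + tails


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, used here as the point estimate."""
    return alpha / (alpha + beta)


# ASSUMED prior: Beta(10, 10), centred on 0.5 but not certain of it.
prior = (10.0, 10.0)
# Hypothetical data: 8 heads and 2 tails.
posterior = beta_update(*prior, heads=8, tails=2)
estimate = beta_mean(*posterior)  # (10 + 8) / (10 + 10 + 8 + 2) = 0.6
```

The prior here is a spread of belief around 0.5 rather than the flat statement P[H]=0.5, so the flips can pull the estimate away from it. Larger prior counts mean it takes more flips to move.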

In the case of, say, best proportion of land to keep agricultural, I could start with the proportion currently zoned agricultural and update that based on the guesses; in the analogy the expert guesses are the equivalent of the results of the coin flips. The distribution would be different from that in the coin example, but I think it is conceptually the same idea. I assume that Craven might have had some guess that was then updated with the guesses from the experts.
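If one did want to try this, a simple version treats each expert's guess as a noisy measurement of the "true" best proportion and uses a normal prior centred on the current zoning. That is a strong assumption (it pretends the guesses are independent and unbiased), and the figures below are invented, so treat this as a sketch of the idea only.

```python
def combine_guesses(prior_mean, prior_var, guesses, guess_var):
    """Normal-normal update: return (posterior mean, posterior variance).

    prior_mean, prior_var -- starting belief, e.g. the current agricultural share
    guesses               -- list of expert guesses for the same quantity
    guess_var             -- ASSUMED variance of each expert's error
    """
    # Precisions (1/variance) add; each guess contributes 1/guess_var.
    precision = 1.0 / prior_var + len(guesses) / guess_var
    # The posterior mean is a precision-weighted average of the prior and the guesses.
    mean = (prior_mean / prior_var + sum(guesses) / guess_var) / precision
    return mean, 1.0 / precision


# Hypothetical numbers: 40% currently zoned agricultural, four expert guesses.
post_mean, post_var = combine_guesses(
    prior_mean=0.40, prior_var=0.01,
    guesses=[0.30, 0.35, 0.25, 0.30], guess_var=0.01,
)
```

With these made-up inputs the result is 0.32, between the current 0.40 and the experts' average of 0.30. As the prior is made vaguer (larger `prior_var`), the answer approaches the plain average of the guesses.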

That’s what fits the description I have in the book.

It really sounds to me like the author just got the details wrong. If you can find similar plans and see what ended up happening when they were implemented, you might be able to use Bayesian analysis to inform your decision. To be completely honest, I suspect you’d be looking at GIGO here, but I’m not an expert in Bayesian methods, so I could be wrong.

Well, I’m not familiar w/ GIGO; however the point of the story as it relates to The Wisdom of Crowds is that groups are supposedly good at coming up with factual answers. The book makes it appear as if the guess was spot on; Wikipedia makes it appear as if the guess wasn’t so hot & it was the search method that was so important. That’s the issue for me: if taking a non-random poll of experts (among others) can give reliably accurate answers to questions of fact, then it is a tool to be utilized. According to the author of the book I’m reading, groups are quite good at coming up with factual answers. (Obviously, in my circumstance the questions will have to be written carefully.)

My question is not about Bayesian analysis (at least not primarily), it’s about which telling of Craven’s story is accurate. (Because the experts would presumably be offering guesses that are not from predefined choices, it makes sense that there isn’t just a simple vote. Whether using Bayes’s rule is better than just averaging, for example, would be a question I’d have to look into only after confirmation of which version of the story was being told.)

GIGO = Garbage In, Garbage Out. IOW, if you take a bunch of made-up numbers and apply Bayes to them, you end up with a made-up answer.

Like I said, my somewhat-educated guess is that the Wikipedia story is more accurate.

Yeah, that’s right. But in that case, you’re updating the priors with actual coin flips. In other words, you receive more evidence as to whether your initial guess that the coin was fair is accurate. A decent synopsis that uses coin flipping as an example and relates to Bayes’ can be found here (pdf file).

I’m not clear on where the evidence updates of proper zoning might come from; I think that guesses need to be compared to actual outcomes. In the case of the Scorpion, the prior probability (the location of the ship) is initially determined by averaging the experts’ guesses, while the evidence is updated by actually searching the locations. (The neat part there, IMO, is the incorporation of a search cost.) As neat as Bayes’ is, I think there’s not enough information to use it in the way you’re suggesting.

You’re a little too focused on the Bayes part of the question at the expense of what is more important to me. Of course I don’t have any empirical feedback; this is a process that takes decades. Yet, we still need to make guesses today. I explained my assumptions & guesses about how the experts’ guesses would fit into the analogy, should the analogy turn out to be valid. But those were just wild guesses, as I explained. As discussed, the rule would simply be a method of coming up with the group’s collective best guess, nothing more. There may be other, better rules; there may be other, worse rules; but the book I read made it appear that Bayes’s rule was used to create the collective guess rather than to guide the search. That’s why I asked the question.

I’m sorry for pursuing the issue regarding Bayes’ particularly; just trying to help. (And maybe improve my own understanding.)

Until someone with better information than I comes along, I’ll stick with “a poor job of explaining the situation on the author’s part”.

I am resurrecting this thread as I have seen a few docos on the Scorpion and the Thresher recently.

From what I have read, it seems very poor maintenance may have had (sadly) a great deal to do with the losses.

From Wiki (currently) it would seem that Craven’s work has been largely discredited.

Anyway, for any interested there is a relatively new book on the market called ‘Why the USS Scorpion (Ssn 589) Was Lost: The Death of a Submarine in the North Atlantic’

I’ve ordered a copy. If anyone has read it, I’d like to hear their take on it.

This won’t work. If your first prior is “the coin is fair”, then you’ll never get away from that prior. What you should start with is something more like “What proportion of coins are fair?”. Which can of course depend on the context: If I’m looking at all coins in random cash registers scattered across the country, I’m going to start with a very high prior probability of fairness, but if I’m looking at all coins at a magicians’ conference (where trick coins probably abound), my prior is going to be much lower.
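A quick sketch of that point, using just two hypotheses: the coin is either fair or a trick coin. The trick coin's 75% heads rate and the prior values are assumptions picked for the example, not real figures about cash registers or magicians.

```python
def posterior_fair(prior_fair, heads, tails, p_heads_biased=0.75):
    """Return P(coin is fair | flips) when the only alternative is one trick coin.

    prior_fair     -- prior probability that a coin from this source is fair
    p_heads_biased -- ASSUMED heads probability of the hypothetical trick coin
    """
    like_fair = 0.5 ** (heads + tails)
    like_biased = p_heads_biased ** heads * (1.0 - p_heads_biased) ** tails
    numerator = prior_fair * like_fair
    return numerator / (numerator + (1.0 - prior_fair) * like_biased)


# Ten heads in a row, judged under three different priors.
certain = posterior_fair(1.0, heads=10, tails=0)         # prior "the coin IS fair"
cash_register = posterior_fair(0.99, heads=10, tails=0)  # nearly all coins fair
magicians = posterior_fair(0.5, heads=10, tails=0)       # trick coins are common
```

With a prior of exactly 1.0 the answer stays at 1.0 no matter what the flips show, which is the "never get away from that prior" problem. With a prior of 0.99, ten straight heads brings the belief in fairness down to about 0.63, and with the magicians' conference prior of 0.5 it drops to roughly 0.02.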