Extraordinary Claims, Extraordinary Evidence, And Why We'll Never Agree

Carl Sagan’s claim that ‘extraordinary claims require extraordinary evidence’ is often repeated in the face of phenomena that seem to defy prior understanding. A good example is the recent (apparent) discovery of faster-than-light neutrinos: even though the experimental finding is robust with respect to usual standards, the claim is widely met with skepticism – and, I want to argue, rightly so.

There’s a rigorous notion behind Sagan’s aphorism, and it’s to do with what’s called Bayesian inference. The gist of it is that you enter any given situation with an expectation of what you’ll encounter, given by an assignment of probabilities to certain possible events. This probability distribution is called the prior probability, or just prior for short.

Depending on what actually occurs, you get new information about the situation you’re in; this information changes your expectations, and thus, leads to you assigning a new probability to certain events that may occur in this situation, the new probability distribution being known as posterior probability or posterior. Basically, you get more familiar with the situation, and your knowledge of what to expect becomes more accurate. This process is known as Bayesian updating.

This notion is particularly useful for hypothesis testing: given the data you have observed, how likely is it that a certain hypothesis is true?

To get a grasp of this, it’s best to see it at work. Let’s say there are two identical cookie jars, A and B. Both are filled with 20 cookies. In jar A, 10 cookies have chocolate chips in them, while the other 10 are plain; in jar B, only two cookies have chocolate chips, 18 being of the plain variety. You reach in, and pull out a chocolate chip cookie. The question is now: what is the probability you should attach to the hypothesis of having jar A?

Let’s start with figuring out the prior. Both jars are indistinguishable, so the probability of selecting either must be equal; since the total probability must be one (you definitely select one of the two jars), it follows that you assign a probability of 50% to having either jar.

The probability of the data arising – i.e. of selecting a chocolate chip cookie – is the sum of two terms: the probability of the data arising if you have jar A times the probability of having jar A, plus the probability of the data arising if you have jar B times the probability of having jar B. That is, 0.5*0.5 (the probability of getting a chocolate chip cookie if you have jar A is 50%, and the probability of having jar A is 50%) + 0.1*0.5 (the probability of getting a chocolate chip cookie if you have jar B is 10%, and the probability of having jar B is 50%) = 0.3.

Now, Bayes’ theorem tells us that the (posterior) probability of the hypothesis being true – i.e. of you having selected jar A – is equal to the probability that the data arises if the hypothesis is true – i.e. that you select a chocolate chip cookie if you have jar A – times the prior probability, divided by the probability of the data arising. In symbolic form: P(having jar A if drawn chocolate cookie) = P(drawing chocolate cookie if having jar A) * P(having jar A)/P(drawing chocolate cookie).

This means that after having drawn a chocolate cookie, you should assign to the hypothesis of having cookie jar A a probability of 0.5*0.5/0.3 ≈ 0.83; you have become much more confident that you indeed have jar A. This is as expected: jar A contains many more chocolate cookies than jar B, so the data of drawing a chocolate cookie supports the hypothesis of having jar A.
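For anybody who wants to check the arithmetic, here is a minimal sketch of the calculation in Python. The function name `posterior` and the dictionary layout are just my choices for illustration; the numbers are the ones from the jars above.

```python
def posterior(prior, likelihood):
    """Bayes' theorem over a finite set of hypotheses.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(data | hypothesis)
    """
    # P(data): sum over hypotheses of P(data | h) * P(h)
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    return {h: likelihood[h] * prior[h] / evidence for h in prior}

# Probability of drawing a chocolate chip cookie from each jar
p_choc = {"A": 10 / 20, "B": 2 / 20}

# Indistinguishable jars: 50/50 prior
uniform_prior = {"A": 0.5, "B": 0.5}

result = posterior(uniform_prior, p_choc)
print(round(result["A"], 2), round(result["B"], 2))  # 0.83 0.17
```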

But the crucial point in this analysis is that your judgement does not depend exclusively on the data you receive, but also on the prior probability you assign to the hypothesis. So let’s examine the same experiment, but with the difference that your mother has given you the cookie jar, stating that it’s jar B.

Say you trust your mother (as you should), but everyone is fallible, so perhaps she has confused the jars; so you assign to the hypothesis of having jar B a probability of 90%, and consequently, to the hypothesis of having jar A a probability of 10%. Again, you draw a chocolate chip cookie out of the jar – i.e. you make exactly the same observation as in the previous case. What’s now the probability you should assign to the hypothesis that you have jar A?

Well, things proceed just the same way: the probability of drawing a chocolate cookie is now 0.5*0.1 (the probability of drawing a chocolate cookie out of jar A, times the probability you now assign to having jar A) + 0.1*0.9 (the probability of drawing a chocolate cookie out of jar B, times the probability of possessing jar B) = 0.14. So, the probability you assign to the hypothesis of having jar A after drawing a chocolate cookie is now: 0.5*0.1/0.14 ≈ 0.36! You do now assign a substantially higher probability to that hypothesis – but since you were very confident in its being wrong beforehand, the data does not suffice to make you change your opinion; you’re still pretty certain that you have jar B in your possession (after all, your mother told you so). For a quick check, let’s look at the probability you assign to possessing jar B: 0.1*0.9/0.14 ≈ 0.64; you’re still more certain that you possess jar B, even though the data would be more in accordance with the hypothesis that you possessed jar A, in fact.
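One way to see why the same cookie moves the two cases so differently is to write the update in terms of odds: posterior odds equal prior odds times the ratio of the two likelihoods (often called the Bayes factor). The sketch below is only an illustration; `posterior_from_odds` is a name I made up, and the inputs are the jar numbers used above.

```python
def posterior_from_odds(prior_a, bayes_factor):
    """Update P(jar A) via odds: posterior odds = prior odds * Bayes factor."""
    prior_odds = prior_a / (1 - prior_a)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1 + post_odds)

# A chocolate chip cookie is five times as likely from jar A as from jar B
bayes_factor = 0.5 / 0.1

no_hint = posterior_from_odds(0.5, bayes_factor)        # jars indistinguishable
mother_says_b = posterior_from_odds(0.1, bayes_factor)  # 10% prior on jar A

print(round(no_hint, 2), round(mother_says_b, 2))  # 0.83 0.36
```

The evidence contributes the same factor of 5 in both cases; only the starting odds (1 to 1 versus 1 to 9) differ.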

There are, I believe, four conclusions to draw from this: the first is that indeed, extraordinary claims require extraordinary evidence – no experiment occurs in isolation; there are always reasons for expecting certain outcomes, and those reasons can be very good. So, in order to overthrow a well-confirmed theory, it does require a substantial weight of evidence; more, in particular, than it takes to re-confirm the theory, or to decide between two theories that are so far equally well supported. There is thus no double standard involved in demanding a higher level of scrutiny when it comes to highly unexpected data (as those who accuse scientists of ‘ignoring’ certain data are sometimes wont to claim); in fact, it’s the rational thing to do.

The second, equally important one is that two people can be confronted with the same data, can act equally rationally, and nevertheless come to different judgments. This is true in particular in debates: arguments you find entirely convincing won’t necessarily convince your opponent, and this does not mean that he is obstinate, refuses to accept your logic, or is just unwilling to give in – he may be acting just as rationally as you are, just coming at it from a different point of view, so to speak.

The third is that you must never be too certain in your judgments. For, from the considerations above, it becomes immediately clear that you can never be convinced of anything you assign a probability of 0 to, even if it is true. This is in a nutshell what lurks behind things like conspiracy theories or other cases where people seem just incorrigible: once they have brought their minds to some conclusion with absolute certainty, no amount of contrary evidence will ever suffice for them to change it. This is known as Cromwell’s rule, in reference to Cromwell’s appeal to the Church of Scotland: ‘I beseech you, in the bowels of Christ, think it possible that you may be mistaken.’
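The effect of a prior of exactly 0 can be checked with the cookie jars. The sketch below assumes, purely for illustration, that each cookie is put back after it is drawn, so every draw has the same probabilities, and that you draw 50 chocolate cookies in a row; the function names are my own.

```python
def update(prior_a, p_data_a, p_data_b):
    """One Bayesian update of P(jar A), given P(data | jar A) and P(data | jar B)."""
    evidence = p_data_a * prior_a + p_data_b * (1 - prior_a)
    return p_data_a * prior_a / evidence

def after_n_chocolate_draws(prior_a, n):
    # Assumption: the cookie is replaced after each draw, so the per-draw
    # probabilities stay at 0.5 (jar A) and 0.1 (jar B).
    p = prior_a
    for _ in range(n):
        p = update(p, 0.5, 0.1)
    return p

dogmatic = after_n_chocolate_draws(0.0, 50)   # absolutely sure it's jar B
sceptic = after_n_chocolate_draws(1e-6, 50)   # merely very sure it's jar B

print(dogmatic)         # 0.0
print(sceptic > 0.999)  # True
```

A one-in-a-million prior is overwhelmed by the evidence; a prior of zero never moves, because it multiplies every likelihood by zero.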

The fourth, and perhaps most surprising, conclusion I would like to draw is that no two persons, if they both act equally rationally, can ever be in total agreement. The reason for this is simple: two persons, if they are, in fact, different persons, will bring different experiences, different amounts of knowledge, different judgments etc. into any given situation. This, however, means that in general, they will assign different priors – which, as we have seen, leads to different judgments even in the face of identical data. This difference is persistent: for, while you of course can iterate the updating process (using a previously generated posterior as a new prior), and this iteration converges in the limit to an assignment of the true probabilities to any given event, in finitely many steps, two people, starting out with two different priors, will not arrive at the same posteriors.
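To make the iteration concrete, here is a rough sketch that lets two people with priors of 1/2 and 1/10 on jar A watch the same sequence of draws. Everything beyond the jar contents is an assumption made for the example: cookies are replaced after each draw, and the sequence of 20 draws simply alternates chocolate and plain. Exact fractions are used so that rounding cannot hide a small remaining gap.

```python
from fractions import Fraction

# Per-draw likelihoods keyed by "was it chocolate?", assuming the cookie
# is put back after every draw.
LIKE_A = {True: Fraction(1, 2), False: Fraction(1, 2)}
LIKE_B = {True: Fraction(1, 10), False: Fraction(9, 10)}

def update(prior_a, chocolate):
    """Posterior P(jar A) after one draw; the old posterior is the new prior."""
    num = LIKE_A[chocolate] * prior_a
    return num / (num + LIKE_B[chocolate] * (1 - prior_a))

def gaps(prior_1, prior_2, draws):
    """Disagreement |p1 - p2| before any draw and after each draw."""
    p1, p2 = prior_1, prior_2
    out = [abs(p1 - p2)]
    for chocolate in draws:
        p1, p2 = update(p1, chocolate), update(p2, chocolate)
        out.append(abs(p1 - p2))
    return out

# Hypothetical run: 20 draws alternating chocolate and plain
draws = [i % 2 == 0 for i in range(20)]
g = gaps(Fraction(1, 2), Fraction(1, 10), draws)

print(float(g[0]), float(g[1]), float(g[-1]))
```

In this run the disagreement starts at 0.4, actually grows to about 0.48 after the first chocolate cookie, and then shrinks to less than 0.001 after all 20 draws, without ever reaching zero.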

This, I think, is a somewhat underappreciated snag in any ‘search for truth’, be it scientific or philosophical. It’s enough to have different persons – for only identical persons will have identical priors, those having been generated by identical experiences – to have persistent disagreements for all finite times. So when next somebody just refuses to see your point, or to be convinced by your arguments, perhaps consider that it doesn’t (necessarily) mean they are stupid, obstinate, or delusional – they’re just not you. (Of course, they still may be stupid, obstinate, and delusional anyway…)

This has gotten a bit longer than anticipated; thanks to anybody who persisted to this point.

What I’d like to debate are questions like the following: How should we best act, given that we may never agree on certain things? Is all debate pointless? What do we do in the face of incorrigible people? As they can’t be convinced, it would seem futile to engage them in debate. But if they receive no opposition, their point of view may proliferate unhindered. Should we let them persist, or oppose their viewpoints so as not to give them undue weight? And how does one best agree to disagree – i.e. acknowledge that what suffices to convince me, may not convince you, without this necessarily meaning that either of us is wrong?

Couldn’t sleep, huh?


Debate them, anyway. It can be entertaining and it’s really the audience you’re trying to win over, not the opponent.

Like many memes, the one about extraordinary claims sounds cute, but doesn’t stand up to close scrutiny. The scientific process works pretty much the same way for all inquiry. And, after all, there is no objective way to determine which claims are ordinary and which carry the tag: “extra”.

This is probably the most cogent explication of the “Search for Truth” I have encountered in quite a while. Thanks!

I usually assume that an OP is bad if it is that long, but I read it anyway instead of clicking back and it was interesting. Addressing the questions:

  1. Debate is pointless if you are hoping that the debaters will come to an agreed upon conclusion. That is not the purpose of debating to me at least. Debating is for the audience to judge the argument for their own opinions. I lurked on the straightdope for years to read opinions. My favorite section of a newspaper is the opinion section. I am simply happy to read people’s opinions and learn where they get their ideas from. They sometimes convince me due to the strength of their argument.

  2. I think incorrigible people should declare that they are not going to be convinced while debating. I try to do that when I take a position that, due to Bayesian influence, is extremely unlikely to be changed. It’s best to be explicit even though it probably is not necessary for some debaters on some topics. You know from experience with people whether they can be changed about a specific topic.

  3. If somebody can’t convince me of their opinion and I am feeling incorrigible because of Bayesian influence then why not presume the other person (or to be specific, person’s opinion) is wrong? Why would I even log in and start typing if I didn’t have a strong enough factual grasp of a matter to be able to state that an opinion is wrong?

I’m not sure I buy that. Claims of particles traveling faster than c seem to be objectively extraordinary–they defy currently understood laws of physics which have been repeatedly confirmed.

Likewise a claim that drinking diet soda may lead to obesity seems to be objectively ordinary. People drink diet soda in known quantities and obesity is a recognized medical issue.

And while I agree with your point about scientific inquiry in general, the OP–as I read it–is about debate. True, scientific inquiry involves debate, but the question: “Is debate pointless?” isn’t specifically about research or the scientific process, as such, and I believe that many claims that are subject to frequent debate can be objectively defined as being extraordinary; if only in common sense terms.

The second and fourth points seem to be just the same.

While it is very likely that we will get disagreement, it is not really OK to say that evidence in science can be treated just like any other item in a debate. Consensus in science is not the same as consensus in politics, for example.

-Feynman, on Cargo Cult Science.

Or like Feynman also said, “It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with the experiment, it’s wrong.”

In other words, where science and pseudoscience are concerned, you are entitled to your opinions, but not to your own facts. Disagreements will be had in the popular arena or press, but if your ideas do not work, denying the fact that they do not work can only lead the experts to ignore you and brand you a denier.

Thing is, if your pseudoscience is at all well defined, then it isn’t falsifiable.

Well, sure, but there is fuzzy astrology and there is the kind that pretends to be scientific. Astrologers are fond of giving predictions that are all over the map, and those predictions are therefore unfalsifiable. But there are cases where astrology goes beyond that traditional fuzziness and some astrologers claim that they can be scientific; it is in those cases that researchers have reported, again and again, that astrology is poppycock.


I am a proponent of ordinary evidence and going with the weight of ordinary evidence. It is a very good start.

It’s true that two people with different priors given the same evidence will still have different posteriors. But as shared evidence accumulates, their posterior probabilities are drawn closer to each other than their priors were (so long as neither starts out at exactly 0 or 1). So the more evidence they share, the closer and closer they’ll come to complete agreement. This is related in spirit to Aumann’s Agreement Theorem.

Perhaps, but they can be close enough for government work. If I assess the probability of the moon landing having been faked as 1 in ten million; and you assess it at 1 in 20 million, it’s reasonable to say that we are in agreement that with near certainty we really did go to the Moon.

A “claim” that is already within the consensus is no claim at all-- it is neither extraordinary nor ordinary. And if we find what we think are particles traveling faster than c, it will be “ordinary” evidence that proves it. Ordinary in the sense that it is no different from any evidence that is meant to prove a claim not part of the consensus today.

Which then fails your stated test:

Which is part of the subject matter of the OP.

Having some rational expectation of what the outcome of an experiment might be is an important consideration in the scientific process.

This is what I do in practice, but I’m always a little uncomfortable with the impression of ‘putting on a show’ this carries – feels too much like an election debate.

I don’t mean ‘extraordinary’ in the sense of ‘special’, just that it takes more to be convinced in certain cases, and for good reason – a strongly supported theory takes more to be refuted than some idle belief, and that’s how it should be.

Thanks! Yes, I do think it’s a revealing way to look at these things, too.

Thanks! And I was never good at this whole ‘brevity is the soul of wit’-thing…

I’ll have to look at your points in some detail when I’m not on a cell, but one issue is that I think incorrigible people often don’t realise they are – we all tend to think (justifiably, I think, in more cases than one realises) that we hold our opinions because they are the rationally right opinions to hold, and surely if others just saw things the way we do, they’d have to agree eventually…

I don’t think they are; one is about realizing that rational disagreement is possible, which I think lots of people think they do, but don’t, really, and the other is about the idea that ultimately, the notion that there is some final, universal truth out there to discover is not the right guide for scientific, philosophical and similar endeavours.

Yes, that’s true, of course, and it’s not what I’m suggesting. But evidence is judged on the basis of prior knowledge in every case, and I think that, and the fact that it’s a feature, not a bug (otherwise, any state of knowledge would be unstable towards statistical fluctuations), ought to be more readily appreciated.

Well, I guess my point was more of an ‘in-principle’ kind of thing, as regards the larger issue of finding out some sort of ‘ultimate truth’ – you can reach FAPP agreement on many issues quite quickly, perhaps, but doing so for all issues is impossible in principle. There’s also the issue of degenerate priors, which we may not know we assign (indeed, such ‘blind spots’ seem to me to be an interesting issue in themselves – what if we start out, for whatever reason, and assign zero likelihood to something? We couldn’t ever discover our fault. That seems of course contrived, and it’s well possible that either truly degenerate priors are impossible to apply, or we’re not limited to Bayesian inferences, but it’s an interesting thing to think about.)

Perhaps more cogently stated: for any given finite amount of data, there exist priors such that those who hold them can rationally disagree on the data’s interpretation.
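For the cookie jars this can be made explicit. As a sketch (the helper names are mine, and I again assume cookies are replaced after each draw): if the data favour jar A by some factor, anyone whose prior for jar A lies below 1/(1 + factor) will still, quite rationally, consider jar B the more likely one.

```python
from fractions import Fraction

def threshold_prior(bayes_factor):
    """Prior on a hypothesis at which the posterior is exactly 1/2.

    Posterior odds = prior odds * Bayes factor, so the posterior reaches
    1/2 when the prior odds equal 1 / bayes_factor.
    """
    return 1 / (1 + Fraction(bayes_factor))

def posterior(prior, bayes_factor):
    """Posterior probability from a prior probability and a Bayes factor."""
    odds = prior / (1 - prior) * bayes_factor
    return odds / (1 + odds)

# n chocolate cookies in a row (with replacement) favour jar A by 5**n
for n in (1, 5, 10):
    k = Fraction(5) ** n
    t = threshold_prior(k)
    print(n, t, posterior(t, k))
```

For a single chocolate cookie the threshold is 1/6, which is why the 10% prior in the opening post still ends up below 50%. More data pushes the threshold down (to 1/9765626 after ten cookies), but never to zero.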

Doesn’t this border on tautology? People can rationally disagree–especially when interpretation is involved–regardless of their priors. Similarly, they could hold identical priors and still disagree on interpretation.

If I’m following your mathematical analysis in the OP, my second comment above wouldn’t seem to have the same objective weight as your equations.

But if we consider a claim that is completely ordinary, and we find data that contradicts that claim, it seems equally rational to consider the data to be flawed as it is to consider the data to be extraordinary, even given priors which agree that the claim is ordinary.

I’m afraid that is Zeno’s paradox with more features. :)

The point here, to me, is that while one can say that in principle there can never be agreement on something, the reality is that many can reach an agreement in real life or get close to a supermajority.

Achilles does catch the turtle in real life anyhow.