Why do you consider that to be unreasonable? If I had you find a 2cm coin buried in a 10m x 10m field and gave you a metal detector, you'd have much less than a 2% chance of finding it by pure chance, but I wager anyone modestly skilled with a metal detector could find it without much trouble at all.
Why do you consider that to be unreasonable? The expected success rate is actually lower than what dowsers claim. This isn't a test of being able to find things at random. It is a test of people's claims. They were given quite a hefty margin of error, IMO.
For the sample sizes we are dealing with, it certainly is. If you had thousands of test subjects (and the money to test them) and came up with twice the chance rate, you might have something, but one test showing a spike is nothing more than noise.
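To put a rough number on that (the trial counts and the 5% chance rate here are my own made-up figures for illustration, not anything from the actual challenge):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the odds of doing at least that well by luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# One subject, 20 guesses at a 5% chance rate, scoring double the expected hits:
print(binom_tail(20, 2, 0.05))   # ~0.26, i.e. it happens about a quarter of the time by luck

# Thousands of subjects pooled (say 20,000 guesses) still hitting at double the chance rate:
n, p = 20_000, 0.05
mean, sd = n * p, (n * p * (1 - p)) ** 0.5
print((2 * mean - mean) / sd)    # ~32 standard deviations above chance, which is not noise
```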
Statements like this just put my jaw on the floor. If the ability actually works, then the lower the odds of success by pure chance, the more impressive the feat once performed. You'd think that dowsers would be begging for harder tests to prove themselves against, not whining that they have only a 2% chance. In fact, if the ability works erratically (say 50% of the time), the lower the odds of chance, the easier it would be to prove.
For instance, if you devised a test with a 50% chance of success at random and you got it right 75% of the time, nobody would be impressed. But if the chance rate were 2% and you hit 50%, people would sit up and take notice.
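For the curious, a quick back-of-the-envelope check of that comparison, assuming 20 trials in each hypothetical test (the trial count is my own illustrative choice):

```python
from math import comb

def at_least(n, k, p):
    # P(X >= k) for X ~ Binomial(n, p)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(at_least(20, 15, 0.50))   # 75% hits on a 50%-chance test: p ~ 0.02, mildly interesting
print(at_least(20, 10, 0.02))   # 50% hits on a 2%-chance test: p on the order of 10^-12
```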
That just means that metal detectors are accurate to that level. It doesn’t mean that anything which is not accurate to that level is bogus.
[In addition, it’s worth staying clear on the numbers. Any one peg had a 2% random chance of being on target. Having 2/3 of 9 pegs (the requirement was 10 to 100) be on target is not remotely close to 2% - it’s virtually zero. Though again, it’s hard to model, since the locations are not independent.]
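For what it's worth, a rough check of that "virtually zero", treating each peg as an independent 2% shot (which, as noted, it isn't quite):

```python
from math import comb

p_hit = 0.02
# Chance of getting at least 6 of 9 pegs on target by luck alone
print(sum(comb(9, k) * p_hit**k * (1 - p_hit)**(9 - k) for k in range(6, 10)))   # ~5e-9
```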
It’s a valid test of these people’s claims. It’s not a valid test of dowsing.
A lot of scientific issues are tested at levels of success that are far lower than what Randi tests for. When potential drugs or other medical treatments are being tested, no one insists that they be proved effective at these rates. As long as the efficacy is shown to be better than chance at a statistically significant level, that is considered to be a successful test (subject to replication, of course).
If Randi just wanted to settle whether or not these phenomena have anything to them, he could easily design a test that would test for lower levels of correlation with a high degree of accuracy. But that’s not what he’s after, of course.
Not only do they agree (and this is directed primarily at Fotheringay-Phipps), but they are always asked to demonstrate their claimed abilities before the test and before being blinded. They are always successful, and they always agree that the test is fair. Then they repeat the test with only one difference – they do not know where the object is that they just found so easily a minute ago. Success becomes failure. Doesn't that suggest to you what is really happening?
It is a cardinal principle of science that if you have N variables in a test, and you alter only one and the outcome is affected significantly, that one item is most likely the determining factor for the result. In our examples here, the one factor is the foreknowledge of where the object is (or the fact that water is found most everywhere). Remove this, and the outcome reverts to pure chance.
For the record, I hope everyone is aware that James Randi does not participate in any of the modern tests for the MDC (I think he might have for earlier ones) nor is he present at the test site. He even asks that the organizers not inform him of when the test will take place to avoid any suspicion of bias (“my powers don’t work if a skeptic is present”).
And furthermore, James Randi does not claim that the tests prove anything (although you can draw your own conclusions). It's a Challenge. I challenge you to do this, and if you do, I will give you a million dollars. It's a dare. It's a double-dare. Put up or shut up. One success, should it ever happen, is not sufficient to rewrite any of the known laws of science (although it would be if replicated sufficiently).
Here's an interesting note: Randi actually verified the ability of one dude whom Randi doubted.
wiki: In 1982, Randi verified the abilities of Arthur Lintgen, a Philadelphia physician who is able to determine the classical music recorded on a vinyl LP solely by examining the grooves on the record. However, Lintgen does not claim to have any paranormal ability, merely knowledge of the way that the grooves form patterns on particular recordings.
Fotheringay-Phipps does raise some reasonable points. There is certainly the possibility that a prediction could succeed less than 50% of the time and still be interesting if that rate were better than chance. He is also right that lots of such examples exist in nature.
That said, there are ways, presumably known to Randi, to design the test so that it can detect small effects. However, in order to design such an experiment, it needs to be known beforehand how large an effect you are trying to measure. This is where the negotiation with the petitioner comes in. If the petitioner says he has 85% accuracy (vs. 5% by chance), then you and Randi will design the test so that, if there really is 85% accuracy, the petitioner will pass it, say, 90% of the time.
On the other side of the equation is the probability that a petitioner will pass the test even if he is no better than random. For most applications this is set to something like 5%. But Randi presumably uses a much lower value, say 10^-8. This may seem unfair (when compared to the 10% failure rate allowed for the genuine petitioner), but you have to remember that Randi sees thousands of petitioners, and having even one of them succeed by chance would cost him $1,000,000 and have the potential to discredit the skeptic movement.
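To make that concrete, here is a sketch of the sort of design I have in mind, using the made-up numbers above (5% chance rate, 85% claimed accuracy, 90% power, a 10^-8 false-positive ceiling); again, I have no idea what figures Randi actually uses:

```python
from math import comb

def tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

chance, claimed = 0.05, 0.85      # chance rate vs. the petitioner's claimed accuracy
alpha, power_wanted = 1e-8, 0.90  # false-positive ceiling and desired pass rate if the claim is true

for n in range(1, 200):
    # Smallest pass threshold that keeps a pure guesser under the false-positive ceiling
    k = next(k for k in range(n + 2) if tail(n, k, chance) <= alpha)
    if tail(n, k, claimed) >= power_wanted:
        print(n, k, tail(n, k, chance), tail(n, k, claimed))
        break

# With these illustrative numbers the search settles on about a dozen trials with a
# pass threshold of 8: a guesser slips through with odds of roughly 6e-9, while a
# genuine 85% dowser passes roughly 93% of the time.
```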
With this design, if a person comes in with a real psychic predictive ability of, say, 30% (still above the 5% chance rate), he will probably fail the test, even though he has real powers. This is not Randi's fault. Randi powered the test to detect 85% accuracy, not 30% accuracy. If the petitioner only had this lower level of accuracy, then he should have told Randi beforehand so that Randi could design a suitable test for it. Coming in after the fact and saying that my observed powers were accurate 30% of the time, which was significantly better than chance (p=0.016), is too little, too late.
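Continuing the same made-up example: here is how that hypothetical 30%-accurate dowser fares on the test sized for an 85% claim (the 11-trial, 8-hit design found by the sketch above):

```python
from math import comb

def tail(n, k, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Pass probability for a dowser who is genuinely right 30% of the time
print(tail(11, 8, 0.30))   # ~0.004: real-but-modest powers still fail roughly 99.6% of the time
```

A test powered to detect a 30% ability against 5% chance is perfectly possible; it just needs to be agreed on, and sized with far more trials, before the fact.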
**Disclaimer: All the numbers in the above are made up for illustrative purposes, and I actually have no idea what numbers Randi uses. I am relying more on what makes sense from my experience in statistical trial design, the fact that Randi uses statisticians, the complaints that are filed against Randi, and the possibly unwarranted assumption that Randi is likely acting in good faith.**
If the outcome reverts to pure chance, I agree with you. But that's not what's being measured. If Randi has kept some stats on the outcomes and can show that the numbers are pure chance, then I am not aware of it, and as above, the nature of his tests does not suggest that he's measuring this. Perhaps someone else has some additional knowledge here.
You test the abilities of people who claim to have these abilities.
If I want to know whether there's anything to dowsing, I want to see a test that measures whether these people can get it right at a rate beyond pure chance, to a statistically significant degree. Like the test of any drug or medical procedure or any other scientific phenomenon.
The fact that Randi can - by offering a big prize for a high-accuracy test - produce a self-selected group of overconfident people whose exaggerated claims he can debunk doesn't add a lot in this regard.
Actually, the challenge was well established at that point, but it was just for $10,000. Lintgen was never tested under the auspices of the paranormal challenge as he claimed no paranormal ability, just a remarkable skill. Randi was simply one of the best people to see if the skill was legitimate or was done by some form of trickery. It turned out to be the former.
The first question is whether in fact Randi would allow for a test that would test the guy’s 30% ability. My impression is that his tests tend to be oriented towards the upper end. And while it would be possible to design a test that would test for lower levels, it would be hard to get the kind of astronomical odds that Randi required in his dowsing example, which would tend to suggest that he wouldn’t allow such tests.
More importantly, my point here - as I've been saying since my first post - is not about the testee and whether it's fair that he failed the test. It's absolutely fair, and he does not deserve to get the money or to whine about not getting it. The point is about the rest of us, who are not going to get the money anyway, and whose only interest in it is what the test tells us about the phenomenon being tested.
And here's where I think Randi is being less than honest, in holding out to the world at large that he has made repeated tests that have failed to show paranormal phenomena, when in reality he has made repeated tests that have debunked the specific outlandish claims of certain self-selected practitioners (self-selected largely by their willingness to submit to tests that tested for outlandish claims).
It’s evidence outside of the test. It’s like saying “tests of two different phenomena were equally inconclusive but in one case we have outside reasons to believe it’s valid while in the other case we have outside reasons to believe it’s not”.
Personally, I think it’s more likely than not that dowsing is completely bogus. But that doesn’t mean that Randi’s challenge plays a part in this.
What is the meaningful difference between "claims of the existence of paranormal phenomena" and "specific outlandish claims of self-selected practitioners"?
I actually thought about that when I wrote it but realized that I couldn’t think of the next number after billions.
Truth is that Randi is probably not the best guy to do these tests, if only because most people are not likely to want to be tested by a guy who makes a name for himself by publicly proclaiming that they are frauds. But that's not the issue here.
My point was not about the fact that the participants are the only ones willing to be tested. My point was that these are the only people willing to be tested at that level. IOW, if Randi designed the tests such that you could win the million by merely being above chance a statistically significant percentage of the time, he might get people with lesser but nonetheless significant claims who would be willing to be tested. But if he is only willing to give the prize to people who can get it right to an extremely high degree, then he will only get contestants with outlandish claims, whom he can have fun debunking but who are less meaningful for the issue itself.
An example of the first would be "dowsing works better than chance a statistically significant percentage of the time". An example of the second is "dowsing works X percent of the time". The first can be true even if the second is false (for many values of X).
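To put made-up numbers on that distinction: a dowser who is genuinely right 10% of the time when chance alone gives 5% would fail any test built around an 80-90% claim, yet over enough trials the excess is far too large to be luck:

```python
from math import comb

def tail(n, k, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 30 hits in 300 trials is only a 10% hit rate, but against a 5% chance rate...
print(tail(300, 30, 0.05))   # p is a few parts in ten thousand: far too unlikely to be luck
```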