Randi/ where are the "hits"?

This is a 44-minute video. I appreciate your responses, but is there a particular section that is relevant?

I will try to watch the whole thing, but it may be a while before I can.

I hesitate to try the math, but:
outcome of interest / total possible outcomes = chance (I am purposefully trying NOT to use academic terminology.)

chance ^ (number of trials) = likelihood of getting the same outcome every time.

Assuming the outcome of interest in this case is that someone does NOT find the Easter egg, then chance = 0.9 (9 empty boxes / 10 boxes).

If I don’t find the egg, who cares?
If you don’t find the egg, who cares?

But 0.9^7 ≈ 0.4783

If 7 people try and none finds the egg, it seems “better than chance” that… in this case, maybe that there is no egg at all. (Because at 0.9^6 ≈ 0.5314, the chance of everyone missing is still above the 50/50 you would expect from a coin flip.)
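A quick scratch-pad check of the arithmetic above (Python used only as a calculator):

```python
# Probability that ALL of n searchers miss the egg, when each
# has a 9-in-10 chance of missing (one egg among ten boxes).
def all_miss(n, p_miss=0.9):
    return p_miss ** n

print(round(all_miss(6), 4))  # 0.5314 – still above a coin flip
print(round(all_miss(7), 4))  # 0.4783 – now below 50/50
```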

Now, maybe the major misunderstanding is right there in terms of “better than chance” and I’m guessing the misunderstanding is mine.

Also note that I’m NOT saying that this test with 7 people is statistically significant. In fact, my WAG is that it is not.

However, using some off-the-cuff numbers we get:
0.9^10 ≈ 0.3487
0.9^100 = … well, I don’t have a decent calculator right now, but I think at that point it would be statistically significant if no one had found the egg. I’m not sure exactly where it becomes “better than chance” or statistically significant, but by the time you get to 100 people trying, you’d have to be well past that point… right?
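To fill the missing-calculator gap: 0.9^100 is tiny, and the all-misses result becomes a sub-5% event long before 100 people try (the 5% cutoff is the conventional threshold, assumed here, not something the thread fixed):

```python
# 0.9**100: probability that 100 independent searchers all miss.
print(0.9 ** 100)  # about 2.66e-05

# Smallest n where everyone-missing first drops below the
# conventional 5% cutoff (an assumption on my part).
n = 1
while 0.9 ** n >= 0.05:
    n += 1
print(n)  # 29
```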

SiXSwords, I’m still trying to understand what you are wrestling with, so forgive me if I don’t appear to be addressing your question, but I’ll try.

Let’s propose a protocol where a single object is placed, at random, underneath one of ten opaque boxes, numbered 1…10.

A person, who does not know where the object is, tries to determine which box it is under. He picks a number from 1 to 10. All boxes are then overturned, and there are two possible outcomes:

  1. The box he chose contained the object (success), or
  2. The box he chose did not contain the object (failure).

This constitutes one test. By chance alone, the odds of anyone finding the object are 1 in 10.

If we are trying to prove that forces other than chance are at work, we need to do quite a few trials; one is not enough. If only chance is at work, the proportion of successful tests will approach 10%; the more tests done, the closer we should get to that number and the more reliable our data set.

But, if other forces are at work, the proportion of successful tests should approach some other number, such as 80%. Statisticians can tell us how high the number needs to be, and how many tests we need to run, to get a significant result.

So, if we begin a series of tests, and in the first one, the object is found where predicted, is that statistically significant? No, we would expect that to happen once in ten tries anyway. The fact that it happened on the first one doesn’t mean anything, unless you are deliberately trying to show something that isn’t true by skewed data selection.
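The ten-box protocol above is easy to simulate; a minimal sketch (the test counts are illustrative) shows the hit rate settling toward 10% as tests accumulate:

```python
import random

# One test: the object is placed at random under one of ten boxes,
# and the guesser picks a box at random. By chance alone, hits
# should approach 10% as the number of tests grows.
random.seed(1)

def run_tests(n_tests):
    hits = sum(random.randint(1, 10) == random.randint(1, 10)
               for _ in range(n_tests))
    return hits / n_tests

print(run_tests(10))       # noisy with only a few tests
print(run_tests(100_000))  # settles near 0.10
```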

I think it’s clear what he’s wrestling with. Put it like this:

It is frequently stated by Randi supporters that the odds in one of his tests are in the region of 1000 to one in the preliminary test.

On the other hand, James Randi claims to have tested over a thousand dowsers, and presumably many other claims.
If a thousand people took a test at 1000-1 odds, chances are fairly high that somebody would have beaten those odds, simply by fluke. And it’s virtually certain that there would have been a few near misses.

The OP wants to know, where are the descriptions of those?

I think we need to define “test” here because as I used it in my previous post, it was a single “run” of having to determine, at one time only, which of 10 locations contained an object.

(I am using quote marks here just to emphasize our terms need to be carefully defined to avoid misinterpretation.)

None of the tests Randi has conducted consist of a single “run”, since this would prove nothing except chance. Each of those “tests” consisted of several “runs”, and enough “runs” have to be passed to make a successful “test”.

I’m absolutely sure no one has passed a “test” given by Randi – that is, one with multiple “runs”.

I have no doubt that a “run” here and there was guessed correctly, but since a million dollars is sometimes on the line, we need to be extra-sure. Extraordinary claims, etc.

For the OP: if you watch some of the YouTube videos of Randi’s TV adventures over the years, not all involving dowsing for water, you will indeed find that sometimes a contestant made a correct guess – is that what you’re looking for? However, as I have pointed out in detail, one correct guess isn’t enough to establish the existence of a paranormal phenomenon. That guess is often accompanied by many wrong ones as part of the same “test”; taken as a whole, the contestant performed closer to chance than to skill.

The point is that on such a test, the average score would be 10%. But some people would score 20% and some 30%, simply by fluke. The probability distribution shows this inevitably happening from time to time. Agree so far?
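How often fluke produces those 20% and 30% scores can be read straight off the binomial tail; a sketch, where the 20-runs-per-test figure is my own illustrative assumption:

```python
from math import comb

# Chance of scoring at least k_min hits in n runs, when each run
# has probability p of success by luck alone (binomial upper tail).
def tail(n, p, k_min):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# With 20 runs at p = 0.10, a fluke score of 20%+ (4+ hits):
print(round(tail(20, 0.1, 4), 3))  # about 0.133 – roughly 1 in 8
```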

However, Randi never tells you about the ones that score 20% or 30%. He only ever tells you about the ones that score exactly the average, or less.

All true, but you’re leaving out a pertinent bit: The dowsers themselves had specified what constituted a successful test. They didn’t go in saying “We can locate water at a rate slightly above average”, they specified that their abilities would lead to an 86% success rate, on average - not 22%. That was the ability they claimed to have.

They even had a dry run (heh) where they knew what pipe held water, and their abilities worked just fine. It was a perfectly fair test: The water dowsers were asked what they could do. Then asked to do it. And they couldn’t.

Feel free to call it a hit that merits further examination, but I’m not seeing why Randi is obliged to do anything but call it what it was: a failed test.

No, the rules of mathematics determine what constitutes a successful test. Dowsers don’t get to change that. Success is determined by statistical analysis of the results. If statisticians determine that 22% is a positive result, it doesn’t matter what the dowsers say.

He’s obliged to report the results fairly, without slanting them, and to comment knowledgeably on what the results mean.

That would be helpful, yes.

Every one I have seen has been very much like the one I linked to. The contestant makes a correct guess after three attempts. (If you watch the vid, BTW, you’ll see his second guess is the one he has to come back to…)

I do recognize the subtext in this and, if it helps, disregard the dowsing aspect.

Imagine I have a ten-sided die with no special characteristics other than the uncanny ability to protect the user from dates.

If I were to use my trusty ten-sider, I would probably have similar results to the contestant in the linked vid.

But, given many contestants all using ten sided dice, someone should hit first and then be retested… right?

I accept that I haven’t watched a sizeable portion of the videos out there, and if you can link to such an episode, I would think that would answer the OP.

At this point, the best answer I have is that that’s not entertaining, so that’s why you don’t see it.

Taken as a simple statistical test, “hitting” on the first try means nothing. It is chance.

I would like to see the various reactions and behaviors of all involved when it happens, regardless of its statistical significance.

Regardless of whether or not it is entertaining.

It’s a part of the test to get the paranormals to state unequivocally what it is they (think they) can do. It’s on them, not Randi, to describe their capabilities.

Yes, yes they do. The dowsers looked at the setup and said: “Our dowsing abilities enable us to perform this test with an 86% success rate.” That was the bet they took.

This is simply not true. The test was designed by the Australian skeptics and the dowsers themselves. The fail/no-fail criteria were agreed to beforehand.
Feel free to argue that another test should have been performed instead, but this was the protocol agreed upon. Who are you to re-write the conditions?

Randi is not doing basic research on dowsing, nor does he have any obligation to do so. He’s in the business of telling people who say “I have paranormal abilities to do X” to put up or shut up. X, in this case, was an 86% success rate on water dowsing. The dowsers were wrong.

No, no, they don’t. Maths is maths. Neither James Randi nor a bunch of dowsers get to change the rules of maths. Just because they agreed with each other doesn’t make it right.

I agree 100% (Or maybe 95% :slight_smile: )

My statistic-fu is too weak to take this much farther, so I hope someone will step in, but I think what we need to do is prepare a curve showing, given pure chance, how many times you would expect a higher (or lower, don’t forget that) score on a test. Then we need to determine:

  1. How many test runs are available for analysis, and
  2. How many of those we would expect to get a higher score: say 20%, or 30%, or any other value.

…then see if we have enough data to work with.

Do you see where I’m going with this? I’m trying to show that until you know what to expect, it is disingenuous to say “I don’t see enough!” – how many do you expect, and is that mathematically supportable?
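Steps 1 and 2 above can be sketched directly; every number below (1000 tests on record, 20 runs per test, p = 0.10) is an illustrative assumption, not a real figure:

```python
from math import comb

# Expected number of fluke "high scorers": (number of tests) times
# (binomial-tail chance of a high score on any single test).
def p_at_least(n_runs, p, k):
    return sum(comb(n_runs, j) * p**j * (1 - p)**(n_runs - j)
               for j in range(k, n_runs + 1))

n_on_record = 1000                 # assumed number of test runs
per_test = p_at_least(20, 0.1, 4)  # chance of a 20%+ score per test
print(round(n_on_record * per_test))  # expected count: 133
```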

The best analogy I can think of on short notice is the Birthday Paradox… given a room full of people with random birthdays (days of the year), how likely is it that two share the same day? Most people figure 365/2, or about 183, but the crossover is at around 23 people. That is, if you have 23 or more people in a room, the chance is greater than 50/50 that some two share a birthday.
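The birthday-paradox crossover is a one-function check (ignoring leap years, as the classic version does):

```python
# Probability that at least two of n people share a birthday:
# one minus the probability that all n birthdays are distinct.
def p_shared(n):
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (365 - i) / 365
    return 1 - p_distinct

n = 1
while p_shared(n) < 0.5:
    n += 1
print(n)  # 23 – the smallest room where a match is more likely than not
```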

So maybe the OP is expecting a greater number than is mathematically supported. Saying “it doesn’t look like enough” isn’t a very good way to approach this in a serious manner.

It’s kind of interesting that the OP was (more or less) “Just by chance, even without extraordinary abilities, some testees will score above average; where are the videos of those tests?” , and Peter Morris jumps in to say “Look, here’s an example where the testee scored above average. Therefore Randi must be wrong about the testee’s lack of extraordinary abilities.”

Frankly I’m not sure why anyone is lecturing SixSwords on probability; he seems to have a pretty good grasp of the basics.

The odds of my reading those words on these boards is roughly 3,720 to one!

I think - and feel free to disagree - that we should leave it up to the dowsers themselves to describe their capabilities. The test was not to discern a possibly statistically significant above-average performance. The test was to see if the dowsers could do what they said. Which they couldn’t.

It’s basic science: Predict, then test. Don’t predict, then test, then insist that a much lower success rate than predicted is a “hit”.

I did not say that, nor imply it. That is a gross distortion of my statement.

The question was, what Randi does when faced with an above average score. And the factual answer is that he adjusts the score in his own favour.

I have stated repeatedly that the result is likely due to fluke or procedural error, not dowsing skill.

I have never said that the dowsers’ claims are real.

See, that’s the problem with these so-called tests of paranormal abilities based on statistics and probabilities. This type of maths is the wrong scalar to test paranormal events. It is not the right algebra.

I don’t know what you mean by ‘adjusting the score’, since the dowser (or other paranormal practitioner) agrees a specific claim about his abilities before the test starts.
If he fails to achieve this, it’s a failure.
For example, if someone claims he can predict the result of coin tosses 98% of the time, it doesn’t matter if he scores 52%.

Here is an example of Randi testing a dowser as part of the preliminaries for the million dollars.

For a start, the fact that Randi broke the agreement, which he had previously agreed was fair. He added on the scores of two totally separate and unrelated tests. There was no provision for doing so in the agreed upon rules.

By the way, can we please stick to the question in the OP, which is: what does Randi do with the hits?

If you run enough tests, there WILL be flukes and abnormally high scores. That is a certainty; it’s just the basic laws of probability in action. The question is, what does Randi do when they happen? Please do stick to the subject.