Help me to design an experiment

This isn’t a thread about Randi, it’s a thread about how to design some kind of test of someone who claims they can dowse as detailed by the OP. We’re not discussing a test by Randi.

Regarding your question, if someone claims they can dowse for water with 80% accuracy and working together we come up with a test and a series of trials that the testee agrees he can hit 80% on, and we further agree that in this test our definition of “success” is 80% and “failure” is less than 80%, then if he scores 30% he failed, by the terms of the test.

If he has any abilities, they are, at the least, far less accurate than he himself claimed.

If he doesn’t think he can hit 80% but instead 30% then by all means let’s design another test to let him hit that mark.

I do agree that if we ran a test where the expected random outcome is 10% and someone scores 30% even though they claimed they could do 80%, 30% is still greater than random chance and I’d be interested to see what happened, but I would not let the testee go around claiming they had “passed the test” since by the agreed-upon terms they did not.

If someone claims they’ve got a particular ability, specifies up front what that ability is and what they will demonstrate, and then doesn’t do it, they haven’t passed that test. Period. They may have provided evidence of something interesting and given me the impetus to run further tests (heck, maybe that guy really does have dowsing powers that are just 30% accurate instead of 80%), but it’s silly to set a mark, fail to reach it, and then claim success anyhow.

So back to a test. Earlier you said that you’d give a specific example which I took to mean a specific example of a test that you’d consider acceptable.

Now you just said that you don’t have any background in related fields and that anything you say would be wrong…so what makes you think that the theoretical test I outlined is wrong or invalid?

No, you’re changing what you said earlier. A score of 80 is required to pass this test, as was agreed to by the test-taker.

You can’t go around changing the terms after the fact: saying that the test-taker scored higher than you expected him to doesn’t make him a winner, it just makes him a loser who scored slightly higher than chance.

If a guy comes up to me and claims that he can jump 10 feet in the air, and I bet him he can’t, and then he jumps 4 feet in the air, that’s a nice jump, but 4 ain’t 10, and he’s a loser.

Yes, that is correct. It is, indeed, advice about how to run a test.

So, when someone says that it should be done such a way, because that’s how Randi always does it, it is appropriate to point out that Randi is stupid and a liar.

Randi’s way of conducting a test is not a good way. Full stop.

Missed the edit window:

As my earlier math made clear, as your claimed accuracy gets closer to the level of random chance, the number of trials you have to run to get a statistically significant result goes up. Remember, with a claimed 100% accuracy you’d only need 6 trials to hit the level where random chance would get that same result one time in a million, on average, whereas if you drop to a claimed accuracy of 20%, then even doing 100 trials only gets you to a bit below one chance in a hundred.

The point of this is that if we design a test around a claimed accuracy of 80%, and the observed outcome is 30%, that 30% is far less significant over that number of trials than 80% would have been.
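The trial-count arithmetic above can be checked with a short script. This is a minimal sketch assuming each trial has a 10% chance of a lucky hit, which matches the one-in-a-million figure for six straight hits; the thread never fixes the exact trial design, so the setup is hypothetical:

```python
from math import comb

def tail_prob(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of scoring
    k or more hits out of n trials by pure luck."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Claimed 100% accuracy: six straight hits, each with a 10% chance
# by luck, happen by chance about one time in a million.
print(tail_prob(6, 6, 0.1))

# Claimed 20% accuracy over 100 trials means 20 hits; how often
# does pure luck do at least that well?
print(tail_prob(100, 20, 0.1))
```

The second figure makes the post's point: a weaker claim needs far more trials before a matching score becomes hard to explain as luck.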

I’m choosing not to debate you because (1) this is GQ, (2) the questions you’re asking are not on point for the thread, (3) you still haven’t bothered to support your assertion that Randi’s tests are flawed, and, most importantly, (4) I simply apply your own standard to your assertions about statistics.

But what if someone with a maths degree says that 30% is statistically significant?

Let’s try an easy one. Significant at what level of confidence?

Let’s say 99%.

I’m told that’s “highly significant.”

I certainly never said that. I may have missed it but I don’t see where anyone else said that.

Again, since you said that anything you say is undoubtedly wrong what weight should be given to your pronouncements as to why any particular methodology is incorrect?

But let’s leave that aside - I know you said that you don’t have any special background in math or science. That’s fine. I am not a professional or an expert in either area myself, although I have a good educational background in related fields. One doesn’t have to be an expert to have an informed opinion or to ask relevant questions.

So what do you think is wrong/bad/poorly designed about my suggested set of trials? Any questions about the math, how anything was calculated? Anything that you think can be done better?

Heh. Ask a social sciences statistician and they’ll say that an 80% confidence interval is significant, meaning that if a hundred blind monkeys took the test, 10 of them would score over 30. If that’s his proof of dowsing, I’m not impressed.

No, let’s not “say 99%.” You’re providing a specific example. What was the confidence interval used by the “person with a maths degree” to claim his result was statistically significant?

Not true. When I did studies (not, I hasten to add, on dowsing), I held out for at least .05. :slight_smile:

He’s not saying “let’s say,” he’s asking “what was said?” Otherwise, let’s say 20% is significant, and 40 blind monkeys would score over 30.
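The blind-monkey arithmetic in these posts can be sketched in a few lines, assuming "significant at confidence C" means clearing the upper bound of a two-tailed interval, so half of the leftover probability (1 − C) sits in the upper tail. That two-tailed reading is an assumption on my part; the posters never say one-tailed or two-tailed, but it reproduces both figures quoted here:

```python
def chance_scorers(confidence, n_testers=100):
    """Expected number of pure-chance testers ("blind monkeys") who land
    above the upper bound of a two-tailed interval at this confidence level."""
    return n_testers * (1 - confidence) / 2

print(chance_scorers(0.80))  # ~10 monkeys clear an 80% interval
print(chance_scorers(0.20))  # ~40 monkeys clear a 20% interval
print(chance_scorers(0.99))  # well under one monkey at 99%
```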

99%

By the way, the person in question was Arthur C. Clarke, who had a maths degree.
(and a few others)

Are you cleverer than he?

More importantly, I’m demonstrating that Peter Morris is trying to argue vehemently that we ought to listen to some statistician’s results without even knowing what that statistician determined, what he did, whether the test suited the data set, and so on…

Not enough information. What was the standard deviation? And cite that the CI was 99%, please.

Well, at this precise moment, yes–since in his current condition, his “cleverness” is fairly minimal. More importantly, the person’s name is utterly irrelevant to whether they calculated statistical probabilities correctly.

Wow, that guy is smart. At 99% only one blind monkey would be expected to score over 30%. Maybe that blind monkey is a secret dowser.

Actually, give me the mean also.

Edit: actually, just link us to the data. Don’t really care what Clarke said; I’m sitting in front of a computer, I can do the calculation myself.

And you still dodge the question.

Probably important to get the n as well. Also, what test did he use to establish significance?
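The figures being demanded here (n, mean, standard deviation, and which test was used) are exactly the inputs a significance calculation needs. A minimal sketch using a one-tailed z-test with the normal approximation to the binomial; all the numbers are hypothetical, since the thread never supplies the actual data:

```python
from math import erf, sqrt

def one_tailed_p(observed_rate, chance_rate, n):
    """One-tailed p-value for scoring observed_rate over n trials when
    chance alone gives chance_rate, via the normal approximation."""
    sd = sqrt(chance_rate * (1 - chance_rate) / n)  # standard error under chance
    z = (observed_rate - chance_rate) / sd
    return 0.5 * (1 - erf(z / sqrt(2)))  # upper-tail area of the normal curve

# Hypothetical example: 30% hits over 50 trials when chance alone gives 10%.
print(one_tailed_p(0.30, 0.10, 50))
```

Without n, the same 30% score can be anything from unremarkable to wildly improbable, which is why the posters keep asking for it.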