Question About Bayesian Analysis

DMC · August 13, 2006, 5:43pm

Are you saying that false rape claims should come from the general population, instead of being applied as a correlation to valid rape claims? If so, I have no problem with changing the scenario to match that, even if I don’t agree that that is a given. Let me know and I’ll change the parameters accordingly.

ultrafilter · August 13, 2006, 5:51pm

Meeper Brown is only more credible if you have no other information about the two cases. And in any real-world scenario, that won’t happen.

psychloan · August 13, 2006, 6:10pm

Of course.

It makes a huge difference, because there is a wild imbalance between the rates of inter-racial rape, as opposed to the rates of rape in general.

DMC · August 13, 2006, 6:13pm

Cool. Then we can change the parameters as shown below.

2,964,000 white females
872,000 black females

1,398 actual white rape victims
790 actual black rape victims

I went with a general population false rape claim of .02%, which, while rounded off, maps pretty well with the mean of the extremes as noted above. So, with the change to false claims coming from the general population, we have:

1,991 white women who claimed rape
964 black women who claimed rape

593 white women out of 1,991 were lying
174 black women out of 964 were lying
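
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes, as the post states, that the 0.02% false-claim rate is applied to each group's whole female population; the function and variable names are made up for illustration.

```python
# Figures taken from the post above. The 0.02% false-claim rate is applied
# to each group's whole female population (the post's stated assumption).
FALSE_CLAIM_RATE = 0.0002  # 0.02%

groups = {
    "white": {"population": 2_964_000, "actual_victims": 1_398},
    "black": {"population": 872_000, "actual_victims": 790},
}

def claim_breakdown(population, actual_victims, false_rate=FALSE_CLAIM_RATE):
    """Return (false claims, total claims, share of claims that are false)."""
    false_claims = round(population * false_rate)
    total_claims = actual_victims + false_claims
    return false_claims, total_claims, false_claims / total_claims

results = {name: claim_breakdown(**g) for name, g in groups.items()}
for name, (false_claims, total, share) in results.items():
    print(f"{name}: {false_claims} false of {total} claims ({share:.1%})")
```

Under these inputs the false share comes out at roughly 30% for the first group and 18% for the second, which is the gap discussed in the next posts.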

Anything else you need for this?

psychloan · August 13, 2006, 6:26pm

Nope.

And under your example, a white woman’s claim of rape is actually a bit less credible than a black woman’s.

When you think about it, this makes sense because, putting aside the issue of inter-racial rape, black women are raped more frequently than white women.

DMC · August 13, 2006, 6:41pm

While I don’t agree that white women claiming rape in North Carolina are less credible, I also knew that the real numbers would skew it the other way, but quite a bit less so than the hypothetical you were basing your previous math on. I think that shows that you aren’t entering into this with any preconceived notions, thus allowing the debate to continue on the applicability of Bayesian Analysis without having to spend as much time defending the result itself.

you_with_the_face · August 13, 2006, 6:51pm

I’m just curious what kind of Bayesian math led you to this conclusion, psychloan.

Unless you have some data that none of us are privy to, this conclusion has no basis in Bayesian reasoning or any other kind of reasoning.

psychloan · August 13, 2006, 8:18pm

That’s an interesting question, but first – do you now accept my calculations? Do you accept that even if two groups are equally likely to lie, a claim by a member of one group might be more credible than the same claim made by a member of another group? Do you accept that there is not necessarily a contradiction there?

you_with_the_face · August 13, 2006, 8:37pm

Did you notice that I never disputed your calculations? My issue has always been your assumptions and the way you worded your conclusion. Never the math. I took one look at the algebra that you initially came up with and realized that you were making things a lot more convoluted than necessary.

Perhaps we are working off a different definition of credible? To say that someone is less credible is, to me, equivalent to saying that they are not as honest. That’s why I still believe that despite what your math says, you need to refine your conclusion so that it doesn’t contradict your assumption.

I still believe that you (and others in the Pit thread) are simply committing the creationist’s error of applying probability in reverse. You are looking at the odds that a given claimant was raped and using that to gauge their veracity, instead of just looking at the chances that they are telling the truth. Noel’s first post in this thread expressed this point pretty damn well, which is why I was troubled to see you dismiss it as you did.

RaftPeople · August 13, 2006, 8:43pm

kellner:

I hope I didn’t make too many mistakes, but here is my take:

We have 13333 reports. 25% of those are false, that leaves about 10000 true incidents, assuming that all incidents are reported.

Among those are
9900 fleeper-on-meeper and
100 meeper-on-fleeper incidents.

There are 3333 false reports and 1990000 people who haven’t been tickled. So the probability of someone who hasn’t been tickled reporting incorrectly is about 0.0017.

Therefore the
990100 meepers who haven’t been tickled file about 1658 false reports and the
999900 fleepers who haven’t been tickled file about 1675 false reports.

Meepers file 9900 true and 1658 false reports (=11558): about 14% are false.
Fleepers file 100 true and 1675 false reports (=1775): about 94% are false.

(please excuse my totally unsystematic rounding)
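
For anyone who wants to rerun the quoted numbers, here is a minimal Python sketch. The group size of 1,000,000 each is inferred from the figures in the post (990100 + 9900 and 999900 + 100) rather than stated there, and the variable names are illustrative only.

```python
# Assumed setup, inferred from the quoted post: two groups of 1,000,000 each,
# and every untickled person equally likely to file a false report.
GROUP_SIZE = 1_000_000
FALSE_REPORTS = 3_333                              # about 25% of 13,333 reports
true_reports = {"meeper": 9_900, "fleeper": 100}   # reports filed by actual victims

untickled = {g: GROUP_SIZE - n for g, n in true_reports.items()}
false_rate = FALSE_REPORTS / sum(untickled.values())   # about 0.0017

false_share = {}
for g in true_reports:
    false_count = untickled[g] * false_rate
    false_share[g] = false_count / (true_reports[g] + false_count)
    print(f"{g}: about {false_count:.0f} false reports, "
          f"{false_share[g]:.0%} of that group's reports")
```

Without the intermediate rounding, the shares still land at about 14% and 94%, so the rounding in the quoted post does not change the picture.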

I know that DMC, psychloan and you with the face have been circling around this type of point (when I see the posts with actual statistics), but I just wanted to explicitly state something that is implied in the OP and this solution.

The assumption is that the reports of tickling are equally distributed between populations. This is a huge assumption and determines the outcome. By adjusting the ratio of reporting between populations, you could end up with any result you want, including almost all of the false reports from one population with almost zero reports from the other.

Unless I’m missing something (which I often do), this is a rather trivial example that didn’t get us any closer to understanding the “credibility” of the original statement.

ultrafilter · August 13, 2006, 8:56pm

To me, “credible” is a jargon term. I think psychloan is using it that way, but could have saved himself a lot of heartache by explaining that. The meaning ties in to another point I want to address, after another quote.

No, he’s not. The creationists are arguing that the low likelihood of an event’s occurrence implies that it did not occur. This is the argument from incredulity, and is a well-known logical fallacy.

What we’re discussing here is the subjectivist interpretation of probability. The event either happened or it did not, but we don’t know which of those two alternatives is true. How much faith should we put in the claim that it did happen? That’s what’s being expressed as a probability here. In Bayesian jargon, what we’re discussing is the prior probability that the crime in question did happen. The degree of credibility of a claim is simply the prior probability that we assign to it, so to say that claim A is more credible than claim B is to say that claim A has the higher prior probability.

I’ve already pointed out the error in the simple Bayesian analysis a couple times, but since no one has acknowledged it, let me repeat it: we are not trying to determine whether a suspect attacked a victim; we are trying to determine whether this suspect attacked that victim. There is other evidence available, and that has to be taken into account.

I’d be more inclined to treat this as a classification problem, but that’s a separate discussion.

I don’t see how it affects the Bayesian analysis. Would you mind explaining?

RaftPeople · August 13, 2006, 9:02pm

If the distribution was 90/10, then out of 3333 false reports, only 333 were from the population that had 100 positives (can’t remember the name).

100/333 is a better accuracy rate than 100/(3333/2)

RaftPeople · August 13, 2006, 9:06pm

On re-reading, I made a little mistake in the 100/333. Should be 100/433 (pos+neg), but you get the idea
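
A small Python sketch of the corrected comparison, assuming the same 100 true reports and 3333 false reports as in the worked example; `true_share` is a made-up helper name.

```python
TRUE_FLEEPER_REPORTS = 100
FALSE_REPORTS = 3_333

def true_share(false_fraction_from_fleepers):
    """Share of fleeper reports that are true, given the fleepers' share of false reports."""
    false_from_fleepers = FALSE_REPORTS * false_fraction_from_fleepers
    return TRUE_FLEEPER_REPORTS / (TRUE_FLEEPER_REPORTS + false_from_fleepers)

even_split = true_share(0.5)   # false reports split 50/50
skewed = true_share(0.1)       # fleepers file only 10% of the false reports
print(f"50/50 split: {even_split:.1%}   90/10 split: {skewed:.1%}")
```

The denominator is true plus false reports, which is the 100/433 form from the correction rather than 100/333.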

you_with_the_face · August 13, 2006, 9:11pm

Occur naturally, which is why they believe in God. But I’m with you here.

Which seems to me exactly what creationists believe. We are looking at an event that has already happened and looking at the odds that it happened, and using that to say how credible any given report of such is. How is that any different than saying that the odds of winning the lotto are small, therefore anyone who reports winning the lotto is less credible than someone who reports winning a door prize at a company party? Perhaps I’m missing something here.

I agree that this is a valid objection. Since we don’t know anything about either claim, it’s best to reserve judgement until evidence comes out.

ultrafilter · August 13, 2006, 9:16pm

The creationists are insisting that the event in question did not happen because of the low likelihood. The subjectivists are merely expressing a low level of confidence in a claim because of the low likelihood, but allowing for the possibility that it did.

From a strict subjectivist viewpoint, the lottery winner’s claim is less credible than the door prize winner’s claim, in the absence of any other information. You have to read “credible” here exactly as I defined it above to understand their reasoning.

RaftPeople · August 13, 2006, 9:50pm

Here it is with the numbers actually worked out with a distribution other than the 50/50:

Distribution of fleeper-on-meeper reports=99%
Distribution of meeper-on-fleeper reports=1%

We have 13333 reports. 25% of those are false, that leaves about 10000 true incidents, assuming that all incidents are reported.

Among those are
9900 fleeper-on-meeper and
100 meeper-on-fleeper incidents.

Out of 3333 false reports, meepers file 99% = 3300
Out of 3333 false reports, fleepers file 1% = 33

Meepers file 9900 true and 3300 false reports (=13200): about 25% are false.
Fleepers file 100 true and 33 false reports (=133): about 25% are false.

I could run the numbers again and make the Fleepers more “credible” than the Meepers. Unless you can accurately supply the rate of reporting, then this analysis has not moved us closer to a conclusion regarding Bricker’s statement. I guess you could say that because he said something like “all things being equal” the OP fits because you would have to assume an equal reporting rate, but I don’t read that into his statement and I’m not sure arbitrarily choosing an equal reporting rate is any better than choosing any other reporting rate.
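
To see how much the assumed split drives the result, here is a minimal Python sketch that sweeps the meepers' share of the 3333 false reports. The split values are arbitrary inputs chosen for illustration, not data, and `false_shares` is a hypothetical helper.

```python
TRUE_REPORTS = {"meeper": 9_900, "fleeper": 100}
FALSE_REPORTS = 3_333

def false_shares(meeper_fraction):
    """Fraction of each group's reports that are false when meepers file
    `meeper_fraction` of all false reports (an assumed input, not data)."""
    false_counts = {
        "meeper": FALSE_REPORTS * meeper_fraction,
        "fleeper": FALSE_REPORTS * (1 - meeper_fraction),
    }
    return {g: false_counts[g] / (TRUE_REPORTS[g] + false_counts[g])
            for g in TRUE_REPORTS}

for fraction in (0.50, 0.90, 0.99, 0.999):
    shares = false_shares(fraction)
    print(f"meepers file {fraction:.1%} of false reports -> "
          f"meeper {shares['meeper']:.1%} false, fleeper {shares['fleeper']:.1%} false")
```

At a 99/1 split the two groups come out equal, and pushing the split further makes the fleepers' reports the more reliable ones, so the conclusion follows from whichever split is assumed.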

Noel_Prosequi · August 14, 2006, 4:37am

I have no doubt you can crunch numbers like a Banshee. But I am convinced you are wrong. I am not even prepared to go as far as ultrafilter when s/he says

For mine, probability analysis provides no valid way of determining the credibility of a specific complainant, whether or not further information is available.

Perhaps it is, as YWTF says, a matter of different meanings of credibility?

If so, can I suggest a way to test that.

Any real-world, specific rape alleger will differ from her alleged attacker and the population generally not only by race. It is possible to place the complainant into an almost infinite variety of binary divisions by which she might differ from her alleged attacker and/or the general population. Eye colour, hair length, voting habits all spring to mind. Even down to genes - let’s pick the D1S80 locus and say she is a 25 and her attacker is a 29. I think I’ve picked a locus that is race-independent, but if not, pick one that is.

In principle, then, it is possible to calculate the false allegation rate (and so on, as has already been done for race) across each of the newly chosen axes of division as they arise in the population. One then plugs these numbers into the Bayes Credibility Engine, and gets for each axis of division a new (and undoubtedly different) calculation of “credibility”.

How can the same, real woman have a different credibility depending on whether you consider whether she’s a D1S80 25 or a Republican? Every single complainant will have, in principle, a bazillion different “credibilities” depending on how you look at her. I venture to suggest that if you think this is sensible, you are using the word “credibility” in such a way that is so remote from its usual uses as to be meaningless. Conclusion? The Bayes Credibility Engine does not work.

RaftPeople · August 14, 2006, 5:14am

The only way this can be true is if we make the assumption that as the frequency of an event decreases, the percentage of false claims by humans of that event increases.

Has this been shown to be the case?

Polerius · August 14, 2006, 7:31am

Look at it this way:
Say you have two uncles, both of whom are equally honest and trustworthy.

One day, uncle Joe calls you and tells you that he saw a giraffe running down the street.

The same day, uncle Jack calls you and tells you that he saw Mikhail Gorbachev giving Alec Baldwin a haircut in the middle of the street while five monkeys were licking his balls.

Upon hearing the news from uncle Joe, you might think it’s a bit weird, but could happen, so you might question him a bit to get more info about what happened, but you would easily be convinced that it did happen.

Upon hearing the news from uncle Jack, it’s pretty clear that you would find it unbelievable, and even though you think he’s just as honest as uncle Joe, the sheer incredibility of what he is saying makes you want to ask a lot of questions and dig deeper into the subject than you did with uncle Joe.

It might turn out, after you collected some facts about the two cases, that in fact, uncle Joe was pulling your leg and uncle Jack was telling the truth.

But, prior to any other facts about the two specific cases arising, you will simply be much more skeptical of uncle Jack’s report than uncle Joe’s.

Being more skeptical about one report vs the other means that you consider one report more credible than the other (again, until further facts about the two specific cases arise), even though you consider both uncles to be equally honest.

I don’t think you need Bayesian analysis or any other math to see that this is the case. (Bayesian analysis simply formalizes and quantifies the disparity in “credibility” of the two reports)
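
For what that formalization might look like, here is a minimal Python sketch of Bayes' theorem applied to the two uncles. Every number in it is hypothetical and chosen only for illustration: both uncles get the same chance of reporting something that did not happen, and only the prior probability of the event differs.

```python
def posterior(prior, p_report_if_true=1.0, p_report_if_false=0.01):
    """P(event happened | it was reported), by Bayes' theorem."""
    reported_and_true = prior * p_report_if_true
    reported_and_false = (1 - prior) * p_report_if_false
    return reported_and_true / (reported_and_true + reported_and_false)

# Hypothetical priors: a giraffe in the street is rare, the haircut scene far rarer.
# Both uncles share the same 1% chance of reporting an event that did not happen.
giraffe = posterior(prior=1e-3)
haircut = posterior(prior=1e-9)
print(f"uncle Joe's giraffe:  {giraffe:.3g}")
print(f"uncle Jack's haircut: {haircut:.3g}")
```

The two reporters are equally honest by construction, yet the reports end up with different posteriors because the priors differ. The exact figures mean nothing beyond the inputs assumed here.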

Noel_Prosequi · August 14, 2006, 10:55am

Polerius:

Look at it this way:
Say you have two uncles, both of whom are equally honest and trustworthy.

One day, uncle Joe calls you and tells you that he saw a giraffe running down the street.

The same day, uncle Jack calls you and tells you that he saw Mikhail Gorbachev giving Alec Baldwin a haircut in the middle of the street while five monkeys were licking his balls.

SNIP

But, prior to any other facts about the two specific cases arising, you will simply be much more skeptical of uncle Jack’s report than uncle Joe’s.

Being more skeptical about one report vs the other means that you consider one report more credible than the other (again, until further facts about the two specific cases arise), even though you consider both uncles to be equally honest.

I don’t think you need Bayesian analysis or any other math to see that this is the case. (Bayesian analysis simply formalizes and quantifies the disparity in “credibility” of the two reports)

No problem with that. That is using legitimate reasoning based in inherent credibility issues internal to the stories to assess the truth. Not really necessary to talk about “probabilities” to do that. The Bayesian approach provides a number, but one which is so dependent on subjective assessments of probability that the illusion of numerical accuracy is more likely to mislead than illuminate.

What I (and I think others) object to is trying to do the maths with something completely irrelevant to an assessment of credibility of Joe or Jack; it is as if you tried to use the “deception history” of their age cohort, or of people in their respective ethnic backgrounds before even hearing the details of their story to determine which was more likely telling the truth.

By the way, I wish I could turn that Gorbachev thing of yours into a band name…
