In my first year stat book we are given the equation for figuring out the probability of success for a binomial experiment. One of the conditions however is that the probabilities must remain constant for each trial. How do you compute the probability of x number of successes when each success decreases the chances of another success happening

Why would this be true?

A quick example. Suppose you toss a coin three times. Say we’re looking for the probability that we will get exactly two heads. A plot of what the different outcomes would be is here:

TTT TTH THT HTT THH HTH HHT HHH

That’s every possible combination that we could get by tossing a coin three times.

The probability that each combination will occur is 1 in 8, since there are only 8 different combinations. However, in three of these eight the conditions we specified have occurred, thus success. So, the probability for success is 3/8.

The chance for success does not decrease each time because we are using a sample with replacement, meaning we don’t take away a coin each time we flip it, we put it back in, and the chances for the next flip are the same as the chances were for the first flip, however the binomial probability is as close a measure as we can get to the probability were we able to flip the coin an infinite number of times.

Does that make any sense?

In the case of non-constant probabilities of success, you can’t use the binomial distribution. In that case, the probability of k successes in n trials is a bit harder to calculate, and the formula’s a bit tougher to write up here. We do still have to assume independence, btw.

Suppose the probability of success on the ith trial is p[sub]i[/sub]. Let S = {1, 2, 3, …, n}, and let S[sub]k[/sub] denote the set of all k-membered subsets of S. Then the probability of success can be computed as follows:[ol][li]Start with p[sub]k[/sub] = 0. For each element s of S[sub]k[/sub], do steps 2 through 3.[*]Start with p[sub]s[/sub] = 1. For each i in s, multiply p[sub]s[/sub] by p[sub]i[/sub]. For each j not in s, multiply p[sub]s[/sub] by 1 - p[sub]j[/sub].[*]Add p[sub]s[/sub] to p[sub]k[/sub].[/ol]See? Told you it was unpleasant.[/li]

I leave it as an exercise to the reader to show that the sum over k from 0 to n of p[sub]k[/sub] is equal to 1.

You’ll find yourself on a slippery slope if the probability of each trial depends on the outcome of the trials that have gone before. Therein lies the basis for a statistical model of feedback, or cooperative processes.

The more I think about it, the more I’m sure that I’m not assuming independence. Strike that.

The type of decreasing probability problem I’m familiar with is drawing from a finite pool without replacement. For example, the probability of drawing a heart from a deck of cards, when each card drawn is not put back in the deck.

The first card drawn: 13/52 chance of success

Second: 12/51 chance of success

etc.

So, for n cards (n <= 13), drawn from the deck, the probability that they are all hearts is:

(13 * 12 * … * [14-n]) / (52 * 51 * … * [53 - n])

or (13!) * (52 - n)! /(13-n)! / (52!)

If you think about this, it’s the same problem as if you drew

all of the cards at once, and asked the probability that they were all hearts.

It all depends on how the successes affect future successes. This could be extremely complicated. But I’ll give you an easy example.

**Example:** Suppose we have a bag with R red balls and B blue balls. If I were to draw n balls at random from the bag, what is the probability that k of them are red? Clearly, each time a red ball is drawn, we have smaller chance of drawing another. After doing some figuring, the answer turns out to be P(R,B,n,k)=C(R,k)*C(B,n-k)/C(R+B,n), where C(x,y)=x!/(y!(x-y)!).

P.S. I just noticed **NE Texan** posted a similar response. Hopefully, our answers are consistent.

According to this site and what I remember from my textbook, in order to qualify as a binomial probability distribution the experiment must meet three requirements:

The trials would have to be independent. The outcome of your second trial could not be affected by the first, meaning that the sample would have to be performed with replacement, i.e. the sample for each trial would be the same.

This means if you were counting the toss of a coin, in order for it to fall under a binomial probability distribution you would have to replace the coin each time you tossed it. Same for a deck of cards, you would have to replace the cards already drawn before the next trial.

**XJETGIRLX**: Your description of the binomial distribution is accurate. However, there are many instances where one of those assumptions is violated, but we would still like to compute the probability of k successes in n trials.

What I posted earlier will work in general, no matter which of the three assumptions fails–even with all of them gone, you can still do what I did. In my post there was a tacit assumption that p[sub]i[/sub] is constant, but that doesn’t even need to hold.

You will find that if you make those three assumptions and follow my algorithm, the binomial distribution will pop out at you.

**ultrafilter** and others, maybe you can give me a little insight on a problem related to the OP.

Pleonast’s Problem:

Consider the repeated flipping of a **biased** coin (probability *p* for heads, *q* for tails, *p*+*q*=1). If I flip the coin *N* times, what is the probability of having at least *k* tails **in a row**, somewhere in the entire sequence. Thus, order matters.

Pleonast’s Solution (Recursive Summation):

P(N,k) = q^k + (N-k)*p*q^k - p*q^k*Sum(j=0 to N-k-1,P(j,k)).

(Simple Recursion):

P(N,k) = P(N-1,k) + p*q^k*(1-P(N-k-1,k)).

P(N,k) = 0 if N<k or N<=0 or k<0.

P(k,k) = q^k if k>0.

These solutions are correct, but I consider them inelegant (because of the recursion). I’m looking to find a non-recursive form. Any ideas?

Honestly, the recursive solutions are likely to be the most elegant ones. IME, the solutions to all but a few special types of recurrence relations end up being really ugly.

I’ll take a look at it a little later on.

To throw in terminology in case people are looking for it … the situation that **NE Texan** and **Pleonast** described is ‘sampling without replacement’ and leads to the hypergeometric distribution . And it might be possible to work for the OP if each success lowers probability by a constant amount. It’d at least look the same if you could pick the proper finite values to approximate it.

**ultrafilter**, your algorithm is probably the simplest approach, but do you know a way to formulate it for the case of the OP in which the probability changes depending on the number of successes? It seems to me it gets really ugly since the p[sub]i[/sub]'s are changing.

I tried to come up with a way to say it using p[sub]a[/sub] to represent the probability after *a* successes, but don’t have enough time to work it out to see if it becomes easier.

Here’s my approach in words, at least :

I’m trying to find P(k), the probability that we have k successes in n trials. p is a probability, q is (1-p).

If I wanted just two successes, I would take the probability of getting a success at s [p = p[sub]0[/sub]q[sub]0[/sub][sup]s-1[/sup]] multiplied by the probability of getting one success in the remaining trials [p = (n-s)p[sub]1[/sub]q[sub]1[/sub][sup]n-s-1[/sup]]. Then sum over s = 1 to n-k.

So for three or more, I take the prob. of getting a success at some i, multiplied by the case of getting two in the remaining trials, using s = i to n-k, etc.

There will be a common multiplier of p[sub]a[/sub] from a=0 to k-1, and possibly some common q[sub]a[/sub], but I’m not sure if it simplifies any more.

That is a problem, and I don’t see any way around it.

The hypergeometric distribution will not cover **Pleonast’s** problem. It counts the number of trials required for a total of k successes in a sequence of Bernoulli trials.

Sorry, I meant **Pleonast**’s first post in the thread; I should have been more specific.

And forget the other stuff I wrote up above, it’s not only confusing but wrong, since it calculates the probability of at least k successes; I was intending to go for exactly k.

I think that using the p[sub]a[/sub] terminology as I did, you’d be best to use some particular order, but it still amounts to the algorithm **ultrafilter** posted.

What if k = 0?

so far everyone’s answers have been helpful and confusing. I realize that it would be helpful of me to give the specific problem.

If you roll a six sided die 10 times what are the chances of getting a 1,2,3,4,5, and a six in any order? I solved the problem (I think) with about twenty lines of C++ code but I want to have a more general usage equation. I called it a binomial problem because you can have either a success (which decreases your chances of another success) or a failure (which does not change your odds).

p.s. the odds of success are 27.182%

That’s actually a problem better suited for the Poisson distribution, I think.

Poisson would likely be better if there were a lot of ‘sides’ to the dice (p small), but there are other ways.

If I’m interpreting this correctly, you can get the answer using the generalized form of the binomial distribution, the multinomial distribution.

Suppose there are m different possible results, each with probability p[sub]i[/sub], and p[sub]1[/sub] + … p[sub]m[/sub] = 1. Let N[sub]i[/sub] denote the number of results in the ith category after n[sub]1[/sub] + n[sub]2[/sub] + … n[sub]m[/sub] = n trials.

Then P(N[sub]1[/sub] = n[sub]1[/sub], N[sub]2[/sub] = n[sub]2[/sub], … N[sub]m[/sub] = n[sub]m[/sub]) = [ n! / (n[sub]1[/sub]!n[sub]2[/sub]!..n[sub]m[/sub]!) ] * p[sub]1[/sub][sup]n[sub]1[/sub][/sup]*p[sub]2[/sub][sup]n[sub]2[/sub][/sup]*…*p[sub]m[/sub][sup]n[sub]m[/sub][/sup]

However, this gives you the answer for an exact number, and you’d still need to do a bunch of calculations for the ‘at least’ value.

The easiest way I see to do it using this method is to calculate the failures (1-p).

In other words, P = 1- [ P(no 1s) + P(no 2s) + … P(no 6s) - (P(1=0, 2=0,else = 10) + P(1=0, 3=0, else = 10) + … +P(5=0, 6=0, else = 10) + P(1=0, 2=0, 3=0, else = 10) …)

Each ‘set’ of subtraction terms will be be identical so it’ll simplify, but I haven’t worked out the total number of terms. There’s probably a better way to do this, but I never learned much beyond two random variables.

I solved it using combinatorial methods. There are 6[sup]10[/sup] possible outcomes when you roll a 6-sided die 10 times. As a point of notation, we shall write (5,1,1,1,1) to mean that the rolls gave 5 of one number and 1 each of the other five; similarly for the other outcomes. Now we can get (5,1,1,1,1) with any of the six numbers corresponding to the ‘5’ position; for each of these the number of outcomes is 10C5*5! Similarly
(5,1,1,1,1,1) 6*10C5

*5!*

(4,2,1,1,1,1) 3010C4

(4,2,1,1,1,1) 30

*6C2*4!

(3,3,1,1,1,1) 15

*10C3*7C3

*4!*

(3,2,2,1,1,1) 6010C3

(3,2,2,1,1,1) 60

*7C2*5C2

*3!*

(2,2,2,2,1,1) 1510C2

(2,2,2,2,1,1) 15

*8C2*6C2

*4C2*2!

The total number of outcomes which include at least one of each number is thus 16,435,440 and the probability is 16,435,440/6[sup]10[/sup] = 0.272, as **caffeine_overdose** said.