There are lots of situations where the actual odds of success are unknown or extremely difficult to determine. If you are trying for a success but all you have so far are failures, how many more attempts should you expect to try?
For example, I read that someone found money in the pocket of some clothes from Goodwill. Ever the optimist, I decide I’m going to strike it rich by going to my local Goodwill and searching for money in the pockets of the clothes there. After checking 1000 pockets, I have not found anything. What’s the minimum number of additional attempts I should expect to make before giving up in frustration?
Can I say that since I’ve had X failures, I should expect to make an additional Y attempts before expecting to find a success? Not necessarily that X+Y attempts will produce a success, but that after X failures it’s unlikely I’ll find a success without an additional Y attempts.
This would not be an all-or-nothing result, and the risk/reward formula would depend on a great deal more information. Other factors necessary to take into account, if you want a mathematical formula, would be a) how much money will be in the eureka pocket, and b) what is your cost in time and effort to keep on looking.
I would also suggest, if you are keen to carry out this experiment, that you concentrate on yard sales rather than thrift shops, since at yard sales there is a better chance that the clothing has not been thoroughly inspected prior to your enterprise. Just one of many unknown variables.
I’m just following up on your example, but whatever real world scenario you envision would still be subject to a plethora of unknown variables.
I work in reinsurance - a business that spends much time and effort trying to establish the probability of remote events. I am, however, not an actuary, so do not take the statistical details here as gospel.
If you have literally **no idea** of the odds, I don’t think you can arrive at an answer as to how much negative experimental data you should expect before a success.
If you can estimate the return period of an event happening, then every failure tells you something about the confidence level X at which you can hold that your estimate is correct. Alternatively, if you set in advance the confidence level you will accept, it will tell you something about the odds of the event happening.
In your example, say you estimate that 1/1000 pockets searched will contain money (finding *something* would have better odds - restricting it to money reduces them); then you can estimate how many pockets you will have to search to have a 95% chance (for example) of finding some money.
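Taking that 1/1000 figure at face value and assuming independent searches, the required number of pockets follows directly from the geometric distribution. A minimal sketch:

```python
import math

def searches_needed(p, confidence):
    """Smallest n such that P(at least one success in n independent
    trials with per-trial success probability p) >= confidence."""
    # P(no success in n trials) = (1 - p)**n,
    # so solve (1 - p)**n <= 1 - confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

print(searches_needed(1 / 1000, 0.95))  # 2995 pockets
```

So even with the estimate dead-on, a 95% chance of one find takes nearly three times the 1000 searches you might naively expect.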
The problem with remote events is that they tend to have very long probability tails - the more remote the event, the harder it is to estimate the “number of pockets”. We have had a number of years without a Cat 3+ hurricane making landfall on the US mainland - assuming the odds have not changed (cue the global warming debate), how many more days/weeks/years should we expect before one does? Tricky stuff…
Depends a lot on how much money could reasonably be found. If you have good reason to believe that some donated coat or jacket contained $10,000 in a pocket, it might be worth thousands of search attempts.
If it’s just $10, though, it would make sense to give up after a few dozen tries.
I forgot to add - independent trials like these have no memory. Do not fall into the gambler’s fallacy!
Say the chance of finding money is 1/1000. Having searched 1000 pockets and found nothing does **not** increase your odds of finding money in the next pocket you search. Similarly, once you have found money in a pocket, the next pocket you search has exactly the same chance of containing money as the last one, or indeed as the first failure you had earlier.
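The memorylessness is easy to verify on paper: conditioning on a streak of failures leaves the per-trial probability unchanged, because the streak term cancels. A quick check, using the thread’s 1/1000 figure:

```python
import math

p = 1 / 1000
streak = (1 - p) ** 1000      # P(1000 straight failures), about 0.368
joint = streak * p            # P(1000 failures, then a success)
conditional = joint / streak  # P(success on next trial | 1000 failures)
print(math.isclose(conditional, p))  # True: the streak drops out entirely
```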
Okay - the way I would tackle it would be to try to get data:
1. From dry cleaners, charity stores, recycling units etc., the total aggregate amount of cash found in a year.
2. Then you would need some data on the number of money-finding incidents that occurred in that same year.
3. Now some data on the distribution of how much is found per incident - if $1 million is found per year but half of that comes from five large lucky finds, you need to know that. Similarly, if nothing over $100 is ever found, that is significant.
4. Then you might want to estimate the amount of money that was in the clothes but never found and passed out of circulation (beginning to see how difficult this is?).
5. Lastly, you need how much clothing passes through such outlets per year and how many pockets it contains on average.
You could then construct a probability/severity curve (severity being the size of each cash find), giving, for probabilities between nil and one, the number of pocket searches required to find x% of the total money the clothes contained. From that you can calculate how many searches you would have to make to have a Y% confidence of finding something.
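As a rough sketch of how those pieces would combine - every number below is a made-up placeholder, not real thrift-store data:

```python
import math

# Hypothetical placeholder figures for the data items listed above.
total_cash_found = 1_000_000    # aggregate cash found per year ($)
finding_incidents = 50_000      # number of find events in that year
garments_per_year = 10_000_000  # clothing items passing through outlets
pockets_per_garment = 4         # average pockets per item

total_pockets = garments_per_year * pockets_per_garment
p_find = finding_incidents / total_pockets           # per-pocket find odds
avg_severity = total_cash_found / finding_incidents  # average $ per find

# Searches needed for a 95% chance of at least one find:
n = math.ceil(math.log(0.05) / math.log(1 - p_find))
print(p_find, avg_severity, n)
```

A real version would replace the single average severity with the full find-size distribution, which is exactly why the lumpiness point in item 3 matters.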
I would think from a stats perspective, I could make a guess of the best possible odds after X failures in a row. It doesn’t mean that guess is correct, it just means it’s the best guess of what the best possible odds would be.
That is, if I have 1000 failures in a row, it’s highly unlikely that the odds for the system are 1/2. If this was a coin flip, it would be extremely unlikely to get 1000 heads in a row (although it would be possible). So then, what are the best odds for a scenario where 1000 failures in a row is not unusual? If the odds are 1/1000000, then 1000 failures in a row may be normal. What about 1/10000, 1/5000, 1/2000…? At some point, the odds of getting 1000 failures in a row goes from being normal to unusual. The odds at that point seem like the best ‘guess’ of the best odds from a stats standpoint.
If I have X failures in a row, what’s the lowest odds which would have that be an expected outcome?
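One way to make that precise is to pick a cutoff for “unusual” (say a 5% tail) and solve for the largest success probability at which X straight failures is still plausible. A sketch:

```python
import math

def max_odds(x_failures, tail=0.05):
    """Largest per-trial success probability p for which x straight
    failures is not yet 'unusual' at the given tail probability,
    i.e. the largest p with (1 - p)**x >= tail."""
    return 1 - tail ** (1 / x_failures)

print(max_odds(1000))  # about 0.003, i.e. roughly 1 in 334
```

Any odds better than about 1 in 334 would put 1000 straight failures into the unlucky 5% tail; anything worse makes the streak unremarkable.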
There are three components: you can ask what odds (a Z% chance of success) would, at a Y% confidence level, produce X trials that all fail.
A somewhat related conundrum does have a workable solution: finding a place to live. You first decide how much time you want to spend looking, i.e., some number n of houses you are willing to view. Then you look at the first n/e of them without committing. After that, you take the first one that’s better than the best one you’ve seen so far.
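This is the classic “secretary problem” stopping rule. A quick simulation sketch (candidate quality drawn uniformly at random is an assumption made here for illustration):

```python
import math
import random

def look_then_leap(scores):
    """The 1/e rule: skip the first n/e candidates, then take the
    first one better than everything seen so far."""
    n = len(scores)
    k = round(n / math.e)
    best_seen = max(scores[:k], default=float("-inf"))
    for s in scores[k:]:
        if s > best_seen:
            return s
    return scores[-1]  # forced to settle for the last candidate

random.seed(0)
trials = 10_000
wins = 0
for _ in range(trials):
    houses = [random.random() for _ in range(100)]
    if look_then_leap(houses) == max(houses):
        wins += 1
print(wins / trials)  # close to the theoretical ~0.37 success rate
```

The striking part is that the chance of landing the single best house stays near 1/e no matter how large n is.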
As for the pocket money thing: if after checking a thousand pockets you haven’t found anything, then obviously someone else checked the pockets first, so there’s no point continuing. But if you at least found a button or a gum wrapper or two, then there is a chance you’ll find something more valuable eventually. Whether 1000 tries tells you anything meaningful depends on how common you think finds are going to be. If you think 1/100 and find nothing after 1000 tries, you were probably wrong about the odds. If you think 1/10000 and find nothing after 1000 tries, you haven’t established anything yet. Also note that even if the odds are 1/1000 and you try 1000 pockets, the chance of finding something is still only about 63% - there’s a 37% chance you come up empty.
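That 63%/37% split falls straight out of the arithmetic for independent trials:

```python
p = 1 / 1000
n = 1000
p_nothing = (1 - p) ** n     # about 0.368, close to 1/e
p_something = 1 - p_nothing  # about 0.632
print(p_nothing, p_something)
```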
Another statistical factor to take into account: A disproportionate number of searches will be conducted by people who had previously found something, and therefore, a disproportionate number of finds will be made by people who had been previous finders. This would falsely lead us to believe that previous finders are “more likely” to find something than previous failures. The statistical probability of putting your hand in the pocket of a pair of thrift shop pants and finding change can be calculated from a relatively modest sample size, but once the trials move into the real world, an overwhelming majority of finds will be made by people who kept on looking.
In other words, no matter what the odds are, the successes will gravitate to those who persevered.
I believe that, given X failures, you can venture a statistical (probabilistic) estimate of the likelihood that your event will never happen. You can use this information to try again with a larger sample. Your statistical estimate may involve assumptions that don’t apply to your event, especially if you are quite in the dark about the event (that is, you have never observed it). (If you have no clue as to the likelihood of the event, you probably should not assume a normal distribution, for example.)
Still, this general sort of thing is what scientists do all the time. The history of the search for proton decay (or for magnetic monopoles) may be instructive. The search for Nessie, not so much.
Statistically speaking, there’s not much you can say about the probability of an event if you only have failures. At most you can say “I am Y% confident (typically 95%) that the chance of success is lower than X%”.
Now, the decision of whether to keep looking boils down to: “Is the chance of finding something, times the payoff if I do, greater than the cost of looking?” So if you give a value for the cost of looking and make an estimate of the payoff, then you could calculate what percentage of pockets would need to have money for the search to be worthwhile. Then, you could use statistics to tell you how many failures (without a success) would make you confident that the chance of success is too low to be worthwhile.
So you can do it, but you’d need to have an idea of how much money you’d find, and then decide how much it costs you to look.
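A sketch of that break-even calculation, with made-up numbers for the cost and payoff:

```python
import math

# Hypothetical figures: a search costs 10 cents of time, a find pays $20.
cost_per_search = 0.10
expected_payoff = 20.0

# Searching is worthwhile only while p * payoff > cost:
break_even_p = cost_per_search / expected_payoff  # 0.005, i.e. 1 in 200

# Number of straight failures after which we are 95% confident the
# true p is below break-even (zero-success upper confidence bound):
n = math.ceil(math.log(0.05) / math.log(1 - break_even_p))
print(break_even_p, n)  # give up after roughly 600 empty pockets
```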
This comes up regularly in scientific research. If the probability is low and the number of attempts is high, the system follows Poisson statistics. If you observe zero successes after a long string of N trials (doesn’t matter how many), then the 90% confidence level upper limit on the number of expected successes is 2.3. (For 95%, it’s 3.0).*
Thus, the 90% confidence level upper limit on the probability of success is 2.3/N. So, as N gets larger, you have a tighter and tighter limit on the probability.
In your money-in-the-pocket example, you actually know something else about the probability since you have one instance of success (even if not yours). But in the idealized example that I assume you were after, where you really know nothing about the chance of success, the above holds.
More generally, upper and lower limits on the expected number of successes after seeing any actual number of successes n (where above we took n=0) are tabulated in many places, as they are independent of the number of trials. To get the upper and lower limits on the probability, you divide the tabulated limits on the expected number of successes by the number of trials (as we did above: 2.3/N for the upper limit.)
Technical footnote: One can construct confidence intervals in various ways, potentially modifying these numbers. But the most common approach leads to the numbers here.
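For what it’s worth, the exact zero-success upper bound and the 2.3/N Poisson approximation agree closely even at modest N. A quick check:

```python
import math

def upper_limit_p(n_trials, cl=0.90):
    """Exact upper limit on the success probability after zero
    successes in n_trials: solve (1 - p)**n_trials = 1 - cl for p."""
    return 1 - (1 - cl) ** (1 / n_trials)

n = 1000
exact = upper_limit_p(n)  # about 0.0023
poisson_approx = 2.3 / n  # the -ln(0.10) = 2.30 rule described above
print(exact, poisson_approx)
```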
A few years ago an English charity was given a car to use as a prize to raise money. This charity normally used local fetes as a fundraising venue and looked for the best way to use the car.
What they came up with was this: At each event they sold tickets with a four figure number on them for £10. Some mathematically challenged organiser calculated that they would raise up to £10,000. The big mistake was that the winning number was advertised. Unfortunately, the car was won at the second event and raised only a couple of hundred.
I think your experience in reinsurance is biasing you, in that you’re looking at the problem backwards, in a way that’s more appropriate for reinsurance.
You’re treating it as a situation - like reinsurance - where the odds are known (within certain bounds) and now the question is what’s the likelihood of you finding success with X number of tries (or the confidence level for at least one success). But that’s not the question here.
The key question here is to what extent the failures to date should influence your estimate of the odds themselves.
You may be correct, if as mentioned above (quoted below) you know nothing about the chance of success.
I was using event-based methodology, as the specific example had a known successful event and is a universe for which statistics could, in theory, be unearthed.
The fact that the upper limit on the probability at a 90% confidence level approaches 2.3/N (N being the number of trials) is a useful learning point for me, though - Poisson distributions have a relatively “fat” tail, which is exactly what I anticipated the money-in-pockets example to show.
That seems somewhat non-intuitive to me, but then, so do most things in stats. The probability seems much higher than I would have expected. If all I know is that I’ve had 1000 failures, having a probability of success of 2.3/1000 seems a lot better than I would have expected.
It’s the upper limit of the probability of success. After 1000 failures, it’s 90% certain that the probability of success is at best 2.3/1000. It could be 2.3/1000 or 2/100000 or 1/1000000000.
Ah. That makes sense. It’s the best possible probability I could expect given 1000 failures in a row. If the true odds really were 2.3/1000, I’d just be in the unlucky case.
But what about when 1000 failures isn’t the outlying 5-10%? If I’m randomly checking pockets, chances are I’m in a typical scenario - not the most unlucky or lucky one, but a common one. So what is the best probability when 1000 failures in a row is typical? When 1000 failures is at the 50% mark, at the center of the bell curve instead of at the tails? (Note: I probably don’t know what I’m talking about here.)
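If “typical” is read as the median outcome - half of searchers with these odds would still be empty-handed after 1000 pockets - the arithmetic is straightforward:

```python
# Solve (1 - p)**1000 = 0.5 for p: the odds at which 1000 straight
# failures is exactly a coin-flip outcome rather than bad luck.
p_median = 1 - 0.5 ** (1 / 1000)
print(p_median, 1 / p_median)  # about 0.000693, i.e. roughly 1 in 1443
```

So under a "median" reading, 1000 fruitless searches point to odds a bit worse than 1/1000, rather than the 2.3/1000 upper bound from the tail reading.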