A few years ago, while I was working at Ticketmaster, I got a call from a house in Indianapolis. There was nothing special about this house at all - except for the fact that I lived there until I was 4 years old. Even though I wasn’t strictly allowed to say anything about this, I mentioned it to the customer. She was also floored.
Now, I’m trying to estimate the probability of this happening.
I’ve accounted for: number of houses where I lived in the market I served, number of households in the market I served (used U.S. census), and the number of orders I processed while I was working at Ticketmaster (estimated number of hours worked * about 15 per hour).
But there’s one more factor: the percentage of households that used Ticketmaster at least once while I was on duty. Using the pulled-out-of-ass method, I came up with 20%.
The issue isn’t whether 20% is too high. It’s what to do with that 20%, or whatever the figure actually is. I’ve worked the figure in, so far, by multiplying the number of households by 0.2.
But I haven’t accounted for the probability that at least one of the two houses where I lived was among the Ticketmaster-using households at all. Obviously, if neither of them was using Ticketmaster then, the probability would have been zero. So it seems that I have to use that 20% again… somehow.
Without accounting for this, I get a probability of 0.007. Do I need to multiply that figure by 0.36 (the probability that at least one of those houses used Ticketmaster while I worked there)? Or is it (as I would guess) more complicated than that?
The chances of it happening to you are rather remote.
But consider the vast number of people making calls to marketers and other services, and the chances of it happening to someone not only approach one, but approach one so closely that, in fact, it likely happens to someone every day.
As the joke has it, “one in a million events happen 25 times a day in New York.”
Trinopus
Thanks - I do mean the probability of the event happening to me.
The 20% only needs to be accounted for once. There’s a 1 in 5 chance of any given household in the market using Ticketmaster, using your parameters. That means there’s roughly a 2 in 5 chance of one of your two former homes using the service (strictly, 1 - 0.8^2 = 0.36 for at least one of them, if the two are independent).
Allowing your assumptions, I would say that your formula would look something like:
(houses lived in) * 1/(households in market) * (1 household using Ticketmaster)/(5 households) * (orders processed)
If you want something that will give you a better grasp of the odds on it actually happening, leave the “orders processed” parameter out. That will give you the probability of it happening with any given order.
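The formula above can be sketched in Python. This is a direct transcription of the formula as written, not a check that it is the right model. The function name is made up for illustration, the market size and order count are the figures quoted later in this thread, and the 20% is the original poster's guess. Note that multiplying by the number of orders gives an expected count, which only approximates a probability when the result is small.

```python
def expected_matches(houses_lived_in, households_in_market, usage_rate, orders=1):
    """Linear estimate: expected number of orders that come from a former home.

    Hypothetical helper; mirrors
    (houses lived in) * 1/(households in market) * (usage rate) * (orders processed).
    """
    per_order = houses_lived_in * (1 / households_in_market) * usage_rate
    return per_order * orders


# Figures assumed from elsewhere in the thread: 2 houses, 11,569,862 households,
# a guessed 20% usage rate, and roughly 8400 orders.
per_order = expected_matches(2, 11_569_862, 0.2)
overall = expected_matches(2, 11_569_862, 0.2, orders=8400)

print(f"per order:       {per_order:.3g}")
print(f"over all orders: {overall:.3g}")
```

Leaving `orders` at its default of 1 gives the per-order figure described above.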
I think I would want to know more about how you’re using the rest of the data before I even attempted to answer your question. For example, how are you using the info about the number of orders you’ve processed?
In the way I would model this problem, I wouldn’t use the “percentage of houses that used Ticketmaster” data at all. Instead, just let N be the number of houses in the market that you served, n the number of houses in that market that you also lived in, and c the number of calls that you’ve fielded. Then the probability that you field at least one call from a house where you’ve lived is 1-(1-n/N)^c.
That assumes, of course, that each house in the market is equally likely to be responsible for any one call. If you assume that only 20% are ever going to use Ticketmaster, then things become more complicated…maybe. The thing is, if it is also true that only 20% of the houses that you’ve lived in are ever going to use Ticketmaster, then you just divide both n and N by 5 in the formula above…which doesn’t change the answer.
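A minimal sketch of that formula, 1-(1-n/N)^c, with a check that dividing both n and N by 5 leaves the result unchanged. The function name and the small numbers are hypothetical, chosen only to make the effect easy to see; they are not the thread's actual figures.

```python
def p_at_least_one(n, N, c):
    """Probability that at least one of c calls comes from one of n houses
    out of N, assuming every call is equally likely to come from any house."""
    return 1 - (1 - n / N) ** c


# Hypothetical illustration values (not from the thread).
full = p_at_least_one(n=10, N=1000, c=50)

# Same setup, but only 20% of all houses (and of "your" houses) ever call.
scaled = p_at_least_one(n=10 / 5, N=1000 / 5, c=50)

print(full, scaled)
```

Only the ratio n/N enters the formula, so scaling both by the same factor cancels out.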
Nevertheless, there are probably multiple ways to model this question depending on the assumptions you make and the particular probability you’re calculating. How exactly are you performing the rest of the calculation?
(And you do realize that calculating the probability of an event that has already happened won’t tell you a heck of a lot, right?)
Thanks - your point is well taken about how I should consider how much some houses used Ticketmaster, and not just the number of houses that did. If I’m gonna think about that at all, I should do it right.
Anyhow, my methods are almost exactly the same as Orbifold’s:
1-((N - n)/(N * 0.2))^c
N is 2,313,972, n is 2, and c is estimated to be 186,000 miles per… oh wait, I mean it’s close to 8400.
Sure… it’s partly a “just how goofy was that?” exercise and mostly a workout for flabby math muscles.
Small correction - I used
1-((N * 0.2 - n)/(N * 0.2))^c
And N is the total size of the market, 11,569,862.
So with that calculation, you’re assuming that all n of the houses you lived in are in the “uses Ticketmaster” group. In other words, all of the houses you lived in now fall in that 20% (or whatever the percentage is). If you’re OK with that assumption then your calculation would seem to be correct.
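To see how much that assumption matters, here is a rough Python sketch using the figures quoted in this thread (N = 11,569,862, n = 2, c of about 8400, and the guessed 20%). The variable and function names are made up for illustration. The second calculation is one possible alternative model, not something anyone above proposed in this form: it assumes each former house independently has a 20% chance of being a Ticketmaster household, and that calls are spread evenly over the Ticketmaster households.

```python
from math import comb

N_TOTAL = 11_569_862  # households in the market (from the thread)
RATE = 0.2            # guessed share of households using Ticketmaster
n = 2                 # houses lived in
c = 8400              # estimated orders processed

users = N_TOTAL * RATE  # size of the "uses Ticketmaster" group


def p_hit(k, pool, calls):
    """Chance that at least one of `calls` comes from one of k houses in `pool`."""
    return 1 - (1 - k / pool) ** calls


# Assumption A: both former houses are in the Ticketmaster group.
p_all_in = p_hit(n, users, c)

# Assumption B (hypothetical): each former house is in the group with
# probability RATE, independently; average over 0, 1 or 2 houses being in it.
p_mixed = sum(
    comb(n, k) * RATE**k * (1 - RATE) ** (n - k) * p_hit(k, users, c)
    for k in range(n + 1)
)

# For comparison: ignore the usage rate entirely, as in 1-(1-n/N)^c.
p_plain = p_hit(n, N_TOTAL, c)

print(f"A, all houses in the group: {p_all_in:.4f}")
print(f"B, each house 20% likely:   {p_mixed:.4f}")
print(f"usage rate ignored:         {p_plain:.4f}")
```

Under these assumptions, A reproduces the figure of about 0.007, while B lands very close to the calculation that ignores the usage rate altogether. That is roughly a fifth of A, rather than 0.36 of it.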