How many people needed to have a 50% chance of the same birthday?

I know that if you have a group of 30 people, there is a better than 50% chance that two of them will have the same birthday – month and day, that is.

How many people would have to be in your group to have a better than 50% chance that two of them were born on the same day – month, day and year?

I assume that the “30 people = likely 2 have same birthday” probability calculation is based on the idea that being born on any day other than February 29 is as likely as being born on any other day except February 29. But of course the chance that two people in your group were born in 1915 is way lower than, say, two people born in 1990. But honestly, I don’t know how much that complicates the probability determination.

My husband works for a company that has about 100 employees. Two are exactly the same age. Is the calculation of the probability of this occurrence simplified by the fact that no employee is under 16 years old, and few are above 65 years old?

(It occurred to me after I posted that I should have titled the post: How many people are needed to have a 50% chance of two with the same date of birth?)

IIRC, 23.

I did the math once; I’ll see if I can find it.

Yeah, 23 people has a 50.7% chance, ignoring leap days.

Thanks, but that didn’t answer my question.

How many people are needed for there to be a 50% chance that two of them have the same date of birth – month, day AND year?

I think you’d have to know the range of years in which they were born to determine that. Without that, you could assume a range of [current year - 100] to [current year], but that would almost certainly be too great a span.

It is definitely complicated by the distribution of ages in that range. Assuming that the employee ages are uniformly distributed between 16 and 65, then there are about 17897 possible days on which they could have been born, and if there are N employees, the probability that they all have different birth dates is:

(17897/17897)(17896/17897)(17895/17897) … (17897 - N + 1)/17897 = 17897!/((17897-N)!*17897^N)

And the probability that there exist two people with the same birth date is naturally 1 - that.

I show that crossing even odds at 158 employees.
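
For anyone who wants to reproduce that, here is a minimal Python sketch of the product above. It assumes, as the post does, that every date is equally likely; the function names are mine, not from any library.

```python
def collision_probability(n: int, days: int) -> float:
    """Chance that at least two of n people share a birth date,
    assuming each of `days` dates is equally likely."""
    all_distinct = 1.0
    for k in range(n):
        # Person k+1 must avoid the k dates already taken.
        all_distinct *= (days - k) / days
    return 1.0 - all_distinct


def people_for_even_odds(days: int) -> int:
    """Smallest group size whose collision probability reaches 50%."""
    n = 1
    while collision_probability(n, days) < 0.5:
        n += 1
    return n


if __name__ == "__main__":
    print(people_for_even_odds(365))    # classic month-and-day problem
    print(people_for_even_odds(17897))  # 49-year span of birth dates
```

Run as a script, this should print 23 and then 158, matching the figures quoted in the thread.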

Of course, for a real company, the distribution of ages will be skewed towards the middle (there will likely be a lot more 30-50 year olds than 16 year olds or 65 year olds, depending of course on the type of company), but that will just make collisions more likely and the even odds point at a lower employee count.

Given an age span of 50 years (assume 12 leap years), there are 50 x 365 + 12 (leap years) possible birthdays = 18,262. The probability that two people out of 100 are born on the same day is

1.0758336%

(One explanation here)

Using an iterative method I get 5,351 people to get to 50%. (I do not know a formulaic method.)

Edit: That is to get at least 2 people who share a birthday, not exactly 2 people.

I believe I was off by 1 in the age span. If I use leahcim’s number of 17,897 (49 years instead of 50) then I get 1.0977141% for 100 people, and 5,244 needed to get to 50%.

I do not know why my number is so much different than leahcim’s. I used the same formula shown in that post, implemented in Excel. For 158 people I get 1.7412499%.

Assuming the age range is inclusive, then 16-65 is a span of 50 years, so you were right on that.

But your number of people to get to 50% is way off. Without seeing your formula, I can’t say what’s wrong with it, but leahcim is correct (except 160 people assuming a 50-year span).
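
One way to check the disputed numbers independently is to evaluate leahcim's closed form directly. The factorials are far too large to compute as-is, so this sketch works in log space with `math.lgamma`; it still assumes uniformly distributed birth dates, and the function name is just illustrative.

```python
import math


def collision_probability_closed_form(n: int, days: int) -> float:
    """1 - days! / ((days - n)! * days**n), evaluated in log space."""
    if n > days:
        return 1.0  # pigeonhole: more people than possible dates
    log_all_distinct = (
        math.lgamma(days + 1)        # log(days!)
        - math.lgamma(days - n + 1)  # log((days - n)!)
        - n * math.log(days)         # log(days**n)
    )
    return -math.expm1(log_all_distinct)  # 1 - exp(x), accurate near 0


if __name__ == "__main__":
    days = 50 * 365 + 12  # 50-year span with 12 leap days = 18,262
    print(collision_probability_closed_form(100, days))
    print(collision_probability_closed_form(159, days))
    print(collision_probability_closed_form(160, days))
```

Under these assumptions the 100-person figure should come out at roughly 24%, and the probability should cross 50% between 159 and 160 people, which agrees with the 160 quoted for a 50-year span.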

True enough.

I put up a Google sheet showing my calc at https://drive.google.com/open?id=1bQtluQT8IFm0hcIYmV8GmYLdBKgmB_RnCUO_5NiSMMI

Note that all of these calculations assume that the odds of birth are the same for all 365/366 days of the year.

But, in fact, that is not accurate. There are specific days that are more & less likely as birthdays than the average. (And I would presume that the days vary significantly across cultures.)

So all this is true only on a generalized, average level.

Any departure from uniformity makes a collision more likely, so at the very least the uniform calculation overestimates the number of people needed. I think most people are shocked by how low the number is, not by how high it is, so any correction for non-uniformity only makes the reality of the situation even more amazing.
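
The claim that non-uniformity can only raise the collision chance can be checked exactly for small cases. The sketch below computes the all-distinct probability for arbitrary day weights using elementary symmetric polynomials. The skewed weights are a made-up illustration, not real birth data, and the function name is hypothetical.

```python
from typing import Sequence


def all_distinct_probability(n: int, weights: Sequence[float]) -> float:
    """Exact chance that n people all have different birthdays when day i
    is drawn with probability weights[i] / sum(weights)."""
    total = sum(weights)
    # f[k] = k! * e_k(p), where e_k is the k-th elementary symmetric
    # polynomial of the day probabilities processed so far.
    f = [1.0] + [0.0] * n
    for w in weights:
        p = w / total
        for k in range(n, 0, -1):  # descending, so each day is used once
            f[k] += k * p * f[k - 1]
    return f[n]


uniform = [1.0] * 365
# Hypothetical skew, NOT real data: the first half of the year is
# 20% more popular than the second half.
skewed = [1.2] * 183 + [1.0] * 182

if __name__ == "__main__":
    print(1 - all_distinct_probability(23, uniform))
    print(1 - all_distinct_probability(23, skewed))
```

With 23 people the skewed case should give a slightly higher collision probability than the uniform one, which is the direction the post describes.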

Funnily enough, I went to look at the distribution of birthdays to see how it differed from a uniform distribution, and found a web page that was set up because of a discussion on alt.fan.cecil-adams.
http://www.panix.com/~murphy/bday.html

Apropos of nothing, I share a birthday with twickster. That’s day, month, and year. Also in our group are two celebrities of sorts: Hermann Tilke, a German race car driver and engineer, and Alexander Salmond, Scottish politician.

If there are x instances of y equally likely birthdays then the chance that no two have the same birthday is approximately
(1 - x/(y+y))^(x-1)
(This is NOT the correct formula — the correct formula was already given by leahcim — but a formula more accessible on most calculators which will give a close approximation.)

For OP’s problem there are 50 years worth of possible birthdays, but instead of y = 50 × 365, I’d use 40 × 365 to fudge for a non-uniform distribution.
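
To see how close the calculator-friendly approximation gets, here is a small sketch comparing it with the exact product, both for the classic 365-day case and for the fudged y = 40 × 365 = 14,600. The function names are mine, and uniform dates are assumed throughout.

```python
def approx_all_distinct(x: int, y: int) -> float:
    """The approximation (1 - x/(y+y))^(x-1) quoted above."""
    return (1 - x / (2 * y)) ** (x - 1)


def exact_all_distinct(x: int, y: int) -> float:
    """Exact product for x people and y equally likely birth dates."""
    prob = 1.0
    for k in range(x):
        prob *= (y - k) / y
    return prob


def smallest_group(all_distinct, y: int) -> int:
    """Smallest x for which the chance of no shared date is at most 50%."""
    x = 1
    while all_distinct(x, y) > 0.5:
        x += 1
    return x


if __name__ == "__main__":
    print(approx_all_distinct(23, 365), exact_all_distinct(23, 365))
    print(smallest_group(approx_all_distinct, 40 * 365))
    print(smallest_group(exact_all_distinct, 40 * 365))
```

For 23 people and 365 days the two values should agree to within a fraction of a percentage point, and with the fudged 14,600 days both versions should put the even-odds point at 143 people.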

We needed to schedule a Caesarean and were “stuck” with a Thursday — the surgeon’s Friday deliveries had been reserved months in advance!

I thought it was 26 randomly-selected people gives you a 50% chance of any two of them sharing a birthday. That is from memory from my statistics class, I’m sure a search will quickly find the magic number.

Note that you would need 367 people to guarantee having two sharing the same birthday.

If 365 possible birthdays requires 23 people to have a > 50% chance of a double up, that means you need 6.3% of the total set (6.3% of 365 is 23) to get a double up.

If the “6.3% logic” holds (and I’m not sure if it does), then, from a set of 17,897 birthdays, you’d need somewhere near 1128 people to get to a 50% chance of a double up.

I can vouch for the “across cultures” part; our local hospital had dimensioned ObGyn shifts by looking at peak dates for Spain (peaks in March and September, matching “summer vacation” and “Christmas”) and the docs got an article out of showing that we had more of a several-months plateau (March through July, matching a more-varied summer calendar). There are also years which see more births than others: Spanish natality for 2015 was 36% lower than for our 1968 peak (I found information for the US but it included a lot of years with estimated data).

KellyCriterion writes:

> If the “6.3% logic” holds (and I’m not sure if it does), then, from a set of 17,897
> birthdays, you’d need somewhere near 1128 people to get to a 50% chance of a
> double up.

No, that’s not how it works. Here’s a website that calculates probabilities like the birthday problem. I put .5 in for p and 17,897 in for n (in the lower of the two calculation locations on this website) and got 158:

https://lazycackle.com/Probability_of_repeated_event_online_calculator__birthday_problem_.html

An approximate way to calculate the answer is to take the square root of the number of possible values. The square root of 17,897 is a little less than 134. The square root of 365 is a little more than 19. The true answer is vaguely something like 20% more than the square root of the number of possible values.
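
The "something like 20% more" factor can be pinned down a bit. A standard approximation says the chance of no match among n people and d equally likely dates is about exp(-n²/(2d)); setting that to 1/2 gives n ≈ sqrt(2 · ln 2 · d), or about 1.18 times the square root of d. A minimal sketch, with a function name of my own choosing:

```python
import math


def sqrt_rule_estimate(days: int) -> float:
    """Approximate group size for even odds: sqrt(2 * ln 2 * days)."""
    return math.sqrt(2 * math.log(2) * days)


if __name__ == "__main__":
    for days in (365, 17897):
        # possible dates, plain square root, refined estimate
        print(days, round(math.sqrt(days), 1), round(sqrt_rule_estimate(days), 1))
```

Rounding the estimates up gives 23 and 158, the same answers as the exact calculation. Because the required group grows with the square root of the number of dates rather than in proportion to it, scaling a fixed percentage like 6.3% up to a larger set overshoots badly.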

If you’re doing a TV sitcom, apparently all you need are 5 main cast members to get a match.