Prob/Stat gurus - methodological problem?

There is a reason why I’m posing this question, but the situation is quite complicated, so I’ll leave my motivation for a later time. For now, here’s the scenario:

Each of 100 people is randomly assigned a number from 1 to 100, and this drawing determines the order in which they choose their apartments (#1 picks first, then #2, and so on). The apartments range from excellent condition to extremely poor condition, so those with lower numbers will most likely get the best apartments. After all 100 people have been paired with a number, however, approximately 20 of them are crossed off the list because they never needed an apartment in the first place. Since these people do not choose an apartment, they are simply skipped in the selection process – if a non-apartment-needing person holds #5, for example, he is passed over, and the person with #6 is actually the fifth to choose an apartment (assuming that no other non-needers had lower numbers).

Now, it seems rather absurd to me that people who never needed an apartment would be paired with numbers anyway; I would think that the non-apartment-needers should be excluded from the original random drawing (so that each of 80 people would be randomly assigned a number from 1 to 80 instead). My questions, though: even though it seems silly, is there really a methodological problem in the first scenario? Does a person’s chance of getting any one number increase if the non-needers are excluded from the original process, even though they are still skipped over in the first scenario? Does the violation of independence render the random assignment process fundamentally flawed?

Any explanations/conclusions would be MUCH appreciated. There is a method to my madness!


I don’t think there is a statistical problem, because if 20 people are eliminated after the first drawing, there’s no real effect on the outcome. I.e., if you drew 6 and a non-apartment-needer drew 5 and was subsequently eliminated from the running, then, effectively, you become #5, and 20 of the 100 numbers simply go unused. The probability remains 1/80 of ending up in any particular effective position. (You can’t just look at the experiment’s design when compiling your statistics; you have to look at the actual execution.)
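A quick simulation backs this up. This is just a sketch of the two procedures as I understand them; the choice of person 0 as our tracked needer and persons 80–99 as the non-needers is mine:

```python
import random
from collections import Counter

TRIALS = 100_000
N_ALL, N_NEEDERS = 100, 80   # 20 non-needers, taken here to be persons 80-99

def method_a():
    """Draw 1-100 for all 100 people, then skip the non-needers."""
    order = list(range(N_ALL))
    random.shuffle(order)                      # order[i] holds pick number i+1
    effective = [p for p in order if p < N_NEEDERS]
    return effective.index(0) + 1              # person 0's effective pick number

def method_b():
    """Draw 1-80 among the 80 needers only."""
    order = list(range(N_NEEDERS))
    random.shuffle(order)
    return order.index(0) + 1

for method in (method_a, method_b):
    counts = Counter(method() for _ in range(TRIALS))
    # Each effective position 1-80 should appear roughly TRIALS/80 times
    print(method.__name__, min(counts.values()), max(counts.values()))
```

Both methods spread person 0’s effective pick uniformly over positions 1–80, so the dice don’t care which procedure you use.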

I’m curious to know why the first method would be used though.

What I mean is that your probability calculations should be based on 1/80, not 1/100.

Assigning 80 people a different random number from 1 to 80 is the same as

assigning 80 people a different random number from 1 to 100, which is the same as

assigning 100 people a different random number from 1 to 100 and discarding a predetermined 20 people.
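You can verify that equivalence exactly by brute force on a toy version – here 5 people with person 4 as the lone non-needer (the small numbers are my own, chosen so every permutation can be enumerated):

```python
from itertools import permutations
from collections import Counter

# Method A: permute all 5 people, then skip the non-needer (person 4).
rank_a = Counter()
for perm in permutations(range(5)):
    effective = [p for p in perm if p != 4]
    rank_a[effective.index(0) + 1] += 1        # person 0's effective rank

# Method B: permute only the 4 needers.
rank_b = Counter()
for perm in permutations(range(4)):
    rank_b[perm.index(0) + 1] += 1

# Person 0 lands in each effective position 1-4 with probability exactly
# 1/4 under both methods.
print({k: v / 120 for k, v in sorted(rank_a.items())})  # {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
print({k: v / 24 for k, v in sorted(rank_b.items())})   # {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
```

The key is that the discarded person is predetermined – his identity doesn’t depend on what anybody drew.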

When I was in college, there was a scenario like this except that, for example, 30 of the people would be in the drawing only in case they got a pick in the top 1/3. Discarding 20 people selected *because* they drew poor picks is a different probability distribution. That’s not the scenario as stated, but I wouldn’t be surprised if it happened in a real-world situation.
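To see why that’s different, here’s a sketch contrasting two discard rules (my own contrast, simplified from the college example above): dropping a predetermined 20 people versus dropping whichever 20 drew the worst numbers. Person 0 is a needer in both cases:

```python
import random

TRIALS = 50_000

# Predetermined rule: persons 80-99 are dropped no matter what they draw.
# Draw-dependent rule: whoever drew numbers 81-100 is dropped.
skipped_predetermined = 0
skipped_by_draw = 0
for _ in range(TRIALS):
    order = list(range(100))
    random.shuffle(order)                  # order[i] holds pick number i+1
    if 0 not in [p for p in order if p < 80]:
        skipped_predetermined += 1         # never: person 0 is always kept
    if 0 not in order[:80]:
        skipped_by_draw += 1               # whenever person 0 drew 81-100

print(skipped_predetermined / TRIALS)      # 0.0
print(skipped_by_draw / TRIALS)            # ~0.20
```

Under the predetermined rule person 0 is guaranteed a pick; under the draw-dependent rule he loses it about one time in five. Same drawing, genuinely different distributions – which is why the discard criterion matters.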