Sorry, I think this comment should have been in response to your ultimate argument, rather than the particular paragraph I gave it in response to.
For a properly defined continuous probability distribution, 3) is true (probability is 1/3).
For a discrete probability distribution (or any distribution with atoms, i.e., particular values that are selected with non-zero probability), neither 1), 2), nor 3) is correct in general.
For an ill-defined distribution (like the one proposed here) you can’t expect a well defined answer.
Since the arguments for 0 and 1/3 are equally convincing, the only valid conclusion is that the problem is ill-posed and meaningless. That’s my intuition too. It makes no sense to talk of a uniform probability distribution over a space of infinite measure. I am not a probabilist, but that much seems clear.
On the other hand, the question is meaningful over any finite interval, and by symmetry (exactly one of the three numbers is between the other two, and each is equally likely to be the middle one) the answer has to be 1/3. But there can be no warrant to go from there to the infinite case. Things are different at infinity!
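For concreteness, here is a quick simulation sketch of the finite-interval case (my own illustration; the seed and the values of L are arbitrary):

```python
import random

# Draw A, B, C uniformly from [-L, L] and count how often C lands between the
# other two. By the symmetry argument, the answer should be 1/3 for every L.
random.seed(42)
for L in [1, 100, 10_000]:
    n, between = 200_000, 0
    for _ in range(n):
        a, b, c = (random.uniform(-L, L) for _ in range(3))
        between += min(a, b) < c < max(a, b)
    print(f"L = {L:>6}: P(C between A and B) ~ {between / n:.4f}")
```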
I am reminded of a claim by Alan Guth in his popular book in the early 80s on inflation at the big bang. The book sounded good, but he arrogantly included an appendix on Newton’s mistake. Newton’s mistake, according to Guth, was in arguing that an infinite universe would be stable because it has no center to collapse to. Well, you had best be careful if you believe you can outthink Newton, and Guth wasn’t careful. He argued that because any finite universe would collapse under its own gravity (in a time that is independent of its size, depending only on its density), so would an infinite universe. This is just plain silly. Infinity is really, really different, and Newton’s argument is irrefutable.
The argument for 1) in the OP is:
For any two finite numbers A and B, the chance that they are as close together as |A-B| is zero. So this argument only applies to a set of measure 0. Because of this, I don’t see it as a valid argument that the answer is 0. An argument that only applies to a measure-zero subset of the space we’re interested in can’t be taken to apply to the whole. So I only find the OP’s argument for 3) to be convincing.
I like Ultrafilter’s approach. In his division, the six possibilities don’t just have the same volume, they all have the same shape. The six pieces are congruent wedges, which come together at the X=Y=Z line. Looking down along that line, the six wedges each span 60 degrees.
Indistinguishable uses shearing transformations of Ultrafilter’s pieces to produce arguments for the other two cases, on the grounds that the transformations are volume preserving. The volumes are all infinite, though, so it’s not obvious (to me, anyway) that volume preservation is sufficient to carry the probabilities along.
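As a side note on that point, here is a minimal numerical sketch (the particular shear matrix is my own illustrative choice, not necessarily one used in the thread): a shear has determinant 1, so it preserves volume, yet it reshapes infinite-volume wedges freely.

```python
import numpy as np

# A 3D shear: det = 1 means every finite volume is preserved, but when the
# pieces themselves have infinite volume, that alone needn't pin down how
# "probability" should be divided among them.
S = np.array([[1.0, 1.0, 0.0],   # x' = x + y
              [0.0, 1.0, 0.0],   # y' = y
              [0.0, 0.0, 1.0]])  # z' = z
print("det(S) =", np.linalg.det(S))  # 1.0: volume preserving
# S carries the wedge {x < y < z} to {x' - y' < y' < z'}, an infinite-volume
# region of a different shape, without changing any finite volume along the way.
```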
For any two finite numbers A and B, the chance that |A - B| is finite is one.
This doesn’t contradict what I wrote.
But argument 1 relies on what I wrote, not on what you wrote.
ZenBeam, the OP’s argument for 1) is like this:
Step 1: P(C is between A and B) = the sum* over all a and b of P(C is between A and B | A = a and B = b) * P(A = a and B = b)
Step 2: For all a and b, P(C is between A and B | A = a and B = b) = 0 [as (a, b) is a finite interval within the infinite line of possible C values]
Step 3: Therefore, each term on the right-hand side in step 1 is zero, and thus the sum itself is zero, establishing that P(C is between A and B) = 0.
So there is ultimately no restriction to A and B being a particular distance from each other, even though step 2 happens to involve conditioning on particular values of A and B.
[*: Well, the sum may need to be reinterpreted as some kind of integral of probability densities, but this is the basic idea of the argument nonetheless]
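To make step 2 concrete, a quick sketch (my illustration, using the uniform-on-[-L, L] truncation that comes up later in the thread): for any fixed interval (a, b), the probability that C lands in it shrinks to 0 as L grows.

```python
# For C uniform on [-L, L] with [a, b] inside, P(a < C < b) = (b - a)/(2L),
# which goes to 0 as L -> infinity; this is the finite-L shadow of step 2.
a, b = -5.0, 12.0  # arbitrary fixed endpoints, chosen for illustration
for L in [1e2, 1e4, 1e6, 1e8]:
    p = (b - a) / (2 * L)
    print(f"L = {L:.0e}:  P(a < C < b) = {p:.2e}")
```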
Let me put it this way.
The probability density function for a variable distributed uniformly over [a,b] is defined: pdf(x) = 1/(b-a) for a<=x<=b and pdf(x) = 0 otherwise. Note that the integral of pdf(x) taken from negative infinity to infinity is 1.
Now if we let b go to infinity or a to negative infinity (or both), we see that pdf(x) = 0 over the entire real line. But then the integral of pdf(x) taken from negative infinity to infinity is 0 rather than 1. Thus pdf(x) in this case fails to be a probability density function.
This means that we can’t have a uniformly distributed variable over the entire real line, since its probability density function cannot exist.
Thus any statement of the form, “If A, B, C are uniformly distributed reals, then the probability that A < B < C is X,” is vacuously true (If P then Q is vacuously true when P is false). So it’s no wonder that convincing arguments exist for different values of X.
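A minimal numerical sketch of that collapse (my illustration; the values of L are arbitrary):

```python
# The density of Uniform[-L, L] is 1/(2L) on [-L, L]: its integral is always 1,
# but its height vanishes as L grows, so the pointwise limit is the zero
# function, which integrates to 0 and therefore cannot be a pdf.
for L in [1, 10, 1_000, 1_000_000]:
    height = 1 / (2 * L)         # pdf value on [-L, L]
    integral = height * (2 * L)  # exact area under the pdf
    print(f"L = {L:>9}: pdf height = {height:.1e}, integral = {integral}")
```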
Right, in that particular standardized sense of what a uniform distribution is, you cannot have one on the infinite line, which has been noted from the beginning. But I think it’s interesting to consider what alternative sense one might make of the notion of uniform distribution regardless (for there is nonetheless an intuition here which can be formalized in alternative ways).
To reiterate the example I gave before, you cannot have a uniform distribution on the natural numbers; each natural number would have to have infinitesimal probability, the only infinitesimal real number is 0, and countable additivity tells us we cannot have every individual number have probability 0 as their probabilities need to sum to 1.
Yet, nonetheless, there are various natural and useful alternative senses in which one might and indeed people do speak of “probabilities” for “uniformly random” natural numbers; e.g., very often this use is made of the asymptotic density. So that one might say a random natural number has probability 50% of being even, probability 0% of being a square, probability 6/π^2 of being square-free, an undefined probability of containing an even number of digits, etc.
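A rough numerical look at those densities (my sketch; the cutoffs N are arbitrary):

```python
from math import isqrt

# Asymptotic density in action: the fraction of {1, ..., N} in each set.
def is_squarefree(k):
    # squarefree = not divisible by any perfect square > 1 (trial division)
    d = 2
    while d * d <= k:
        if k % (d * d) == 0:
            return False
        d += 1
    return True

for N in [10**3, 10**5]:
    even = sum(1 for k in range(1, N + 1) if k % 2 == 0) / N          # -> 1/2
    square = sum(1 for k in range(1, N + 1) if isqrt(k)**2 == k) / N  # -> 0
    sqfree = sum(1 for k in range(1, N + 1) if is_squarefree(k)) / N  # -> 6/pi^2
    print(f"N = {N}: even {even:.4f}, square {square:.4f}, squarefree {sqfree:.4f}")
# The fraction with an even number of digits, by contrast, oscillates forever
# as N grows, which is why that "probability" comes out undefined.
```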
We all recognize that this is not a probability distribution in the standard, Kolmogorov sense, but it is still an interesting and useful notion which maps onto the pre-formal, ordinary language notions of probability, uniform randomness, etc.
Wanting to use the language of probability for these alternative notions is very natural and defensible; it’s not just a blunder made by the mathematically naive. Statements like “The probability that two random positive integers are coprime is 6/π^2” are readily found made by even professional mathematicians, though of course they will be quick to explain what they mean by it.
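And a quick check of that coprimality figure (my sketch; sampling uniformly from {1, ..., N} for a large, arbitrary N stands in for the asymptotic density):

```python
import random
from math import gcd, pi

# Fraction of random pairs that are coprime, versus 6/pi^2 ~ 0.6079.
random.seed(3)
N, trials = 10**9, 200_000
coprime = sum(1 for _ in range(trials)
              if gcd(random.randint(1, N), random.randint(1, N)) == 1)
print("fraction coprime ~", coprime / trials, " vs 6/pi^2 ~", 6 / pi**2)
```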
So, yes, if we toss together all the intuitions we might want for what the properties of a uniform distribution should be, we’ll find they contradict each other; they can’t all coexist. The OP’s own arguments demonstrate this. But I don’t think this means we shouldn’t examine the arguments any further or think about how we might make formal sense of them in contexts validating some and rejecting other intuitions. That’s what mathematics ought to be: exploring (and comparing and contrasting) logical possibilities, not ritualistic adherence to nominally “one-size-fits-all” standards set by caprices of history. Better to engage the arguments of the OP on their own terms than to smash them into a Procrustean bed where they don’t belong, and then, observing the damage, discard them as broken.
I disagree. Let’s modify Indistinguishable’s arguments in the post following yours:
Step 1z: P(C is any value) = the sum* over all a and b of P(C is between A and B | A = a and B = b) * P(A = a and B = b)
Step 2z: For all a and b, P(A = a and B = b) = 0 [true even if you look at the probability of A and B being in the range (a, b)]
Step 3z: Therefore, each term on the right-hand side in step 1z is zero, and thus the sum itself is zero, establishing that P(C is any value) = 0.
This is the basic problem with the argument for case 1, and it has nothing to do with what the value of C actually is.
To be clear, I’m not arguing that 3 is correct. I’m arguing (in the part of my post you quoted) that the case 1 argument is not valid, and (in the other part) that if any value can be assigned, the strongest case can be made for 3.
I don’t disagree with your or Indistinguishable’s last posts.
ZenBeam, I’m not really sure what you’re trying to say here. Can you point out a specific logical inconsistency in Indistinguishable’s post 28? You seem to be calculating a different probability altogether.
The point of my previous post is that arguments 1, 2, and 3 can be made equally strong owing to a false hypothesis. Saying that any one of them is the strongest is simply incorrect.
In 2), the obvious statement that the probability of A>C equals 1/2 rests on the same reasoning one would use to make the obvious statement that the probability of C being between A and B is 1/3. If you pick 3 numbers from the same continuous distribution (so the probability of any of them being equal is zero), then each one has the same chance of being lowest, middle, or highest: 1/3.
Anyway, 1) is wrong because the expected value of |A-B| is infinite, not some finite number.
2) is wrong because the events are not independent. Given that A<C, it is less likely than 1/2 that B>C, because C is then the larger of two numbers taken from the distribution, not a single number taken from the distribution.
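A quick simulation sketch of that dependence (my illustration; for i.i.d. continuous draws the exact conditional value is 1/3):

```python
import random

# Unconditionally P(B > C) = 1/2, but conditioning on A < C drags it down,
# since A < C makes C look like the larger of two draws rather than one draw.
random.seed(0)
cond = hits = 0
for _ in range(1_000_000):
    a, b, c = (random.uniform(-1, 1) for _ in range(3))
    if a < c:
        cond += 1
        hits += b > c
print("P(B > C | A < C) ~", hits / cond)  # ~ 0.333, not 0.5
```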
The way I would define the notion of sampling from a uniform infinite distribution is the same way infinity is usually dealt with: take the limit as something goes to infinity, don’t treat infinity like a particular number. Make the distribution a uniform distribution from -L to L (probability density function = 1/(2L)), do the calculation, then take the limit as L goes to infinity.
Using the approach of 1), the expected value of |A-B| ends up being 2L/3. Dividing by 2L gives the probability of C falling within that range: 1/3. The limit of 1/3 as L goes to infinity is 1/3.
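A Monte Carlo check of that computation (my sketch; L and the seed are arbitrary):

```python
import random

# For A, B uniform on [-L, L], E|A - B| should be 2L/3; dividing by the
# interval length 2L then gives P(C between A and B) = 1/3 for every L.
random.seed(1)
L, n = 50.0, 500_000
gap = sum(abs(random.uniform(-L, L) - random.uniform(-L, L))
          for _ in range(n)) / n
print("E|A-B| ~", gap, "  (2L/3 =", 2 * L / 3, ")")
print("P(C between A and B) ~", gap / (2 * L))  # ~ 1/3
```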
You could also think of picking 3 random points on a line in the range -L to +L. The probability of C being the middle one is 1/3. No matter how far you stretch the line (increase L), the order of those points stays the same, so the probability of C being in the middle remains 1/3.
In Indistinguishable’s post, he calculates the probability of C between A and B, and obtains 0. I follow the same approach, and calculate the probability of C having any value, and also obtain 0. The answer should be 1. Thus, there’s a flaw in that approach, probably that the problem is ill-posed, but it doesn’t matter. The take-away is that it isn’t a valid argument for the probability being 0.
Ultrafilter’s approach shows, by its symmetry, that if there is an answer, it must be 1/3. In the same sense that, if only a single number were picked, and you asked the probability of it being positive, if there is an answer, it must be 1/2.
Why is this the probability that C takes on any value? It seems to be the probability that C is between A and B.
This is only one way to extend a subset of the reals to the entire real line.
Consider S_n, the union of the intervals [z, z + (n-1)/n] over all integers z, for a given natural number n. Using argument 1, one can show that choosing A, B, and C uniformly from this set gives zero probability of C falling between A and B, for every n; and the limit of S_n as n goes to infinity is the entire real line.
And if we have already picked A, the probability of choosing a number greater than (or less than) A must be 1/2. And if we have already picked B, the probability of choosing a number greater than (or less than) B must be 1/2.
Thus the probability of picking a number between A and B must be 1 - the probability of picking a number less than Min(A,B) - the probability of picking a number greater than Max(A,B) = 1 - 1/2 - 1/2 = 0.
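For what it’s worth, a finite-L simulation (my sketch) shows where this bookkeeping goes wrong: P(C < A) and P(C < B) are each 1/2, but P(C < Min(A,B)) is only 1/3, since C is equally likely to be the smallest of the three draws.

```python
import random

# The subtraction should use P(C < min(A,B)) and P(C > max(A,B)), each ~ 1/3,
# giving 1 - 1/3 - 1/3 = 1/3 rather than 1 - 1/2 - 1/2 = 0.
random.seed(2)
n, below_min, above_max = 500_000, 0, 0
for _ in range(n):
    a, b, c = (random.uniform(-1, 1) for _ in range(3))
    below_min += c < min(a, b)
    above_max += c > max(a, b)
print("P(C < Min(A,B)) ~", below_min / n)               # ~ 1/3
print("P(C > Max(A,B)) ~", above_max / n)               # ~ 1/3
print("P(between) ~", 1 - (below_min + above_max) / n)  # ~ 1/3
```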
Sorry, missed one change after the cut and paste. The three steps should read:
Step 1z: P(C is any value) = the sum* over all a and b of P(C is any value | A = a and B = b) * P(A = a and B = b)
Step 2z: For all a and b, P(A = a and B = b) = 0 [true even if you look at the probability of A and B being in the range (a, b); that is, to expand on this a bit, P((min(a,b) <= A <= max(a,b)) & (min(a,b) <= B <= max(a,b))) = 0]
Step 3z: Therefore, each term on the right-hand side in step 1z is zero, and thus the sum itself is zero, establishing that P(C is any value) = 0.
Sure, for fixed A and B. If you’re considering picking A and B “uniformly” from the range +/- infinity (somehow), this argument applies only to a measure-zero subset of that space.
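To put a number on that, a quick exact computation under the uniform-on-[-L, L] truncation (my sketch; d is an arbitrary fixed gap):

```python
# For A, B i.i.d. uniform on [-L, L] and a fixed gap d <= 2L,
# P(|A - B| <= d) = d/L - d^2/(4L^2), which goes to 0 as L grows: pairs at
# most d apart really do become negligible in the limit.
d = 10.0
for L in [10.0, 1e3, 1e5, 1e7]:
    p = d / L - d**2 / (4 * L**2)
    print(f"L = {L:.0e}: P(|A-B| <= {d}) = {p:.3e}")
```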