Yet another question about infinity and Aleph numbers (Will they ever end?)

On another message board, someone asked: “If there is an infinite set of numbers, and you pick random numbers from that set for an infinite amount of time, will you ever pick the same number twice?”

For the sake of argument, let’s say that we limit ourselves to rational numbers, and we have a computer that can generate a truly random rational number with infinite precision and compare it to all the previously picked numbers in a finite amount of time. (Yes, it’s a magic computer.)

I said no. Here’s why:

The number of chances we get to make a match is equal to the set of all pairs of rational numbers. I think it would equal [sub]lim n -> infinity [/sub] (n(n-1))/2. Would this equal Aleph-null?

The chance of any given pair being a match would be [sub]lim m -> infinity [/sub] 1/2^m. Aleph-one, right?

So the total probability would be (Aleph-null/Aleph-one) or zero.

But I’m not a mathematician, so I’d appreciate it if somebody would tell me if I am right, wrong, or talking total nonsense.

An infinite number of selections? There would be a remarkably high probability of picking a duplicate number. Here’s my reasoning:

The birthday problem shows us that in 365 possibilities, you only need a couple dozen selections to have a pretty good chance at getting a duplicate. That’s a good chance of duplicates when the choices outnumber the selections 15 to 1.

I guess my question is how big is the infinity of selections compared to the infinity of choices?

You are mixing infinities (actually definitions of infinities). The infinity in calculus is a formal symbol. It is not a number, and it is not identified with any aleph. As to the limits in your post: lim[sub] n -> infinity [/sub] (n(n-1))/2 is infinity (the infinity of calculus which is not a number). And lim[sub] m -> infinity [/sub] 1/2^m is zero.

In transfinite cardinal arithmetic, division is not defined. So Aleph-null/Aleph-one is not defined.

As to your question, I don’t know the answer.

A few thoughts:

First, if we’re going to talk about the random picking of a rational number, we need a probability model for that–in other words, we need to agree on things like, “What’s the probability of picking the number 1, or the number 2, or the number 3/5, or the probability of actually picking an integer from the set of rational numbers.” In other words, we need to agree on what our probability function is–which sets get assigned which probabilities.

I’m assuming then when you speak of “generate a truly random rational number”, you would like for the probability distribution to be uniform, i.e., no number is preferred over any other number. Well, right there we got a problem. Unfortunately, there is no uniform probability distribution over a countable set (such as the rational numbers). There may be (though I’m not sure) such a probability distribution if we involve nonstandard probability (where we allow infinitesimal probabilities), but that’s not really something I’m familiar with, so I can’t help you along those lines.

Anyway, getting back to the question, and making sense of it the best I can, first, there’s a significant amount of vagueness in stating that we’ll pick infinitely many members from an infinite set. How big is infinite (in either case)? Are we going to pick aleph-naught members? Aleph-one? And how big is the set? Is it aleph-naught? Aleph-one? Aleph-4543?

Since you restricted our attention to the rational numbers, the latter question is answered–this set has cardinality aleph-naught. But the former question remains. Of course, if we were to pick aleph-one (or more) rationals at random, there’s no doubt as to the answer–it is certain that a rational number must be repeated (pigeon hole principle, since there’s only aleph-naught rationals).

So the only interesting possibility left is if we are picking aleph-naught rationals at random.

It’s true that there are aleph-naught pairs of rationals (since the product of two countable sets is still countable). However, about your statement, “The chance of any given pair being a match would be lim m -> infinity 1/2^m”; I must admit I’m having difficulty in seeing the reasoning behind this. Additionaly, that limit would be zero, not aleph-one (and if, as I expect, you were referring to the denominator as approaching aleph-one, I would disagree with that as well–if we’re talking about cardinalities, and the limit as m -> aleph-naught, the denominator would approach aleph-naught, not aleph-one).

Anyway, I hate to leave the original question unanswered, but, because of the problems I mentioned earlier about the impossibility of a uniform probability distribution over a countable set, I really can’t see any way to make the question answerable under (standard, anyway) probability theory.

An alternative approach to what I’ve already mentioned would be to consider the set of all sequences of rational numbers–such a sequence would correspond to picking aleph-naught rationals, one at a time, in order (the first term in the sequence corresponds to the first rational picked, and so on). We could place a probability distribution on this set–again, however, we would first need to agree on which probability distribution to use, since there are many. A natural choice would be to construct our probability distribution in some way analogously to Lebesgue measure, but it’s not clear to me offhand how the details would work out in that case.

Anway, having said all of that, my gut feeling is that (using some natural approach along the lines mentioned in the previous paragraph) we would get that a rational would be repeated with probability one. The reason I say this is that it seems like the “density”, in some sense, of sequences with repeats would be much greater than the “density” of sequences with no repeats.

I was thinking that we could represent each number as an infinite string of binary digits. To compare two numbers, we compare corresponding digits of those numbers. The chance that the first two digits match is 1/2. The chance that the first and second digits match is 1/2 * 1/2. The chance that the first n digits match is (1/2)^n.

But much simpler if you know that the product of two aleph-naught sets is also aleph-naught.

Yes I was talking about the denominator approaching aleph-one. And I thought it was aleph-one because the Wikipedia entry on aleph numbers led me to believe that 2^(aleph-naught) = aleph-one.

But then, I have to admit I got lost when they were talking about the Axiom of Choice and the Continuim Hypothesis. Also, I thought that Aleph-naught was equal to infinity.

As for the original question, if we pick aleph-one numbers we are guaranteed to get a match due to pigeonholing. (We had to start repeating numbers because we picked them all - all infinity of 'em.)

If we can only pick aleph-naught numbers, then the question can’t be answered, because “there is no uniform probability distribution over a countable set.”

Standard probability distributions don’t allow infintesimal probabilities? It seems like an infintesimal probability is exactly what we need.

I realize that I’m mostly restating what everyone else has said. I’m just trying to see if I got it right and wrap my head around these ideas.

Well, “infinity” is a vague term by itself, in particular when talking about the cardinalities of sets. A set could be said to be infinite if it’s cardinality is at least aleph-naught, but that doesn’t exclude the possibility that it’s cardinality could be much much greater than aleph-naught.

Well, that equation is true if and only if the continuum hypothesis is true.

What it boils down to is that aleph-naught is the smallest transfinite cardinal. We can also talk about 2^(aleph-naught), which is strictly larger than aleph-naught. Cantor believed that it was actually equal to aleph-one (the next smallest transfinite cardinal) but was unable to prove it–the conjecture that it was true came to be known as the continuum hypothesis (CH). Eventually, Goedel’s and Cohen’s work demonstrated that CH is independent of the standard (ZFC) set theory axioms–CH can be neither proven nor disproven in ZFC set theory. It’s consistent with the rest of set theory to assume that 2^(aleph-naught) = aleph-one, but it’s also consistent to assume that 2^(aleph-naught) is equal to (almost) any larger cardinal as well. We just don’t know.

However, the expression lim (as n -> aleph-naught) 2^n is another thing entirely. 2^n is finite so long as n is finite, and so this limit is actually aleph-naught itself–the limit can’t equal aleph-one, whether CH is true or not.

As for the axiom of choice, that’s necessary to really develop the notion of cardinality in the first place. Only sets that can be well ordered can be assigned a cardinality, and not all sets can be well ordered if you don’t have the axiom of choice. With the axiom of choice, all sets can be well ordered, and so all sets can be assigned a cardinality, so the axiom of choice makes things a lot nicer to work with.

Yeah, that’s right. By definition, a probability function maps into the real interval [0,1], which contains no infinitesimals. A nonstandard probability function avoids this by mapping into a hyperreal interval [0,1], or some other extension of real numbers which includes infinitesimals.

One more thing I forgot to comment on:

What you’ve done here, more or less, is demonstrate that there are 2^aleph-naught real numbers. Again, however, I want to point out that 2^aleph-naught is definitely not the same as lim (n -> aleph-naught) 2^n, so it’s not enough to just take the limit as n -> aleph-naught here. The actual definition of the cardinal 2^aleph-naught is that it’s the cardinality of the set of all functions from aleph-naught to 2 (2 is the set {0,1}). Alternately, it may be easier to think of this as the cardinality of all binary sequences.

However, this isn’t really appropriate to our problem. Remember, we’re talking about the probability of two rational numbers being the same. The above would apply, more or less, if we were talking about the probability of two real numbers being the same, but, remember, an arbitrary binary expression will not necessarily represent a rational number; to fix this, we’d have to alter the argument to limit the binary expressions to only those representing rational numbers.

Cabbage’s post are really and truly great. Nice job. I just want to give a little simplification to the OP that might make things easier to see.

First, let’s just look at the positive integers only. Same cardinality as the rationals. Start off with a very simple probability distribution of picking a positive integer. 1 gets picked 1/2 the time, 2: 1/4 of the time, 3: 1/8 of the time, etc. n gets picked 1/2[sup]n+1[/sup] of the time. Note that the probabilities are easily seen to add up to one. (It’s also practical to generate. Flip a coin until you get a heads. The number of flips is n.)

Start picking numbers using this distribution. On average, every other number will be 1, every fourth will be 2, etc. So there will definitely be duplicates. A lot of duplicates. In fact, take any number n, it will appear about 1/2[sup]n+1[/sup] times. So even if n is a “jillion” it will appear many times in a long enough sequence of choices.

For any distribution where even a single number has a non-zero change of being selected, that number will appear over and over and over.

Now, some people like to think in terms of distributions where all numbers are equally likely, so that in fact no number has a non-zero chance of being picked. I have never been happy with such statements. The sum over all choices of the odds of each choice has to add up to 1. How are you going to define this???

So I generally try to get people who ask questions like the OP to answer a simple question: What are the odds of picking 2? (Or whatever is suitable for their range.) No one has ever reasonably answered this basic question.

Oops, the Computer Geek in me got loose. We start counting with zero. So that’s 1/2[sup]n[/sup] for the odds.

Cabbage: Given that we cannot definitively state that 2[sup]aleph[sub]0[/sub][/sup] = aleph[sub]1[/sub], is there some standard notation that we can use? Say, define bet[sub]1[/sub] =2[sup]aleph[sub]0[/sub][/sup], and for n > 1, bet[sub]n[/sub] = 2[sup]bet[sub]n-1[/sub][/sup], maybe?

As for infinitesimal probabilities, isn’t it possible, in standard probability, to have a uniform distribution over the real numbers in a bounded interval? We might, for instance, generate a sequence of binary digits by flipping a coin a countably-infinite number of times, and thereby select a random number in [0,1].

A related question:

What’s the cardniality of the primes?

I suspect aleph[sub]0[/sub], as I worked out that that about 29.5% of the natural numbers are prime, but I’m probably wrong*
*It’s a guess from working out how many numbers from the set {x| 2<x<i and x is a natural number} are primes by taking 1 minus the product of the sum of 1/p[sub]i[/sub] working it oput for some i then taking a known simalir infinite product sum that I know to be greater and therfore finding an upper and lower bound for this value.

Several different points raised. First, in most models (it is correct that one needs a model to even begin) the probability of picking any given number will be 0. There are indeed models in which that is false; I will return to this point. Anyway, assuming the odds on any given number is 0, then the odds on repeating any given number are also 0. Since at any stage, you have picked only finitely many numbers, then the odds that you will repeat any of them is also 0. Nonetheless, this adds up to something less than a proof that the odds of repeating a number are 0, but I believe that is likely the case. BTW, in an infinite model, the probability of an event being 0 is not to say it is impossible, although that is certainly the case in a finite probability space.

Now supposing that the model is such that certain numbers have a non-zero probability of being picked, then, assuming the choices are independent, such numbers have a non-zero probability of being picked twice, and three time, and four and so on. And with infinitely many choices to be made the probability is 1 (which is short of meaning certain) that they will be picked once and twice and thrice and …, hence infinitely often.

The continuum hypothesis is a red herring here. Whatever 2^aleph_0 is, it is larger than aleph_0.

The birthday “paradox” (it is not a paradox) depends crucially on the fact that there are at most 366 possible birthdays and is made less surprising by the fact that in a set of 23 people, there are 12 x 23 = 276 pairs of people to compare.

It is Aleph-0, but the primes do not compose 29.5% of the natural numbers. (Note that what you and I mean by this is that the limit of the fraction of numbers less than N which are prime, as N goes to infinity, is 0.295. Talking about a fraction of an infinite set is not all that simple.)

In fact, I believe that there is no set of 100 consecutive natural numbers that contains 29 primes. So, I’m surprised you got such a high number.

Any infinite subset of the natural numbers has cardinality Aleph-0. This includes, for instance, the square numbers. The fraction of square numbers in the first N natural numbers is around sqrt(N)/N = 1/sqrt(N). So clearly, the squares make up a vanishingly small fraction of the naturals. The same is true with primes.

Surely you can say that half of all natural numbers are even?

Also I got it the wrong way round it was the sum of the product of 1/p[sub]i[/sub]

For example if i = 7

(1/2) + (1/2 * 1/3) + (1/2 * 1/3 *1/5) + (1/2 * 1/3 * 1/5 * 1/7) … > 0.7004

(1/2) + (1/2 * 1/3) + (1/2 * 1/3 *1/5) + (1/2 * 1/3 * 1/5 * 1/7) + (1/2 * 1/3 * 1/5 * 1/7 * 1/7) + (1/2 * 1/3 * 1/5 * 1/7 * 1/7 *1/7)… = 1/30 * (the power series of 1/7[sup]i[/sup]) + (1/2) + (1/2 * 1/3) + (1/2 * 1/3 *1/5) = 7/10 + 1/180 < 7.06

Therefore the fraction of the natural numbers that are not prime must be greater than 0.7004 but less than 0.706

I’m quite prepared to accept that I could be very wrong on this tho’.

The set of all infinite sequences of rationals is an uncountable set, so we can assign a uniform probability distribution to that. That may help answer the question in the OP.

I suspect that the probability of duplicates is 1, but I haven’t got a detailed argument.

C is generally used to refer to the cardinality of the continuum.

Okay, MC, I see where you’re coming from.

The fraction of natural numbers which are composite with smallest prime factor 2 must be 1/2.
The fraction of natural numbers which are composite with smallest prime factor 3 must be 1/(2×3).
The fraction of natural numbers which are composite with smallest prime factor 5 must be 1/(2×3×5).

The sum of these fractions converges pretty rapidly to 0.70523.

But this is not the fraction of natural numbers which are composite. That fraction is 1.

And I don’t know why not. I think that this fraction idea must just be a faulty way of looking at things.

Your problem is that you haven’t considered the terms like (1/2 * 1/5), (1/2 * 1/7), (1/5 * 1/257), etc.

In fact, the Prime Number Theorem states that there are about (N / ln N) primes less than N. (There are stronger statements available, but they aren’t too important for this conversation.) In particular, the primes do not form a nonzero fraction of the natural numbers, since ln N grows without bound.

Can you expand on that I can’t see why, infact as the sum of 1/p[sup]i[/sup] is divergent so if I considered these extra terms my sum wouldn’t converge to 1 it would diverge.

Consider Achernar’s three statements:

This is clearly fine.

Still good. These numbers start at 3[sup]2[/sup] = 9, and continue by adding 2×3: 9, 15, 21, … .

This is where the problems begin. These numbers start at 5[sup]2[/sup] = 25, but the next one is 35 = 25 + 2×5, not 55 = 25 + 2×3×5. After 35, we have to skip 45 (divisible by 3) but then we get 55 and 65, 85 and 95, and so on.

People who have done inclusion-exclusion before will recognize this pattern. The full series is
   [sum over sets of n primes p[sub]1[/sub],…,p[sub]n[/sub]] (-1)[sup]n-1[/sup] / (p[sub]1[/sub]×…×p[sub]n[/sub])
     = 1/2 + 1/3 + 1/5 + … - 1/(2×3) - 1/(2×5) - … - 1/(3×5) - … + 1/(2×3×5) + … .