Cantor infinities and zeros

I’ve expressed in this thread and elsewhere that I prefer the infinitesimal approach to calculus. I ran across this book a short while back:

It was written in 1914, well before Robinson put infinitesimals on a firm footing. It makes no real attempt to be rigorous, and makes its arguments in terms of “degrees of smallness,” which are equivalent to infinitesimals. Regardless, it makes things very easy to understand, uncluttered with excess notation about limits and such. It’s a surprisingly lighthearted take for 1914.

It also correctly notes that the relationship between the hyperreals and fleas has been long understood:

So, Nat’ralists observe, a Flea
Hath smaller Fleas that on him prey.
And these have smaller Fleas to bite ’em,
And so proceed ad infinitum.

That reminds me: in any calculus class, inevitably some student will raise his or her hand and ask, “What is dx?” Of course you have to explain that it represents a very small quantity, but that is not quite a mathematical answer, so you have to continue with something like “a covector” or “a differential form” and explain all of that. So the notation is at once useful and suggestive, but also possibly confusing to the uninitiated.

That was me! And I never did get a good answer. In fact, I don’t think they said it was anything more than notation. I know that it was only much later that I discovered that \int was a long S for “sum”. Or that dy \over dx could be treated as a fraction (at times).

It’s remotely possible that they gave me some advanced math mumbo-jumbo to shut me up and I’ve forgotten the details.

FWIW, I have to agree with @Dr.Strangelove. Limits always seemed to be a really hacky way of doing things. When Robinson lectured on infinitesimals, he made it clear that analysts generally made their discoveries using infinitesimals and then used limit arguments to write it up.

If this thread goes on much longer, I will post a description of genuine infinitesimals, ultrafilters and all. The standard part of an extended real number will emerge quite naturally. It will also be quite clear why any function on the ordinary reals extends to one on hyperreals.

Well, I for one would be appreciative and read it with interest!

I’m personally of the opinion that infinitesimals and numbers that can be made arbitrarily small in magnitude while retaining information are the same thing! The latter is just the formal definition of the former. Heck, I remember being told that epsilon and delta used Greek letters because those were used for infinitesimals.

As for the claim that either one doesn’t “exist in nature”: you could say the same thing about irrational numbers–at least, the ones that can’t be constructed. They just arise out of the concept of the real numbers and the operations we can do on them.

Heck, at some level, numbers aren’t really natural–they’re descriptions that we find useful.

You asked for it.


I have to start with ultrafilters. What I am about to describe is not the standard terminology. For one thing, the definition below is what is called a non-principal ultrafilter. An ultrafilter on the set \bf N of natural numbers \{1,2,3,\ldots\} is a way of dividing all the subsets into two classes. It will be convenient to call the two classes large and small, although this has little to do with size. These classes are required to satisfy the following hypotheses:

  1. Every set is either large or small (this is the “ultra” part).

  2. A set is large if and only if its complement is small.

  3. The intersection of two large sets is large. (It follows that the union of two small sets is small.)

  4. A set that contains a large set is large. (It follows that every subset of a small set is small.)

  5. Every finite set is small. It follows that every set with a finite complement is large. (This is what makes it non-principal.)

The fact that (non-principal) ultrafilters exist requires the axiom of choice.
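Since a non-principal ultrafilter can never be written down explicitly, the best a program can do is illustrate the axioms on a toy case. Here is a small sketch (names like `large` and `subsets` are my own): on a finite set every ultrafilter is principal, meaning “large” is simply “contains one fixed point p”, so this check can only exercise properties 1–4; property 5 is exactly what the axiom of choice buys you and cannot be exhibited.

```python
from itertools import combinations

# Toy illustration only: on a FINITE set every ultrafilter is principal,
# i.e. "large" means "contains one fixed point p".  A non-principal
# ultrafilter (property 5) cannot be written out explicitly.
def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

N = frozenset({1, 2, 3, 4})
p = 2                          # the generating point of this principal ultrafilter
large = lambda a: p in a       # a set is "large" iff it contains p

all_sets = subsets(N)
# 2. a set is large iff its complement is small (and 1. is automatic)
assert all(large(a) != large(N - a) for a in all_sets)
# 3. the intersection of two large sets is large
assert all(large(a & b) for a in all_sets for b in all_sets
           if large(a) and large(b))
# 4. a set containing a large set is large
assert all(large(b) for a in all_sets for b in all_sets
           if large(a) and a <= b)
```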

I will leave it to the reader to show that if \bf N is the union of three (or in fact any finite number of) mutually disjoint sets, say A\cup B\cup C, then exactly one is large and the remainder are small. This is a consequence of 3. A decomposition of \bf N as a union of a finite number of mutually disjoint sets is called a partition of \bf N.

An extended real number (ern) is an \bf N-indexed sequence, say u=(u_1,u_2,u_3,\ldots), of (ordinary) real numbers. If v=(v_1,v_2,v_3,\ldots) is another ern, the three sets \{n|u_n<v_n\}, \{n|u_n=v_n\}, \{n|u_n>v_n\} form a partition of \bf N. We say that u<v or u=v or u>v according to which of those sets is large. This illustrates the fact that only things that happen on a large set matter.
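For a concrete (and very partial) feel for the comparison rule, here is a hedged Python sketch, with `compare` being my own name. Since the ultrafilter itself is non-constructive, the sketch only decides comparisons whose verdict is eventually constant along the indices it inspects, so that the deciding index set is cofinite and hence large by property 5; anything subtler returns `None`.

```python
# Hedged sketch: compare two erns given as functions n -> u_n.
# We only decide cases where one of the three sets {u_n<v_n},
# {u_n=v_n}, {u_n>v_n} looks cofinite on indices 1..limit; such a
# set is large by property 5.  A genuine ultrafilter would decide
# every case, but cannot be programmed.
def compare(u, v, limit=1000):
    verdicts = [(u(n) > v(n)) - (u(n) < v(n)) for n in range(1, limit + 1)]
    tail = verdicts[limit // 2:]          # crude "eventually constant" check
    if all(t == tail[0] for t in tail):
        return tail[0]                    # -1: u<v, 0: u=v, +1: u>v
    return None                           # undecidable without the filter

one = lambda n: 1.0
two = lambda n: 2.0
eps = lambda n: 1.0 / n                   # a positive infinitesimal
assert compare(one, two) == -1
assert compare(eps, one) == -1            # eps < 1 on a cofinite set
assert compare(eps, lambda n: 0.0) == 1   # eps > 0 everywhere
```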

If r is an ordinary number, the corresponding ern is the constant sequence (r,r,r,\ldots) or anything equal to it.

Addition, subtraction, multiplication, and division (but not by 0) are term-wise. In particular, if v\neq0, then \{n|v_n\neq0\} is large and we define u/v=w by w_n=u_n/v_n when v_n\neq0 and w_n=0 when v_n=0. That last clause is irrelevant. You could make it 1 or \pi or anything at all.
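The term-wise operations, including the guarded division, are easy to sketch in code (the helper names are mine). The key point the sketch makes concrete: the value chosen at the bad indices really is irrelevant, because the set where v_n \neq 0 is large whenever v \neq 0.

```python
# Term-wise arithmetic on erns represented as functions n -> u_n.
# Division returns an arbitrary value (0.0 here) where v_n == 0; as
# the post notes, that clause is irrelevant, since {n | v_n != 0} is
# large whenever v != 0.
def add(u, v):  return lambda n: u(n) + v(n)
def mul(u, v):  return lambda n: u(n) * v(n)
def div(u, v):  return lambda n: u(n) / v(n) if v(n) != 0 else 0.0

u = lambda n: n
v = lambda n: n - 1          # zero at n = 1, nonzero on a large set
q = div(u, v)
assert q(1) == 0.0           # the irrelevant clause
assert q(3) == 1.5           # 3 / 2 on the large set, business as usual
```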

The ern u is finite if there is an m\in\bf N such that \{n\in\bf N\,|\,|u_n|<m\} is large. Otherwise, u is infinite. If u is finite, the two sets of rational numbers \{q|q<u\} and \{q|q>u\} constitute a Dedekind cut and define a real number called the ordinary part of u. The erns whose ordinary part is 0 are the infinitesimals. It is not hard to show that the reciprocals of infinitesimals are infinite and conversely.
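A rough numerical stand-in for these definitions (my own simplification, not the Dedekind-cut construction): for erns that happen to converge, looking far down the sequence approximates the ordinary part, and it also shows the reciprocal of an infinitesimal blowing past any bound m.

```python
# Crude sketch: approximate the "ordinary part" of a finite ern by a
# far tail value.  This only works for erns that actually converge;
# the Dedekind-cut definition in the post needs no such assumption.
def ordinary_part(u, n=10**6):
    return u(n)

eps = lambda n: 1.0 / n                  # infinitesimal: ordinary part 0
u   = lambda n: 3.0 + 1.0 / n            # finite ern with ordinary part 3
inv = lambda n: 1.0 / eps(n)             # reciprocal of an infinitesimal

assert abs(ordinary_part(eps)) < 1e-5
assert abs(ordinary_part(u) - 3.0) < 1e-5
assert ordinary_part(inv) > 1e5          # infinite: exceeds every bound m
```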

If f is a real function and \{n|f(u_n) \ {\rm is\ defined}\} is large, we let f(u)=v where v_n=f(u_n) when that is defined and 0 otherwise.

Of course, there is a lot of routine work required to show, for example, that equals added to equals are equal and, in fact, the same is true for any function as described above.

The function f is continuous at r provided that for any infinitesimal h, f(r+h)-f(r) is infinitesimal, and the derivative at r is the ordinary part of \frac{f(r+h)-f(r)}h, provided that ordinary part does not depend on h. For example, if f(x)=x^2, then \frac{(x+h)^2-x^2}h=2x+h and the ordinary part 2x does not depend on h.
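The derivative recipe can be mimicked numerically (again a sketch under my own simplifications): take h to be a single far-out term of the infinitesimal (1, 1/2, 1/3, \ldots), form the difference quotient, and read off what plays the role of its ordinary part.

```python
# Sketch of the post's derivative computation: substitute a term of the
# infinitesimal (1/n) for h in the difference quotient; for smooth f
# the result is f'(x) plus something infinitesimal.
def derivative_at(f, x, n=10**6):
    h = 1.0 / n                          # the n-th term of (1, 1/2, 1/3, ...)
    return (f(x + h) - f(x)) / h         # = f'(x) + (infinitesimal)

f = lambda x: x * x
# ((x+h)^2 - x^2)/h = 2x + h, whose ordinary part is 2x:
assert abs(derivative_at(f, 3.0) - 6.0) < 1e-4
```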

I will have to read more later (slowly), but can you help me a bit on the intuition? The even and odd numbers are complements of each other. They can’t both be large, because their intersection is finite (and thus small). Nor can they both be small since one of them must be large if we have deemed the other small (by rule 2).

I guess there is nothing stopping us from arbitrarily declaring that, say, the evens are large while the odds are small. A different ultrafilter might choose the opposite. Is that right?

A set and its complement cannot both be “large” in this sense… but it does not matter which one you pick, for this application, as far as I understand it. What is important is that finite sets are “small”.

Reading on a bit further…

Consider the erns (1, 1, 1, 1, \dots), (2, 2, 2, 2, \dots), and (1, 2, 1, 2, \dots). Depending on our arbitrary selection of whether the evens or the odds are “large”, the last number would equal one or the other of the first two. Is that strange? The first two are definitely the real numbers 1.0 and 2.0, but the last depends on an arbitrary selection. I guess there’s nothing wrong with that, per se, but it seems to mean that there are an infinite number of constructions here, and I’m not sure they’re equivalent.
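To make my own question concrete, here is a tiny sketch where the ultrafilter's choice between evens and odds is just an explicit flag (a real ultrafilter fixes this choice once and for all, and far less computably):

```python
# The ern (1, 2, 1, 2, ...) evaluated under two different choices of
# which parity class is "large".  The flag stands in for the
# ultrafilter's one-time, non-constructive decision.
def value_under(u, evens_large, limit=100):
    picked = [u(n) for n in range(1, limit + 1)
              if (n % 2 == 0) == evens_large]
    assert all(x == picked[0] for x in picked)  # constant on the large set
    return picked[0]

alt = lambda n: 1 if n % 2 == 1 else 2   # (1, 2, 1, 2, ...)
assert value_under(alt, evens_large=True) == 2   # equals the ern (2,2,2,...)
assert value_under(alt, evens_large=False) == 1  # equals the ern (1,1,1,...)
```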

They are certainly not equivalent, because the different hyperreal fields you construct may be non-isomorphic, unless you assume something like the Continuum Hypothesis which will imply that they are all isomorphic (this is described in Chang and Keisler’s book on model theory).

Note: we may define, exactly as usual, dy = f'(x)\, dx, which is a function of the independent variables x and dx.

You may now prove facts about derivatives (Product Rule, Chain Rule, and so on) by substituting infinitesimal numbers for dx (so \Delta y/\Delta x will be infinitesimally close to dy/dx). However, there is (it seems?!) no magical new insight as to the geometrical meaning of the derivative as the slope of the tangent line to a curve.
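As a sanity check of that claim for the Product Rule, here is a quick numeric sketch (helper names are mine): with an infinitesimally small h, \Delta(fg)/\Delta x lands infinitesimally close to f'g + fg'.

```python
# Numeric sanity check of the Product Rule via a very small h:
# Delta(fg)/Delta x should be infinitesimally close to f'g + f g'.
def diff_quot(fn, x, h):
    return (fn(x + h) - fn(x)) / h

f, fp = (lambda x: x ** 3), (lambda x: 3 * x ** 2)   # f and its derivative
g, gp = (lambda x: 2 * x + 1), (lambda x: 2)         # g and its derivative
x, h = 1.5, 1e-7

lhs = diff_quot(lambda t: f(t) * g(t), x, h)         # Delta(fg)/Delta x
rhs = fp(x) * g(x) + f(x) * gp(x)                    # f'g + f g'
assert abs(lhs - rhs) < 1e-4
```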

Hmm, this thread seems to have morphed into a rather technical discussion about infinitesimals. Which is interesting in itself, but I have to say the original question was a bit of a side effect of something else I have been thinking about. So I think I will open a new discussion about effectiveness of mathematics (the famous Wigner quote)… all welcome to join in!

You are absolutely correct. There are ultrafilters where the odds are large and the evens small and others where they are reversed. Any given infinite set will be large in some and small in others.

Incidentally, the standard use of the word ultrafilter is to describe the set of large sets. The set of small sets is called a maximal ideal in the Boolean algebra of all subsets. Ultrafilters are sometimes called maximal filters.

I’m afraid that any explanation of this is going to end up unintuitive, because

Everything that relies on the Axiom of Choice is always unintuitive.

I don’t think you meant that as a joke, but intuitionism is a philosophy and style of mathematics that explicitly denies AC. And I agree with your statement. But so much advanced mathematics goes down the drain if you deny it. Most notably the intermediate value theorem, which states that if a continuous function is negative at one place and positive at another, it has a zero somewhere in between. That does have an intuitionistically valid proof for polynomials, but not in general.

Wait, really? If so, that’s the only sensible thing I’ve ever heard of that comes from the Axiom of Choice. Meanwhile, it also leads to absurdities like Banach-Tarski, and the AoC itself is already profoundly counterintuitive: What does it mean to choose something, if you can’t say what it is that you chose?

I never thought Banach-Tarski was particularly surprising, let alone absurd.

We already knew that a 1-long line segment has exactly the same number of points as a 2-long segment, and you can turn one into the other via the simple transformation x’ = 2x.

Likewise, two unit balls have the same number of points as one. It’s neat that you can transform one into the other solely with rigid transformations, as opposed to scaling, but the end result doesn’t seem too shocking.

But what’s the volume of each of the pieces? It seems like it should be possible to unambiguously extend the notion of “volume”, such that any specified set can have its volume determined, and such that it maintains the usual properties of volume (e.g., that given two disjoint sets, the volume of their union is the sum of the volumes of the two sets, and that volume is preserved under rigid transformations). And the way that the Banach-Tarski Theorem gets around that is that it doesn’t actually specify its sets; it merely asserts, through the Axiom of Choice, that they exist.

I think fractals demonstrate that notions of area and volume start to break down–or at least become harder to define–when you have infinitely complicated shapes. I don’t see why it should be possible to assign a volume to the individual pieces.