I’ll have to pick that up. My understanding of infinitesimal calculus comes only from what I’ve read on the internet.
I don’t think this matters. The average student uses the reals all the time but no one but math majors ever even encounters a Dedekind cut. Same goes for set theory and Zermelo-Fraenkel axioms. At some point, we just have to accept that the structures we’re using have been proven to be consistent.
There are two different things here, and I have probably not been clear about the motivations for them.
First: use infinitesimals to teach calculus. I think this is a more natural approach in general, but the benefit here specifically is that we can treat dx and dy as actual variables that we can manipulate like any other, as compared to the usual approach where dy/dx is treated as a unit blob, except sometimes not really.
Second: start with the implicit representation of functions. That means, when trying to find the slope of a curve, of whatever kind, that we add a dy to every place that y shows up and a dx to every place that x shows up. That gives us a new, nearby point on the curve. We can use that equation and the first one, and just a little algebra, to perform the derivative.
One can use these fundamentals to derive all the normal rules–power rule, product rule, chain rule, etc. From there, one can move on to more special cases.
How do you know that nearby point is on the curve?
You (and Hari) are arguing that infinitesimals make calculus easier. But I’d want to break it down: what, exactly, do they make easier?
Do they make it easier to logically justify and rigorously prove those rules?
Do they make it easier to informally derive the rules?
Do they make it easier to intuitively understand the rules and how or why they work?
Do they make it easier to use the rules, once you have them?
I don’t–at least not initially. One thing I may not have been clear about is that the initial substitution could have been anything, because I’m just stating the same fact in a different way:
y = x[sup]2[/sup] [(x, y) is a point on the curve]
y+dy = (x+dx)[sup]2[/sup] [(x+dx, y+dy) is a point on the curve]
q = t[sup]2[/sup] [(t, q) is a point on the curve]
batman = superman[sup]2[/sup] [(superman, batman) is a point on the curve]
However, I chose (x+dx, y+dy) because I knew it would make things easier later on, and that eventually I would have to assume that dx and dy were small. But the smallness assumptions come in specific, obvious places: throwing out high powers of them, or assumptions like sin(dx)~=dx, etc.
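The "throw out high powers" step can be made mechanical. Here is a minimal sketch using dual-number arithmetic, where any term containing dx squared is discarded during multiplication. This is a stand-in for illustration, not the hyperreal construction discussed in this thread, and the class name `Dual` and its fields `real` and `inf` are made up here.

```python
class Dual:
    """A number of the form real + inf*dx, where dx*dx is thrown out."""

    def __init__(self, real, inf=0.0):
        self.real = real  # ordinary part
        self.inf = inf    # coefficient of dx

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.real + other.real, self.inf + other.inf)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*dx)(c + e*dx) = ac + (ae + bc)*dx + be*dx^2
        # The dx^2 term is the one place the smallness assumption is used.
        return Dual(self.real * other.real,
                    self.real * other.inf + self.inf * other.real)

    __rmul__ = __mul__


def square(x):
    return x * x


x_plus_dx = Dual(3.0, 1.0)      # the nearby point x + dx, with x = 3
y_plus_dy = square(x_plus_dx)   # y + dy = (x + dx)^2
print(y_plus_dy.real, y_plus_dy.inf)  # prints 9.0 6.0
```

The first number is y and the second is the coefficient of dx in dy, i.e. the slope 2x at x = 3.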
In order:
Not exactly, because as Hari Seldon said, proving the consistency of infinitesimals requires some serious math. But then so does justifying the existence of the reals or the consistency of set theory. So we aren’t in new waters here.
Yes, because we can manipulate infinitesimals just like anything else and aren’t burdened with the excess notation of limits.
Yes, because we aren’t burdened with weird special rules like “dy/dx looks like a fraction but really isn’t”. With infinitesimals, it is. There are no special cases aside from those we already know of (no dividing by zero, etc.)
Again, yes, for more or less the same reason–there are fewer restrictions and thus more avenues to approach any given problem.
Does it? Take a point on a sphere:
x[sup]2[/sup] + y[sup]2[/sup] + z[sup]2[/sup] = 1
Suppose we want to find dz/dy. We’ll make our new point (x, y+dy, z+dz), because we know we’re holding x constant. Go through the steps I outlined above and you get:
2ydy + 2zdz = 0 [the x components drop out in the subtraction]
And with a little manipulation you get:
dz/dy = -y/z
Easy enough. And even easier for functions that are usually written non-implicitly, like z=x[sup]2[/sup]+y[sup]2[/sup].
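As a rough numerical sanity check of the sphere result, one can pick a point, step to a nearby point with x held constant, and compare dz/dy against -y/z. This is only a sketch: the helper names, the sample point (0.3, 0.4) and the step size 1e-6 are arbitrary choices, and it uses an ordinary small real number rather than a true infinitesimal.

```python
import math


def z_on_sphere(x, y):
    """Upper-hemisphere z satisfying x^2 + y^2 + z^2 = 1."""
    return math.sqrt(1.0 - x * x - y * y)


def dz_dy_nearby_point(x, y, dy=1e-6):
    """Slope from the nearby point (x, y+dy, z+dz), holding x constant."""
    dz = z_on_sphere(x, y + dy) - z_on_sphere(x, y)
    return dz / dy


x, y = 0.3, 0.4
z = z_on_sphere(x, y)
# The two printed values should agree to several decimal places.
print(dz_dy_nearby_point(x, y), -y / z)
```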
It’s a wonderful thing about cultural evolution and half of pedagogy. The other half is knowing who you’re teaching to.
“You get used to it, though. Your brain does the translating. I don’t even see the code. All I see is blonde, brunette, redhead.”
Look at how ONI taught you. At first, you stumbled your way through the game, coming up with jury-rigged, ad hoc solutions to simple, basic problems. After solving many basic problems and becoming acquainted with the game, it started to dawn on you how you might solve them better with more complex and efficient solutions; now you can skip the basic stuff and go straight to the best way of doing things. But if you tried getting a new player to do that, it would likely overwhelm them. The military likes to conceive of teaching in terms of crawl, walk, run. When you know how to run, it can be difficult to see why anyone would waste their time crawling, because crawling is currently useless to those who can run.
It’s possible that the way you propose is better. I don’t have a reason to doubt that it’s a superior way to talk about math among people who already know a fair amount of it. I’m just not sure it’s a good way to teach people who know little math. If you have cousins handy, you could try it out on them if you’re ok with becoming their least favorite uncle for a while.
Have you noticed something similar when it comes to other skills like programming? Can your total understanding change more than once if you go deep enough into a skill?
Oops–I answered the wrong question above, though perhaps what I did answer is still satisfying. The point is that it just follows from how we’re defining the curve. The whole basis for the implicit form is that we’re stating that S(x, y) = 0 where (x, y) is a point on the curve. If that’s true, then (x+dx, y+dy) is a point on the curve when S(x+dx, y+dy) = 0. It’s just basic substitution.
I don’t assume, initially, that dx and dy are small. That comes at some later step where I have to throw out a dx[sup]2[/sup] or the like.
I’m actually really looking forward to teaching my nephews math (though it’s some years away). I’m certain my sister will send them to me. My worry is that I will find a way of teaching that is intuitive and natural, but doesn’t correspond easily to how they learn things in school, and will make things harder initially.
Yes, though seemingly harder to pin down because it’s harder to think of specific examples, or even where things went wrong before. However, looking at early code I wrote, I often slap my forehead and wonder what I was thinking. It’s not even that the code was bad, per se, and not that I was an idiot. But somehow I didn’t grasp the essential element of things and made a much more complex solution than was required.
The ability to figure out what’s essential when discovering unfamiliar information is a useful skill; it’s picking out patterns buried in the noise.
When looking at early code you wrote and seeing how your code could have been more elegant, how would you teach your previous self to do it better in a way that your previous self would understand with the knowledge and skills he had at the time?
I don’t. The 3Blue1Brown video I linked to in the OP gets part of the way there. I don’t think he ever mentions the word “infinitesimal”, though with his zooming in on the curve and treatment of dx and dy as ordinary variables, it’s clear that’s the approach he’s using. But the dS notation and bringing up a case with derivatives with respect to time feels like a distraction.
It’s possible that the book Hari mentioned above does so. There appears to be an online copy here. I haven’t yet looked at it.
I wish I knew! This aspect of programming isn’t taught at all in CS courses. People are just expected to eventually learn how to program elegantly. Maybe the coursework needs to be graded on beauty and not just correctness :).
Something for you to analyze and figure out. Having to teach someone (even a hypothetical someone) is a great way to make explicit, formalize and consolidate what you know. Your understanding might deepen further if you do.
Zachtronics games like SpaceChem and Opus Magnum give people practice at doing that with a very basic kind of visual scripting. Door Kickers might be a top-down version of SWAT 4 but it can also have that same economical elegance where everything flows smoothly and comes together neatly based on a small set of instructions; a level/problem can go from seeming impossible to being completed in very little time with no wasted actions.
Maybe the reason it isn’t taught in CS classes is because it doesn’t just relate to programming or games, it’s a general skill that can be applied to pretty much anything, even pretty pedestrian stuff: I once had a very unsatisfying job where I had to fold cardboard boxes which was as boring as you would expect. So, I made it a little less boring by figuring out the most efficient way to fold them with a minimum amount of movement and waiting. I did the same for loading rifle rounds in magazines when that was part of a test. It didn’t require much cleverness, mainly the willingness to pause, observe, reflect and experiment to figure out a better way.
It might be suitable for more advanced CS classes (like the equivalent of an MBA) or competitions. Are there jams where people are given a goal and whoever can do it in the fewest characters wins?
Code Golf. They’re fun, but the results are largely unreadable. They’re not really elegant, just short. Though some of the techniques they use can be elegant.
I like Zachtronics games, too. And household optimization!
As an example of the non-intuitiveness I mean, what’s (dx/dy) * (dy/dz) * (dz/dx)? Why, obviously, all of the infinitesimals cancel, and so it’s just 1. So what’s (∂x/∂y)(∂y/∂z)(∂z/∂x)? Why, obviously, that’s… negative 1? Where’d that negative come from? The key, of course, is that while dx/dy can be broken down to something called dx divided by something called dy, ∂x/∂y (despite looking very similar) is a symbol in itself, not something called ∂x divided by something called ∂y.
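One way to see where the minus sign comes from is to write out the constraint explicitly. The sketch below assumes x, y and z are tied together by a single relation F(x, y, z) = 0 whose partial derivatives are all nonzero; the names F, a, b and c are introduced here only for illustration.

```latex
\begin{aligned}
F(x,y,z) = 0 \;&\Rightarrow\; a\,dx + b\,dy + c\,dz = 0,
  \qquad a = \frac{\partial F}{\partial x},\;
         b = \frac{\partial F}{\partial y},\;
         c = \frac{\partial F}{\partial z} \\
\text{hold } z \text{ fixed } (dz = 0):\quad
  & \left(\frac{\partial x}{\partial y}\right)_{z} = -\frac{b}{a} \\
\text{hold } x \text{ fixed } (dx = 0):\quad
  & \left(\frac{\partial y}{\partial z}\right)_{x} = -\frac{c}{b} \\
\text{hold } y \text{ fixed } (dy = 0):\quad
  & \left(\frac{\partial z}{\partial x}\right)_{y} = -\frac{a}{c} \\
\left(\frac{\partial x}{\partial y}\right)_{z}
\left(\frac{\partial y}{\partial z}\right)_{x}
\left(\frac{\partial z}{\partial x}\right)_{y}
  &= \left(-\frac{b}{a}\right)\left(-\frac{c}{b}\right)\left(-\frac{a}{c}\right) = -1
\end{aligned}
```

Each ratio is taken under a different constraint (dz = 0, then dx = 0, then dy = 0), so the dx in the first ratio is not the same quantity as the dx in the third, and nothing cancels.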
Sure it is, though it may not be recognizable. Ok, I’ll do a more normal one. The function is:
f(x, y) = x[sup]2[/sup]y + y[sup]3[/sup]
The partial derivative with respect to y is:
∂/∂y f(x, y) = x[sup]2[/sup] + 3y[sup]2[/sup]
We can do it using the implicit technique by defining:
z = x[sup]2[/sup]y + y[sup]3[/sup]
We’re interested in dz/dy, so we’ll use (x, y+dy, z+dz) as the new point, holding x constant:
z+dz = x[sup]2[/sup](y+dy) + (y+dy)[sup]3[/sup]
Subtract the original equation and do some algebra:
dz = x[sup]2[/sup]dy + 3y[sup]2[/sup]dy
dz/dy = x[sup]2[/sup] + 3y[sup]2[/sup]
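A quick numerical check of that result, as a sketch only: step to the nearby point (x, y+dy, z+dz) with a small real dy standing in for the infinitesimal, and compare dz/dy against x[sup]2[/sup] + 3y[sup]2[/sup]. The function names, the sample point (2, 3) and the step size are arbitrary choices made here.

```python
def z_of(x, y):
    """The surface z = x^2*y + y^3."""
    return x ** 2 * y + y ** 3


def dz_dy(x, y, dy=1e-6):
    """Slope from the nearby point (x, y+dy, z+dz), holding x constant."""
    dz = z_of(x, y + dy) - z_of(x, y)  # subtract the original equation
    return dz / dy


x, y = 2.0, 3.0
# The leftover difference is the higher-power terms (3*y*dy + dy^2) that get thrown out.
print(dz_dy(x, y), x ** 2 + 3 * y ** 2)
```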
Same thing. And sure, ∂ vs. d can get confusing, though I think maybe that’s another argument in favor of infinitesimals, not against. The infinitesimals always work right if you’re explicit about them.
Differential equations is different, and much more difficult IMHO. This stuff is basically Calculus I, or maybe somewhere in AP Calc AB. I was never very good at DiffEq.