In the course of my calculus studies I’ve come across several concepts that look perfectly reasonable to my unenlightened eyes, but which the book says are actually “abuses of notation”. The first example I can think of is the chain rule for derivatives:
dy/du * du/dx = dy/dx.
Since the du’s cancel, this looks A-okay to me. Furthermore, I’ve seen the symbolic representations of the divergence and curl (del dot and del cross) referred to as “helpful abuses of notation”. And yet is the divergence not the sum of the partial derivatives, just as the dot product with the del operator implies?
My question is simply, what am I missing? Apparently the rigor of modern mathematics has rendered these notations wrong in some sense, even though they are helpful to the beginning student. But I don’t like having to learn things just to unlearn them later, so can somebody please fix my understanding of these concepts?
P.S. If you can think of any common “abuses of notation” that I am forgetting, please pipe up and enlighten us all.
By “abuse of notation”, it’s meant that the notation could convey something to the reader that is not true. In the chain rule example, something like dy/dx = dy/du * du/dx could lead someone to think that division and cancellation (of the du’s) are really occurring when they’re not. dy/dx is not really a ratio of two numbers, but a limit of a difference quotient. As a mnemonic device, the notation works well for conveying a computational rule in a simple manner, but it belies the underlying complexity and the rigorous math to prove it. Personally, I never use the expression “abuse of notation”. I usually referred to the dy/dx = dy/du * du/dx form of the chain rule as symbolic cancellation and a great mnemonic device.
I’ll do the dot product and divergence explicitly, as the cross product is pretty much the same but more complicated. First the dot product:
(a,b,c) * (d,e,f) = ad+be+cf
Fine, but what does it mean? Each term “ad”, “be”, “cf” is a multiplication of two numbers. Now the divergence, using the notation you’re talking about:
(d/dx,d/dy,d/dz) * (P,Q,R) = dP/dx + dQ/dy + dR/dz
Now when we set “d/dx” next to “P” we don’t multiply the two. We apply the first as an operator to the second. The notations for multiplication and operator application both consist of adjoining the two parts, so we can get away with it as a way to teach the students (about 2/3 of them) who aren’t really going to get beyond “how to use the tool”. To see why this notation is “abusive”, try taking it seriously and calculating
(P,Q,R) * (d/dx,d/dy,d/dz)
It’s nonsense, but it should make sense if this were to be a real dot product.
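For anyone who wants to poke at this numerically, here is a rough Python sketch of the difference between applying an operator and multiplying by a number. The field components P, Q, R, the test function g, and the helper partial() are all made up for illustration; none of them come from the posts above.

```python
def partial(f, i, point, h=1e-5):
    """Central-difference estimate of the partial derivative of f along axis i."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

# Hypothetical vector field (P, Q, R), chosen only for illustration.
def P(x, y, z): return x * y
def Q(x, y, z): return y * z
def R(x, y, z): return z * x

def divergence(point):
    # "del dot F": each d/dx_i is applied to a component of the field.
    return partial(P, 0, point) + partial(Q, 1, point) + partial(R, 2, point)

def f_dot_del(g, point):
    # "F dot del": the derivatives have nothing to act on until we supply g.
    return (P(*point) * partial(g, 0, point)
            + Q(*point) * partial(g, 1, point)
            + R(*point) * partial(g, 2, point))

# Hypothetical test function for the reversed order.
def g(x, y, z): return x + y + z

point = (1.0, 2.0, 3.0)
div_value = divergence(point)        # analytically y + z + x at this point
reversed_value = f_dot_del(g, point) # analytically P + Q + R at this point
```

The first order gives a plain number at each point. The reversed order gives nothing at all until you hand it some g to differentiate, and a different g gives a different answer. A genuine dot product of two vectors of numbers doesn't care which factor comes first, so whatever "del dot" is, it isn't that.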
Now for the chain rule. You’re thinking like Leibniz was when he invented that notation, and like Newton was when he worked with literal infinitesimals. Unfortunately, infinitesimals are really logically buggy, and only within the last century can we deal with them as such within the field of “nonstandard analysis”. Calculus as it’s taught is all based on limits since Cauchy, Weierstrass, and so on shored up the foundations of real analysis. That is, dy/du is not a ratio of two infinitesimals, but a limit of ratios of finite quantities, both tending to zero. It can’t be thought of as putting two things together in a certain way like a fraction can.
You’re right that the du terms “should cancel”, so the chain rule reads
dy/du du/dx = dy/dx
But at this point that’s just a nice mnemonic hook rather than a real algebraic manipulation. We’re abusing the notation. Really this statement requires proof in terms of the real definition of a derivative.
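If it helps, here is a small numerical check in Python that the rule holds even though no cancellation is happening: every derivative is estimated separately as a difference quotient, and only then are the numbers multiplied. The functions y = sin(u) and u = x^2, the point x0, and the helper derivative() are my own picks for illustration.

```python
import math

def derivative(f, t, h=1e-6):
    """Central difference quotient: a ratio of finite quantities near t."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Hypothetical example functions: y = sin(u), u = x**2.
def u_of_x(x): return x ** 2
def y_of_u(u): return math.sin(u)
def y_of_x(x): return y_of_u(u_of_x(x))

x0 = 1.3
dy_dx = derivative(y_of_x, x0)           # differentiate the composite directly
dy_du = derivative(y_of_u, u_of_x(x0))   # evaluated at u = u(x0)
du_dx = derivative(u_of_x, x0)
product = dy_du * du_dx
```

The two sides agree to within the error of the approximation, but notice that dy_du has to be evaluated at u(x0), a detail the "cancel the du's" picture says nothing about.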
To see where this one goes wrong, look at the multivariable chain rule. Let y be a function of u and v, and both of those be functions of x.
dy/dx = dy/du du/dx + dy/dv dv/dx
which now makes no algebraic sense at all if these are to be thought of as “fractions” (which the notation clearly suggests).
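Here is the same kind of numerical check for the two-variable case, again in Python and again with functions I picked arbitrarily for illustration (y = u*v, u = x^2, v = sin x). If the symbols really were fractions, each term on the right would "cancel" down to dy/dx by itself, and the sum would be twice too big.

```python
import math

def derivative(f, t, h=1e-6):
    """Central difference quotient of a one-variable function at t."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Hypothetical example: y depends on u and v, which both depend on x.
def u(x): return x ** 2
def v(x): return math.sin(x)
def y(u_val, v_val): return u_val * v_val

x0 = 0.7
u0, v0 = u(x0), v(x0)

dy_du = derivative(lambda s: y(s, v0), u0)   # partial: v held fixed
dy_dv = derivative(lambda s: y(u0, s), v0)   # partial: u held fixed
du_dx = derivative(u, x0)
dv_dx = derivative(v, x0)

term_u = dy_du * du_dx
term_v = dy_dv * dv_dx
total = derivative(lambda t: y(u(t), v(t)), x0)  # dy/dx computed directly
```

Neither term_u nor term_v equals total alone; only their sum does.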
Of course for this example there are even better notations for derivatives that come along much later in the game and explain everything better, but for the purposes of teaching calculus we just use the suggestive – if strictly-speaking misleading – notation.
I don’t know what you do by vocation (opp. a well-known 'Doper by avocation), but I don’t know a mathematician who doesn’t use this term, especially in seminar talks and the like. Often the full complement of bells and whistles that we should put on an expression would take far too long to write out, and we drop parts of it or rewrite the thing totally in some mnemonically suggestive notation. Since the plain reading of the text is no longer rigorous, we must include a caveat at some point to say “remember that when I say this I mean that”. The standard phrase is “by abuse of notation”.
Incidentally, while the phrase was originally an eggcorn, it’s now a common back-formation to say that a particularly egregious example of such a misleading style is “abusive notation”, though whether the object of such abuse is supposed to be the rigor or the reader’s brain is left unsaid.
I’ve always thought this notation was abusive in the extreme. I’m not a mathematician but am a physicist and use these things sometimes - at this point I’m not very proficient with calculus.
But I remember learning calculus with these notations and being badly thrown, because I thought I knew how fractions worked and it was incompatible with what they were telling me now. Years later I learned Mathematica and Maple and other software packages, which generally choose a functional notation closer to f’(x) or dx(f), and which didn’t seem to create any logical or compatibility problems. I think if I had learned a consistent notation such as the functional ones, instead of spinning my wheels so long against the seeming (and in fact real) paradox of the traditional notation, I’d be doing much more calculus today.
To illustrate just how bad the problem is, suppose we’re using partial derivatives, instead of total ones (I’m not sure how to make a partial-derivative symbol; it’s that curly-d thing that looks like a backwards 6). Suppose we had dx/dy * dy/dz * dz/dx , with all of those 'd’s being partials. Naïvely, one might expect to be able to cancel everything, and get 1. But actually, dx/dy * dy/dz * dz/dx = -1.
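A numerical sanity check of that -1, sketched in Python. The assumption here, which the -1 result needs, is that x, y and z are tied together by a single relation and that each partial is taken with the third variable held fixed. I've used z = x*y as the relation (think T = PV with the constants dropped); that choice, the sample point, and the helper names are mine, just for illustration.

```python
def derivative(f, t, h=1e-6):
    """Central difference quotient of a one-variable function at t."""
    return (f(t + h) - f(t - h)) / (2 * h)

# Hypothetical constraint z = x*y, solved for each variable in turn.
def x_of(y, z): return z / y
def y_of(x, z): return z / x
def z_of(x, y): return x * y

x0, y0 = 2.0, 3.0
z0 = z_of(x0, y0)

dx_dy = derivative(lambda y: x_of(y, z0), y0)  # z held fixed
dy_dz = derivative(lambda z: y_of(x0, z), z0)  # x held fixed
dz_dx = derivative(lambda x: z_of(x, y0), x0)  # y held fixed

product = dx_dy * dy_dz * dz_dx
```

The catch that the fraction notation hides is that a different variable is held fixed in each of the three factors, so they aren't three pieces of one fraction at all.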
Another example: Using total derivatives, dx/dy = 1/(dy/dx), except in cases where one of those is 0, just like one would naïvely expect. Not so with partial derivatives: If x and y are unrelated, then you’ll get both dy/dx = 0 and dx/dy = 0.
One can often get away with abuse of notation in mathematics, since in many cases, the abuse will still get you the right answer, and in these cases the notation can be a useful guide for one’s intuition. But it’s very important to be aware that you are abusing the notation, since sometimes, these tricks break down, and it’s important to be aware when that happens.