If dy/dx is not a fraction, then why am I allowed to multiply both sides of an equation by dx?

I can’t recall ever getting a satisfactory explanation of this, but it’s been more than a decade since I took calculus.

I have the same question… Are we talking about initial value problems, like y’(x) = sqrt(abs(y(x)) * y(x)+1, solve for y(x)?

In most cases, it’s the Change of Variables Theorem that allows it.

To put it another way, in math you aren’t “allowed” to do anything without justifying it first. There is a justification for saying that the equation a/b = c (b != 0) is equivalent to the equation a = bc. And there is also a justification for saying that the integral of F(y(x))(dy/dx) over x is the same as the integral of F(y) over y, but the two justifications are not the same.
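Here is a rough numerical sanity check of that second claim, as a sketch only. The particular F, y, and interval are arbitrary choices made up for illustration, and `midpoint_integral` is a hypothetical helper, not anything from a library.

```python
import math

def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Illustrative choices (assumptions, not from the thread):
def F(y):
    return math.cos(y)

def y(x):
    return x * x

def dy_dx(x):
    return 2 * x

a, b = 0.0, 1.5

# Integral of F(y(x)) * (dy/dx) over x, from a to b.
integral_over_x = midpoint_integral(lambda x: F(y(x)) * dy_dx(x), a, b)

# Integral of F(y) over y, from y(a) to y(b).
integral_over_y = midpoint_integral(F, y(a), y(b))

print(integral_over_x, integral_over_y)
```

Both numbers should agree closely with sin(2.25), the exact value for these choices. The agreement comes from the Change of Variables Theorem, not from any cancellation of symbols.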

It is the limit of a fraction (delta-y/delta-x), which helps to justify the ways it’s treated like a fraction.

The elementary calculus texts I’ve seen make dy equal to (dy/dx)dx by definition.

I was thinking more like basic differential equations, where you have something like dy/dx = x[sup]2[/sup] - 2, then multiply both sides by dx and solve by integration.

…Okay, now I’m confused… What are dy and dx again? Change in x and y axis over a time, right?

Thanks, I’ll have to read up on that and see how much I can actually make sense of.

This makes a lot of sense, actually. I remember it being pounded into my head that “dy/dx is an operator, not a fraction!” but thinking of it as the limit of Δy/Δx as Δx -> 0 does bring back some fuzzy memories that make sense.

Because you aren’t “multiplying both sides by dx”, even though the end result appears as if that works. That application of the chain rule coincidentally gives the same apparent algebraic relationship (provided that all of the prerequisites for applying the chain rule are met) but it becomes immediately obvious when dealing with partial derivatives that this is no longer the case. So, this is like noting that 2 + 2 = 4 and 2 × 2 = 4 and concluding that addition and multiplication operations are the same.
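One standard illustration of where naive cancellation breaks down with partial derivatives is the triple product rule. The constraint x + y + z = 0 below is just an illustrative choice, with each variable treated as a function of the other two and the subscript marking what is held fixed.

```latex
% Constraint (an illustrative choice): x + y + z = 0
\[
\left(\frac{\partial x}{\partial y}\right)_{z} = -1, \qquad
\left(\frac{\partial y}{\partial z}\right)_{x} = -1, \qquad
\left(\frac{\partial z}{\partial x}\right)_{y} = -1
\]
\[
\left(\frac{\partial x}{\partial y}\right)_{z}
\left(\frac{\partial y}{\partial z}\right)_{x}
\left(\frac{\partial z}{\partial x}\right)_{y}
= (-1)(-1)(-1) = -1 \neq 1
\]
```

Cancelling the symbols as if they were ordinary fractions would predict +1.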

The derivative (or in the case of integration, antiderivative) is always with respect to one or more variables; stating a problem as “dx = {non-differential function}” is meaningless unless you are applying the “nonstandard analysis” methodology, in which infinitesimals are represented by hyperreal numbers rather than serving as placeholders for taking the limit of a ratio to get a real-valued function or a specific value at a point.

Practically speaking, engineers and other technical non-mathematicians treat it in the algebraic fashion for continuous single variable differentials and apply the chain rule explicitly for more complex derivatives, and don’t worry about the philosophical underpinnings, just like chemists use the periodic table without having to know how quantum mechanics specifies orbitals at discrete energy levels dependent upon the composition of the nucleus. So…don’t worry about it too much if you just want to use it, and if you do really want to understand it, look on Khan Academy or study proofs of fundamental calculus.


If you want to think of it as a fraction, you can… The pedantry which insists it isn’t is the misguided pedantry of “No, you can’t formalize it that way. You have to formalize it this way. (Even though both ways can be interpreted coherently, and lead ultimately to the same results…)”.

You can think of them as dx = an infinitesimally small change in x, and dy = the corresponding change in y (assuming y is a function of x). This is the way they thought of it in the early, nonrigorous days of Calculus, although Nonstandard Analysis is a relatively recent attempt at making this approach rigorous.

Or you can follow modern elementary Calculus textbooks that say that dy is, by definition, (dy/dx)*dx, and that this is an approximation of delta-y (the change in y) when y is a differentiable function of x and dx = a small delta-x.
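A tiny sketch of that textbook definition in action, assuming y = x² at x = 3 with dx = 0.01 (all of these are example values picked for illustration):

```python
def y(x):
    return x ** 2

def dy_dx(x):
    return 2 * x

x, dx = 3.0, 0.01

delta_y = y(x + dx) - y(x)   # the actual change in y, about 0.0601
dy = dy_dx(x) * dx           # the differential (dy/dx)*dx, about 0.06

print(delta_y, dy, delta_y - dy)
```

For this particular function the gap between delta-y and dy is exactly dx², so it shrinks much faster than dx itself does.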

dy/dx is the rate of change of y with respect to x, which is found by taking the limit of ∆y/∆x as the increment ∆x goes to 0. See this page for a succinct explanation of how taking the limit as ∆x→0 gives the derivative of f(x) = y.
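To watch the limit happen numerically, here is a minimal sketch. It assumes f(x) = sin(x) at x = 1 purely as an example, and `difference_quotient` is a made-up helper name.

```python
import math

def difference_quotient(f, x, delta_x):
    """Return delta-y / delta-x for a step of size delta_x starting at x."""
    return (f(x + delta_x) - f(x)) / delta_x

x = 1.0
exact = math.cos(x)  # the known derivative of sin at x

# Shrink delta-x from 0.1 down to 0.00001 and record each quotient.
quotients = [difference_quotient(math.sin, x, 10.0 ** -k) for k in range(1, 6)]
errors = [abs(q - exact) for q in quotients]

for q, e in zip(quotients, errors):
    print(q, e)
```

Each quotient is an honest fraction; the derivative is the value they close in on as the step shrinks.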


Here, you don’t actually multiply by dx. You just integrate with respect to x:

[ol]
[li]dy/dx = x[sup]2[/sup] - 2[/li]
[li]∫ (dy/dx) dx = ∫ (x[sup]2[/sup] - 2) dx[/li]
[li]∫ 1 dy = ∫ (x[sup]2[/sup] - 2) dx[/li]
[li]y = (1/3)x[sup]3[/sup] - 2x + C[/li]
[/ol]

The change of variable theorem is applied between the second and third steps.

The “multiply both sides by dx” is a convenient shortcut to the same answer, but the CoV theorem is what you’re actually applying when you do it out formally (or some non-standard analysis, which you’re probably not doing in first year calc).
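If you want to convince yourself that the answer from the steps above is right, a quick numerical check is one option. This is only a sketch; `central_difference` is a hypothetical helper, and the sample points and constants are arbitrary.

```python
def y(x, C=0.0):
    """Candidate solution y = (1/3)x^3 - 2x + C."""
    return x ** 3 / 3 - 2 * x + C

def rhs(x):
    """Right-hand side of dy/dx = x^2 - 2."""
    return x ** 2 - 2

def central_difference(f, x, h=1e-5):
    """Symmetric numerical estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Compare the numerical slope of y with x^2 - 2 for a few C and x values.
max_error = 0.0
for C in (0.0, 4.0):
    for x in (-2.0, 0.0, 1.5):
        slope = central_difference(lambda t: y(t, C), x)
        max_error = max(max_error, abs(slope - rhs(x)))

print(max_error)
```

The constant C drops out of the slope, which is why every choice of C gives a valid solution.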

Thanks for bringing up infinitesimals. I remembered them but since I haven’t done anything with calculus in a long time, my desire to study and my understanding of hyperreal numbers is lost.

You’re not (really) doing anything formally in first year calc (nor should you be), so you’re no more using any particular formalization than any other, or, more to the point, you’re no more not using any particular formalization than any other. If you’d like to think of yourself as actually multiplying both sides by some quantity dx and then adding up, go ahead. If you’d rather not think that way, then you can think some other way instead. (If you don’t want to think too much about it at all, that’s available to you as well…)

I hate the notation for derivatives that looks like a fraction, like this. The whole point of notation is to convey meaning clearly and avoid confusion.

Notation like f’(x) or derivative(f(x),x) works just as well and doesn’t look like a fraction. Looking like a fraction is especially problematic when you’re actually encouraged to do some things that seem to require that it is really a fraction, or suggest that it is really a fraction, or at least cloud the issue.

After all, most of the people who use calculus are the students studying it.

In fact, to the inevitable comments that say it can be written this way and the result works, I would like to make an example of the FORTH programming language, some versions of which let you give variables and subroutines names like 3 or |ll|l|| (a string of lowercase Ls and pipe characters). Creating a subroutine that returns 5 and naming it 3 lets you add 3 and 4 and get 9, for example. It can be written this way and the result works. Now, the point here is to comment on a notation as a notation - why in the world is this a good notation?

Problematic how? What’s the problem?

The notation arose for a reason. It does convey meaning; much more meaning, to someone who hasn’t already learnt everything, than is conveyed by “f’(x)” or “derivative(f(x), x)”! It’s clear that dy/dx has something to do with a ratio of changes in y to changes in x; from this it’s immediately apparent what the units of the derivative are in terms of the units of x and y (thus, how the derivative rescales when x or y are rescaled, how it reacts to addition of constants, how it demands that y and x each be the sorts of quantities of which we can meaningfully take differences but not necessarily that y and x each be the sorts of quantities which we can meaningfully add [so it would be meaningful, for example, for y or x to be temperature in Fahrenheit, but not for them to be, say, strings of letters]). The “chain rule” is made clear, the relationship between derivatives of inverse functions is made clear, and so on.
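The rescaling behaviour can be checked directly. In this sketch the temperature curve is invented for illustration, and `central_difference` is a hypothetical helper; only the Fahrenheit/Celsius conversion is standard.

```python
def central_difference(f, x, h=1e-5):
    """Symmetric numerical estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Hypothetical temperature curve: degrees Fahrenheit as a function of hours.
def temp_f(hours):
    return 50.0 + 3.0 * hours + 0.5 * hours ** 2

# Same curve with y rescaled and shifted (Fahrenheit to Celsius).
def temp_c(hours):
    return (temp_f(hours) - 32.0) * 5.0 / 9.0

# Same curve with x rescaled (hours to minutes).
def temp_f_by_minute(minutes):
    return temp_f(minutes / 60.0)

t = 2.0
rate_f_per_hour = central_difference(temp_f, t)
rate_c_per_hour = central_difference(temp_c, t)
rate_f_per_minute = central_difference(temp_f_by_minute, t * 60.0)

print(rate_f_per_hour, rate_c_per_hour, rate_f_per_minute)
```

The derivative picks up the factor 5/9 when y is rescaled and 1/60 when x is rescaled, and the offset of 32 vanishes, exactly as the units "degrees per hour" would suggest.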

The only thing not immediately conveyed by the notation is over what interval the differences are taken. (Though sometimes a curse, this can be a boon, too, since we are not always solely interested in infinitesimal rates of change; we may think of dy/dx as assigning values not merely to single points, but to arbitrary intervals. And this is indeed what underlies the textbook calculation of dy/dx at even a single point: we calculate the value for intervals with distinct endpoints straightforwardly, and then extend by continuity to even intervals with both endpoints the same (or infinitesimally far apart, or however you’d like to think about it).)

It’s sometimes claimed that this notation is inappropriate in the context of the multivariable chain rule because, e.g., “If z(t) = z(x(t), y(t)), then dz/dt = dz/dx * dx/dt + dz/dy * dy/dt, which is very different from the dz/dx * dx/dt or dz/dy * dy/dt the notation would suggest! [For example, if x(t) = t, y(t) = ln(t), and z(x, y) = x[sup]3[/sup] + 5y, then dx/dt = 1, dy/dt = 1/t, dz/dx = 3x[sup]2[/sup], dz/dy = 5, and dz/dt = 3x[sup]2[/sup] + 5/t, not merely 3x[sup]2[/sup] or merely 5/t]”.

But the problem here is not in treating all these ratio-type things as ratios. That part is totally fine!

Rather, the problem is that “d” is overloaded in such phrasing to stand for multiple different difference operators, which our partial derivative notation unfortunately (and, I would agree, awfully) obfuscates: in the phrasing of the above quote, “dz/dt”, we are imagining differences as t varies (and x and y vary along with it, and z along with that), whereas in “dz/dx” we are imagining differences as x varies but y is held constant, and in “dz/dy” we are imagining differences as y varies but x is held constant. In the first of these, x and y are yoked together by their dependence on t, while in the latter two, x and y are taken to independently vary (and thus there is no such thing as the quantity t).

If we explicitly labeled our difference operators to indicate their “partiality”, there would be no problem. The various relevant difference operators for this problem would be related through the linearity principle that total change in z is the sum of the change from varying x alone and the change from varying y alone (this is the characteristic of “total” differentiability), from which the multivariable chain rule follows straightforwardly in precisely the way the fractional notation would accurately indicate.
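Using the example from the quoted objection (x = t, y = ln t, z = x³ + 5y), here is a sketch that keeps the three kinds of difference separate by name. The variable names and the `central_difference` helper are made up for illustration, and t = 2 is an arbitrary sample point.

```python
import math

def x_of(t):
    return t

def y_of(t):
    return math.log(t)

def z_of(x, y):
    return x ** 3 + 5 * y

def central_difference(f, a, h=1e-5):
    """Symmetric numerical estimate of the slope of f at a."""
    return (f(a + h) - f(a - h)) / (2 * h)

t = 2.0
x, y = x_of(t), y_of(t)

# Total rate: t varies, dragging x and y along with it.
dz_dt = central_difference(lambda s: z_of(x_of(s), y_of(s)), t)

# Partial rates: one of x, y varies while the other is held constant.
dz_dx_y_fixed = central_difference(lambda u: z_of(u, y), x)
dz_dy_x_fixed = central_difference(lambda v: z_of(x, v), y)

dx_dt = central_difference(x_of, t)
dy_dt = central_difference(y_of, t)

# Change from varying x alone plus change from varying y alone.
chain_rule_sum = dz_dx_y_fixed * dx_dt + dz_dy_x_fixed * dy_dt

print(dz_dt, chain_rule_sum)
```

At t = 2 both values should land near 3(2²) + 5/2 = 14.5. Once each difference operator has its own label, every product in the sum is a ratio times a ratio with matching pieces.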

It’s one of the “nifty” features of Calculus, and in particular the Leibniz notation (dy/dx) that you can start with ∆y/∆x, then let ∆x→0, work through some algebra to see what it leads to as ∆x→0 (the “limiting process”) to find your derivative, and lo! and behold! you can get results that “appear” to let you treat dy/dx as a fraction in which the “numerator” and “denominator” can be treated as separate quantities.
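As a worked instance of that limiting process, take y = x³ (an example chosen here for illustration):

```latex
\[
\frac{\Delta y}{\Delta x}
= \frac{(x + \Delta x)^3 - x^3}{\Delta x}
= \frac{3x^2\,\Delta x + 3x\,(\Delta x)^2 + (\Delta x)^3}{\Delta x}
= 3x^2 + 3x\,\Delta x + (\Delta x)^2
\]
\[
\frac{dy}{dx} = \lim_{\Delta x \to 0}
\left( 3x^2 + 3x\,\Delta x + (\Delta x)^2 \right) = 3x^2
\]
```

Up to the last step everything is ordinary algebra on a genuine fraction; only the final line takes the limit.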

(Just don’t try to cancel the “d” in the numerator and denominator, reducing dy/dx to y/x.)

Don’t they teach ε - δ definition of limits, and ε - δ proofs in First Semester Calculus anymore? When I took Calc I (circa 1982, Larson & Hostetler 2e), the definition was taught and a few trivially simple ε - δ proofs were shown, and we had exercise problems to do some other trivially simple ε - δ proofs. It was thus implied that all of the more seriously useful Calculus theorems (like the Chain Rule or all of L’Hôpital’s Rules, etc.) could be formally proved using ε - δ methods, although we didn’t actually do that.
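For anyone who hasn’t seen one, a trivially simple ε - δ proof of the kind described looks roughly like this (the specific limit is an example chosen for illustration, not one from that textbook):

```latex
\textbf{Claim.} $\displaystyle \lim_{x \to 2} (3x + 1) = 7$.

\textbf{Proof.} Let $\varepsilon > 0$ be given and choose
$\delta = \varepsilon / 3$. If $0 < |x - 2| < \delta$, then
\[
|(3x + 1) - 7| = |3x - 6| = 3\,|x - 2| < 3\delta = \varepsilon .
\]
Hence for every $\varepsilon > 0$ there is a $\delta > 0$ that works,
which is exactly what the definition of the limit requires. $\qed$
```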

Funny, I always thought it was. Distance over time. During one physics test, I got zero for an item when I did simple division. The teacher said I should have expressed it using derivatives. I told him dividing was correct and that it WAS a derivative. He said I got the exact result purely by coincidence. I don’t claim to be a math or physics wizard but I realized I knew more math than at least one physics teacher.