Need a good definition of derivatives

In another thread I mentioned that I would be teaching myself calculus. Well, after a hiatus, I’ve restarted my studies. The problem I’ve had so far is that, although I know how to do derivatives, I’m not really sure what they are. The book I’m reading said they’re “the slope of the line tangent to a function” (or something along those lines). But don’t many functions (if not all) have more than one line tangent to them? I need a better explanation, in words, of what a derivative is.

Also, the notation isn’t really explained in the book. When a derivative is denoted by “dx/dy,” what does that signify as opposed to “du/dx” or “dy/dx”? What does the position of the variables mean in that notation?

Pick an x coordinate, call it a. We want to find the derivative of f at a (the slope of the tangent line of f at a). Let’s pick another x coordinate near a, and call it a+h. We can find the slope (rise/run) of the line connecting these two points on the graph of f:

$$\frac{f(a+h) - f(a)}{(a+h) - a} = \frac{f(a+h) - f(a)}{h}$$

Now take the limit of this as h approaches zero (i.e., as the second point on the graph approaches the original point, x=a). The slope of the corresponding lines will approach the slope of the tangent line (so long as the function is differentiable at a in the first place).
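The limiting process described above can be watched numerically. This is only a rough sketch: the function `square` (f(x) = x^2) and the point a = 3 are picked purely for illustration and are not from the post. For this choice the slope of the connecting line works out to 6 + h, so it should settle toward 6 as h shrinks.

```python
def difference_quotient(f, a, h):
    """Slope of the line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x):
    # Hypothetical example function f(x) = x^2 (an assumption for illustration).
    return x * x


if __name__ == "__main__":
    a = 3.0
    # Move the second point closer and closer to a.
    for h in (1.0, 0.1, 0.01, 0.001):
        slope = difference_quotient(square, a, h)
        print(f"h = {h:<6} slope of connecting line = {slope:.6f}")
```

Nothing here is special to x^2; any function that is differentiable at a should show the same settling behaviour.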

The notation you mention, dy/dx, indicates that this derivative expresses the instantaneous rate of change of y with respect to x (in other words, rise over run, like you’re used to seeing with slope). Similarly, dz/dt would be the instantaneous rate of change of z with respect to t, and so forth.

Cabbage, that’s pretty much how the book explained it (except in much greater detail). I guess I need something a little more verbal as to what you get by taking a derivative. The “rise over run” part was good though.

Well as usual Cabbage has beaten me to the punch, so I’ll just point to some pretty pictures at this website instead.

You note that most functions have more than one tangent line. The point is that (nice) functions have one tangent line at each point on them. Derivatives give a simple formula for the gradient of each of these tangents.
For example, you will be aware that the derivative of x^2 is 2x. This means that to find the gradient of the tangent at any point on this graph you simply double the x-coordinate. For example, the gradient of the tangent at the point (2,4) is 4; the gradient of the tangent at (3,9) is 6; and the gradient of the tangent at (12,144) is 24. If you examine the behaviour of the function 2x you will see that for large negative x the gradient is large and negative (so the tangent is a steep downward-pointing line); as x increases towards 0 the gradient approaches 0 (so the tangents become less steep); when x=0, the gradient is 0 (so the tangent is horizontal); then as x increases further the gradient increases (so the tangents become increasingly steep). If you now sketch the curve y = x^2 and draw in some of its tangents you will see that the above reflects our intuitive view of the behaviour of the tangents.

The notation dy/dx means simply that x is the independent variable and y is the dependent variable, so we are considering y as being defined in terms of x. So for example if y = x^2 then dy/dx = 2x. If y = u^2, then dy/du = 2u. If v = u^2 then dv/du = 2u.

The derivative dy/dx can also be interpreted as the rate of change of y with respect to x. For instance if x represents the displacement of a particle and if t represents time then dx/dt represents the rate of change of displacement with respect to time, i.e. the velocity. Similarly, if v represents the velocity then dv/dt represents the rate of change of velocity, i.e. the acceleration. If P represents the price of an object, then dP/dt represents the rate of change of price, i.e. inflation. It is this interpretation that leads to the wide applicability of calculus in other scientific disciplines.

A derivative is merely the equation for the slope of another equation, normally denoted by a ’ (pronounced “prime”). For example, F’ gives the slope of F(x) at every x (for a line such as F(x) = x, that is 1 for the whole graph). Think of a car driving along your graph: the direction the headlights point is the slope at that point.

Yes, they do. Each point on the function (usually) has a line tangent to it. Each of those tangent lines has a slope. Make a table of those slopes for different values of x. Make a graph from that table. You’ve just made a graph of the derivative of the function.

Of course, you usually don’t actually make a table of slopes. You can calculate a formula that will give you the slope for any value of x. That’s what you’re doing when you take the derivative of the function.
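The "table of slopes" idea can be tried out directly. Below is a minimal sketch, assuming the example function y = x^2 and approximating each tangent slope with a small symmetric difference (my choice of method, not something the post specifies). Each slope in the table should land very close to 2x, which is the formula you get by taking the derivative.

```python
def slope_at(f, x, h=1e-6):
    """Approximate the tangent slope at x with a symmetric difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def slope_table(f, xs):
    """Build the table of (x, tangent slope) pairs described in the text."""
    return [(x, slope_at(f, x)) for x in xs]


if __name__ == "__main__":
    # Assumed example function: y = x^2, whose derivative is 2x.
    for x, m in slope_table(lambda x: x ** 2, [-3, -2, -1, 0, 1, 2, 3]):
        print(f"x = {x:>2}   slope = {m:8.4f}   2x = {2 * x}")
```

Plotting the second column against the first would give the graph of the derivative.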

Gee, when I re-read that it sounds kind of patronizing or offensive. It’s worded in a way that might insult someone’s intelligence. That’s not what I was trying to do. I was just trying to see if I could explain a derivative in the simplest possible way.

The derivative of a function with respect to a variable is the sensitivity of that function’s value to the variable. For example, if a function “f” is defined by f(x)=x^2, then the derivative of that function is f’(x)=2x, meaning that if x changes by a little bit, the function changes by about 2x times that little bit.
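Here is a small check of the "sensitivity" reading, using the same f(x) = x^2. The helper names are made up for this sketch. For this particular function the actual change and the predicted change differ by exactly dx^2, which is why the prediction gets better as the nudge gets smaller.

```python
def f(x):
    return x ** 2


def f_prime(x):
    # Derivative of x^2, as stated in the text.
    return 2 * x


def actual_change(x, dx):
    """How much f really changes when x is nudged by dx."""
    return f(x + dx) - f(x)


def predicted_change(x, dx):
    """The derivative's prediction: sensitivity times the nudge."""
    return f_prime(x) * dx


if __name__ == "__main__":
    x = 5.0
    for dx in (0.1, 0.01, 0.001):
        print(dx, actual_change(x, dx), predicted_change(x, dx))
```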

The derivative of a function is always another function. If you want to label the derivative of the function f, calling it f’ is a nice and unambiguous way of doing it. The “dy/dx” notation makes it look like a fraction, which strictly speaking it isn’t. I think it’s unfortunate the dy/dx notation caught on. It came from Newton.

To put it in a sentence, a derivative is the rate a function is changing at.

–John

Good explanation! If you have a particle moving along a line, and y represents the position when the time is x, dy/dx is the velocity as a function of x.

dx and dy are holdovers. Originally the d meant an infinitesimal increment. You don’t usually see dx/dy, unless you are talking about the inverse function: x = f^(-1)(y).

Well, first of all, the dy/dx notation was due to Leibniz, not Newton. Newton’s notation was god-awful and is used only in some basic physics texts. The dy/dx notation is actually quite nice, especially when you get into higher derivatives. It’s also an easy way to remember the chain rule. Only problem is to remind students that no division is taking place and that it’s only notation to represent the derivative of y=f(x) with respect to x. As was pointed out already, another common notation is y’ = f’(x) to represent the derivative of y with respect to x.

Yes, the derivative of a function y=f(x) at a point x=a is the instantaneous (not the average!) rate of change of y at the point (a, f(a)). Geometrically, it’s the slope of the tangent line to the graph of y=f(x) at the point (a,f(a)). This is also why a function does not always have a derivative at a point. When the slope cannot be determined without ambiguity (many reasons for this), then the derivative does not exist. In econ, the derivative at x=a is usually called the marginal value of y when x=a.

The example I usually use to explain the difference between instantaneous vs. average rate of change is as follows: Suppose you’re driving from city A to city B, sometimes faster, sometimes slower. Suppose you have a function y=f(x) that describes exactly how far you’ve driven after x hours. So y is distance travelled (say, in miles) after you’ve been driving for x hours. What’s the derivative here? Well, it depends on x. After x=a hours, you’ve travelled f(a) miles. The derivative f’(a) is the speedometer reading (in mph) at exactly x=a hours. That’s the instantaneous rate of change of distance with respect to time. That’s the derivative when x=a. The average rate of change is just f(a)/a, which is not f’(a). There is a way to use the average rate of change to “creep” up to the instantaneous rate of change – that’s the concept of the limit and the use of the difference quotient. And that’s the crux of the formal definition of the derivative. Hope this helps.
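The driving example can be put into numbers. The distance function below (20x^2 miles after x hours) is invented purely for illustration, and the "speedometer" is approximated with a small symmetric difference. Like the f(a)/a formula in the post, the average assumes the trip starts at distance 0 at time 0.

```python
def distance(x):
    # Hypothetical trip: miles driven after x hours (made up for this sketch).
    return 20.0 * x ** 2


def average_speed(f, a):
    """Average rate of change from the start of the trip to time a.

    Assumes f(0) == 0, as the f(a)/a formula in the text does.
    """
    return f(a) / a


def speedometer(f, a, h=1e-6):
    """Approximate instantaneous rate of change at time a."""
    return (f(a + h) - f(a - h)) / (2 * h)


if __name__ == "__main__":
    a = 2.0
    print("average speed so far:", average_speed(distance, a))
    print("speedometer reading :", speedometer(distance, a))
```

With this made-up trip, the average over the first two hours is 40 mph while the speedometer at the two-hour mark reads about 80 mph, so the two really are different quantities.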

Qwertyasdfg, have fun studying calculus. It can be challenging to do it via self-study, but it can be done!

Even though dy/dx is not a division, it is still a useful notation, since if dy/dx = f(x) then dx/dy = 1/f(x), and also dy/dx = f(x) => dy = f(x) dx when you get on to differential equations.

In almost all cases*, thinking of dy/dx as a fraction will work just fine: it’s a small piece of y divided by the corresponding small piece of x. Similarly, you can generally get away with things like cross-multiplication: if you have the equation dy/dx = 3x^2, then you can cross-multiply to get dy = 3x^2 dx. Again, this is not strictly correct, but it usually works.

Where you need to be careful is when you’re taking partial derivatives, where these rules no longer apply: for instance, if ∂y/∂x = 0, then ∂x/∂y = 0 as well. I realize that you haven’t seen partial derivatives yet (they don’t show up until you get to multivariable calculus), but keep that in mind for future reference.
*By this, I mean “almost all functions that you’ll ever actually see”, not “almost all functions that exist”. Only an infinitely small fraction of functions are what a physicist would call “nice” functions, but humans seldom do much with the non-nice ones.
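For future reference, here is one standard illustration of why partial derivatives cannot be flipped like fractions. It uses polar coordinates, which is my choice of example and not something from the post above. The subscript shows which variable is held fixed.

```latex
x = r\cos\theta, \qquad y = r\sin\theta, \qquad r = \sqrt{x^{2} + y^{2}}

\left(\frac{\partial x}{\partial r}\right)_{\theta} = \cos\theta
\qquad\text{but}\qquad
\left(\frac{\partial r}{\partial x}\right)_{y} = \frac{x}{\sqrt{x^{2} + y^{2}}} = \cos\theta

\left(\frac{\partial x}{\partial r}\right)_{\theta}
\left(\frac{\partial r}{\partial x}\right)_{y} = \cos^{2}\theta \neq 1 \ \text{in general}
```

The two partials are equal here instead of being reciprocals, because different variables are being held fixed in each one.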

One application of derivatives that I find useful in understanding what derivatives are is the relationship of position, velocity and acceleration. If you have an equation for the position of an object in relation to time, the derivative of that equation will give you its velocity in relation to time. The derivative of the velocity equation will give you the object’s acceleration in relation to time.
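A rough numerical sketch of that chain, assuming a made-up position function p(t) = 5t^2 + 3t (so the exact velocity is 10t + 3 and the exact acceleration is 10). The `derivative` helper is hypothetical and simply applies a symmetric difference; applying it twice gives an approximate acceleration.

```python
def derivative(f, h=1e-4):
    """Return a function that approximates f' with a symmetric difference."""
    def df(t):
        return (f(t + h) - f(t - h)) / (2 * h)
    return df


def position(t):
    # Hypothetical position function, chosen only for illustration.
    return 5.0 * t ** 2 + 3.0 * t


# Differentiate once for velocity, twice for acceleration.
velocity = derivative(position)
acceleration = derivative(velocity)

if __name__ == "__main__":
    t = 2.0
    print("position    :", position(t))
    print("velocity    :", velocity(t))      # close to 10*t + 3
    print("acceleration:", acceleration(t))  # close to 10
```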

Integration by parts and the chain rule only ever serve to reinforce the idea that dx and all d* bits are values in themselves. I’ve never understood the problem with this interpretation of events since the equations are often written that way (as in integrals).

But, whatever. There isn’t much more to say on derivatives that hasn’t been said, but I’m going to explain it the way it was always explained to me.

We have a ball whose position at time t is given by the function f(t). We want to know its speed at any instant.

Now, the normal velocity formula we would use would be (change in distance)/(change in time). In terms of our problem, this would be [ f(t) - f(t_1) ] / [ t - t_1 ]. Now, in the above equation, t_1 is some real value, like, say, 4. But this equation only gives us the average velocity between t and t_1. We want the actual velocity at t_1.

Well, we could approximate it, couldn’t we? If t_1 = 4, we could make t = 4.1, then get the average velocity, then make t = 4.01, then get the average velocity, and so on.

But hey, this is just like the idea of limits that we studied already! In fact, aren’t we asking that t -> t_1? And then won’t we have the exact velocity there? Written mathematically:

$$f'(t_1) = \lim_{t \to t_1} \frac{f(t) - f(t_1)}{t - t_1}$$

It is most intuitive for us to understand derivatives in terms of velocity, where the function is the position function, but of course in general the derivative is the rate of change of the y variable with respect to the x variable (in the Cartesian coordinate system), whatever they may be. It simply takes the average rate of change over shorter and shorter intervals and finds the limit of those averages.
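To see the 4.1, 4.01, ... idea with actual numbers, suppose (purely as an assumed example) that the ball's position is f(t) = t^2 and we want the velocity at t_1 = 4:

```latex
\frac{f(t) - f(4)}{t - 4} = \frac{t^{2} - 16}{t - 4} = \frac{(t - 4)(t + 4)}{t - 4} = t + 4
\qquad (t \neq 4)

t = 4.1 \;\Rightarrow\; 8.1, \qquad
t = 4.01 \;\Rightarrow\; 8.01, \qquad
t = 4.001 \;\Rightarrow\; 8.001

f'(4) = \lim_{t \to 4} \,(t + 4) = 8
```

The average velocities creep toward 8, and the limit is what we call the velocity at exactly t = 4.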

Ok, I’m clear on the answer to my first question.

But, I’m still not clear on notation (the book I’m reading is unclear about this, and it’s getting confusing). Can I express dx/dy in the f’ form? If so, is dx/dy = f’(x) or f’(y)? Can the chain rule be written in f’(x) form?

Thanks to everyone for the helpful responses, keep em comin’.

This page may help in clarifying the differences.

As has been said, dy and dx are due to Leibniz, and are called differentials (hence differential equations). The way to express them in combination with f’ is like this:

dy = f’(x)dx

You can see that treating them as a fraction leads to:

dy/dx = f’(x)

One other place this notation is more useful is when doing implicit differentiation. This is when you have something specified as:

xy^2 - 4y + (x-1)^2 = 0

and want to get dy/dx. (Essentially you end up with dy/dx’s in your equation where you have a y, which is a bit easier than putting in f’(x)’s for your y’s.)
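For what it's worth, here is how that example works out if you differentiate both sides with respect to x, treating y as a function of x (product rule on the first term, chain rule wherever y appears):

```latex
\frac{d}{dx}\left[\, x y^{2} - 4y + (x - 1)^{2} \,\right] = 0

y^{2} + 2xy\,\frac{dy}{dx} - 4\,\frac{dy}{dx} + 2(x - 1) = 0

(2xy - 4)\,\frac{dy}{dx} = -\left( y^{2} + 2(x - 1) \right)

\frac{dy}{dx} = -\,\frac{y^{2} + 2(x - 1)}{2xy - 4}
\qquad (2xy \neq 4)
```

Notice the answer contains both x and y, which is typical of implicit differentiation.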

Chain rule: D_x f(g(x)) = f’(g(x)) * g’(x)
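If you want to convince yourself the chain rule formula holds, a quick numerical spot-check is one way. This sketch assumes f = sin and g(x) = x^2 as example functions (my choice), and compares a symmetric-difference estimate of the slope of f(g(x)) with f’(g(x)) * g’(x).

```python
import math


def numeric_derivative(fn, x, h=1e-6):
    """Approximate fn'(x) with a symmetric difference."""
    return (fn(x + h) - fn(x - h)) / (2 * h)


# Assumed example functions: outer f = sin, inner g(x) = x^2.
def g(x):
    return x ** 2


def g_prime(x):
    return 2 * x


def composite(x):
    """f(g(x)) = sin(x^2)."""
    return math.sin(g(x))


def chain_rule(x):
    """f'(g(x)) * g'(x) = cos(x^2) * 2x."""
    return math.cos(g(x)) * g_prime(x)


if __name__ == "__main__":
    x = 1.3
    print("numeric estimate:", numeric_derivative(composite, x))
    print("chain rule      :", chain_rule(x))
```

A spot-check at one point is not a proof, of course, but the two numbers should agree to several decimal places.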

If you want to know what dx/dy is you need to solve for x in terms of y. For example:

y = f(x) = x^3 and
x = g(y) = cuberoot(y)
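Carrying that example one step further (a sketch, valid away from x = 0 where the cube root has a vertical tangent): in prime notation dx/dy is g’(y), the derivative of the inverse function, not f’ of anything.

```latex
x = g(y) = y^{1/3}
\quad\Longrightarrow\quad
\frac{dx}{dy} = g'(y) = \tfrac{1}{3}\, y^{-2/3}

y = x^{3}
\quad\Longrightarrow\quad
\frac{dx}{dy} = \frac{1}{3\,(x^{3})^{2/3}} = \frac{1}{3x^{2}} = \frac{1}{dy/dx}
\qquad (x \neq 0)
```

So for single-variable functions like this one, flipping dy/dx as if it were a fraction does give dx/dy.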

And “dx” and “dy” mean the x and y coordinates of the graph of the derivative?

No, you can think of dx as a small change in the x coordinate and dy as the corresponding small change in the y coordinate, with dy = f’(x)dx. As I’ve mentioned before, all these operations with dy and dx are really of an informal nature. Rigorously speaking, dy and dx are infinitesimal quantities. But dy = f’(x)dx works fine if dx is small and f’(x) exists. I would suggest that you first master the concept of the limit of a function before trying to understand exactly what a derivative is.