Actually, I think for an engineer the following is a better motivating example.
Consider an ordinary differential equation. That is, you’ve got a “state” of a system in R[sup]n[/sup] and a collection of functions so that
dx[sub]k[/sub]/dt = f[sub]k[/sub](x). Now, in a small enough region, the most important part of each function is its linear part, the partial derivatives ∂f[sub]k[/sub]/∂x[sub]j[/sub]. These fit into a matrix of functions (the Jacobian), and evaluating at a given point you get a matrix of numbers. This matrix tells how, as time moves a little bit forward, points near the given point move around. The eigenvectors are the directions in which points move directly away from or towards the given point, and the eigenvalues tell you how fast (negative says “move towards”, positive says “move away”). So, if the eigenvalues are all negative, then a small sphere around the given point contracts in towards the point as time goes forward. If they’re all positive, the same sphere expands. If they’re mixed, then the sphere expands in some directions and contracts in others.
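To make that concrete, here’s a quick numerical sketch (Python/numpy; the particular 2-D system is just something I made up for illustration, not anything from the thread): estimate the matrix of partials at a point and look at the eigenvalues to see whether nearby points flow in towards it or away from it.

[code]
import numpy as np

# Made-up 2-D system dx/dt = f(x), chosen only for illustration.
def f(x):
    x1, x2 = x
    return np.array([x2, -x1 - 0.5 * x2])

def jacobian(f, x, eps=1e-6):
    """Numerically estimate the matrix of partials df_k/dx_j at the point x."""
    n = len(x)
    J = np.zeros((n, n))
    fx = f(x)
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (f(x + dx) - fx) / eps
    return J

x0 = np.array([0.0, 0.0])        # an equilibrium point of this particular f
J = jacobian(f, x0)
eigvals, eigvecs = np.linalg.eig(J)
print(eigvals)                   # real parts are all negative here, so nearby
                                 # points spiral in toward x0 as time goes forward
[/code]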
Okay, fair warning: this may hurt a bit if someone isn’t ready for it. I’ll also kneecap the definition a bit so I don’t have to go into the fullest generality I can. For those who know and want an exercise, the first generalization is to rings and free R-modules.
Let F be a field, and V a finite-dimensional vector space over F. Consider the exterior F-algebra \Lambda(V) of alternating forms on V, which is graded with top grade dim(V). Since dim(\Lambda[sup]dim(V)[/sup]) = 1, an element (an alternating dim(V)-multilinear form on V) is determined by its value on a single list of dim(V) linearly independent vectors in V.
Now, lists of dim(V) vectors in V form a vector space which is isomorphic to hom[sub]F[/sub](V,V). This is non-canonical, but if we take any two such isomorphisms and compose one with the inverse of the other, we get an inner automorphism of hom[sub]F[/sub](V,V) (as an algebra). Now an element of \Lambda[sup]dim(V)[/sup] is determined by its value on (the conjugacy class of) a single linear transformation, so we pick out the element that sends the identity I to 1. This is the determinant. Notice that I is fixed by all inner automorphisms, so in the end it turns out that we haven’t made any non-canonical choices at all.
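If it helps to see the abstraction land on something computable, here’s a small sketch (Python/numpy; the Leibniz-expansion code is my own illustration of the idea, not part of the argument above): summing over permutations gives exactly the alternating multilinear form on the columns that sends the identity to 1, and it agrees with the usual determinant.

[code]
import numpy as np
from itertools import permutations

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_as_alternating_form(A):
    """The alternating multilinear form on the columns of A, normalized so
    that the identity goes to 1 (i.e. the Leibniz expansion)."""
    n = A.shape[0]
    return sum(sign(p) * np.prod([A[i, p[i]] for i in range(n)])
               for p in permutations(range(n)))

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 4.0],
              [1.0, 0.0, 5.0]])
print(det_as_alternating_form(A))           # agrees with...
print(np.linalg.det(A))                     # ...the usual determinant
print(det_as_alternating_form(np.eye(3)))   # and the identity goes to 1
[/code]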
I thought of a more intuitive way to say the same thing as before.
Think of a very complicated linear transformation as a composition (matrix product) of reflections, projections, rotations, dilations and contractions.
Let’s look at the effect of a dilation by a factor of k on the unit sphere, ||(x,y,z)|| = 1. In this case we can see that every vector is an eigenvector, because for all x, T(x) = kx. But you can find at most three linearly independent eigenvectors. And of course, the canonical basis for R[sup]3[/sup] is a basis for the eigenspace (each vector in the canonical basis is an eigenvector with eigenvalue k, and they are linearly independent), so the matrix for this transformation is trivially diag(k,k,k).
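A quick numerical check of the dilation case (Python/numpy, with an arbitrary k of my own choosing):

[code]
import numpy as np

k = 2.5                      # arbitrary dilation factor, just for the example
T = k * np.eye(3)            # diag(k, k, k)
eigvals, eigvecs = np.linalg.eig(T)
print(eigvals)               # [k, k, k]: one eigenvalue of multiplicity three
print(eigvecs)               # the canonical basis; the eigenspace is all of R^3
[/code]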
Now, let’s look at a rotation of the unit sphere by an angle θ between 0 and 2π about an axis v. In this case we see that only two points (I’m identifying points with vectors here as usual) remain unchanged in direction (and length, although that is not important): the poles along v (with k=1). Every other vector on the sphere is sent somewhere that cannot be written as T(x) = kx (with k a real number), except when θ = π, where the vectors perpendicular to v are flipped and so are also eigenvectors, with eigenvalue −1. Leaving that case aside, we conclude that there is only one linearly independent eigenvector (the dimension of the eigenspace is one). In this case it’s not possible to diagonalize the matrix, but if we choose the eigenvector as part of a basis for the vector space we are working in, we can still simplify the matrix for this transformation: only one of its columns diagonalizes. For example, if we choose v as the first basis vector, the first column becomes (1,0,0)[sup]T[/sup].
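And the rotation case, checked numerically (Python/numpy; the axis and angle here are arbitrary choices of mine): the only real eigenvector is the axis direction, and the other two eigenvalues show up as the complex pair e[sup]iθ[/sup], e[sup]-iθ[/sup].

[code]
import numpy as np

theta = 0.7                                          # some angle, not 0 or pi
R = np.array([[1, 0,              0             ],   # rotation about v = (1, 0, 0),
              [0, np.cos(theta), -np.sin(theta) ],   # written in an orthonormal basis
              [0, np.sin(theta),  np.cos(theta) ]])  # whose first vector is the axis
eigvals, eigvecs = np.linalg.eig(R)
print(eigvals)                      # 1 (the axis) plus the pair exp(i*theta), exp(-i*theta)
is_real = np.isclose(eigvals.imag, 0)
print(eigvecs[:, is_real].real)     # the only real eigenvector: the axis direction v
[/code]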
We could do the same study for the other kinds of linear transformations, and I think it would be easy to prove that the dimension of the eigenspace of the composition T[sub]1[/sub] o T[sub]2[/sub] o T[sub]3[/sub] o … o T[sub]n[/sub] is equal to the smallest of the dimensions of the eigenspaces of the individual transformations, but I haven’t worked out the details and I might be wrong.
Thanks, everybody. I wish there were math textbooks out there that did a bit more of this in between theorems, because it’s stuff like the last 10 posts or so that reminds me why I used to enjoy learning math. It’s much easier to learn things when you can look at them as a cool new way to solve a problem you couldn’t solve before. From my experience tutoring (high school level, mostly), this approach can often make the material “click” for a kid despairing of ever being “good at math.” My calculus teacher in high school always claimed limits were the toughest thing we’d have to learn in his course, and the way he taught it, for most people it was. I was a bit ahead of the class at that point, and I helped a lot of my friends get their heads around limits just by starting with Zeno’s Paradox. It’s a neat problem to think about, and perfect for introducing the need for such a concept.
Sometimes when I listen to myself talk, I start to wonder whether I’d rather be a teacher than an electrical engineer.
True, linear differential equations are probably the most common engineering appearances of matrices where the eigenvectors are useful. Your point about the signs (of the real parts) of the eigenvalues is especially relevant to stability issues in dynamics and control theory. One other major appearance of matrix eigenvalues I can think of right now is for the covariance matrix for Gaussian distributions. Here the eigenvectors give the principal-axis directions for the covariance ellipsoid of the multidimensional random variable; the univariate random variables oriented along these axes are uncorrelated.
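For anyone who wants to see that last point numerically, here’s a small sketch (Python/numpy, with a made-up 2×2 covariance matrix): express the samples in the eigenvector coordinates and the sample covariance becomes (approximately) diagonal, i.e. the components along the principal axes are uncorrelated.

[code]
import numpy as np

rng = np.random.default_rng(0)

# Made-up covariance for a 2-D Gaussian with correlated components.
cov = np.array([[3.0, 1.2],
                [1.2, 1.0]])
samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=100_000)

# Eigenvectors of the covariance matrix = principal axes of the covariance ellipsoid.
eigvals, eigvecs = np.linalg.eigh(cov)    # eigh, since cov is symmetric
print(eigvals)                            # variances along the principal axes

# Re-express the samples in principal-axis coordinates; the components
# should now be (approximately) uncorrelated.
rotated = samples @ eigvecs
print(np.cov(rotated.T))                  # close to diag(eigvals)
[/code]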