Lets say I have a two by two sqare matrix where all the elements are less than one. Lets say I start raising this matrix by greater and greater powers, so I’ve got [A][sup]2[/sup], [A][sup]10[/sup], etc. I note as the powers get greater the matrix converges on a limit so the there is a limit for [A][sup]n[/sup] as n goes to infinity.
Why is that? I know this is probably a stupid question, but you all have helped me with stupid questions in the past, so I’m hoping a mathematically inclined doper will take pity on me.
I don’t think that what you’ve said is actually true. It seems that if all the entries in the matrix A are less than 0.5, then the limit of [A][sup]n[/sup] would be the zero matrix. But if the entries are greater than 0.5, you could get one that increases forever.
Consider the case where the components of A are all between 0 and 0.5
Then, the components of A[sup]2[/sup] should be less than 0.50.5+0.50.5 = 0.5
In other words, we’re making the components of the matrix smaller by squaring it, and even smaller with each successive multiplication.
However, it’s not hard to see that this doesn’t hold for a matrix whose components are greater than 0.5.
Right, it’s not generally true, although it is true of certain matrices (including all stochasic matrices, those for which each column sums to one and all entries are non-negative). But for most matrices, A[sup]n+1[/sup] is approximately kA[sup]n[/sup] for large n. This can be seen from the fact that exponentiating a diagonalizable matrix is equivalent to exponentiating its eigvalues while leaving its eigenvectors unchanged (for a stochastic matrix, the eigenvalue with largest real part must be equal to one). But this won’t make any sense if you’re not familiar with eigenvalues and eigenvectors.
Right, I have to admit I mostly discovered this playing around with my old TI-83. I assumed that if the elements were less than one that you would get a zero matrix. But with a matrix whose elements were:
Row 1 [ 9/14 5/14]
Row 2 [5/8 3/8]
the matrix appeared to converge on
I note on preview that this appears to meet Josh de Plume’s criteria except its the rows that add to one, not the columns.
Is there any simple explanation for this? I appreciate your reply, Josh (and everyone else of course) but I must admit it went a bit over my head.
I’ll have to think about this. I note that 11 is equidistant from 14 and 8, 7 from 9 and 5, and 4 from 5 and 3. I’m not sure what this has to do with anything.
Well, I’m not sure that this will help, but I’ll give it a try. Many matrices A–in a certain sense, most matrices–can be written as VDV[sup]-1[/sup], where D is some diagonal matrix (all elements off the main diagonal are zero; the effect of multiplying a vector by such a matrix is to multiply each element by a certain constant). The diagonal elements of D are the eigenvalues of the matrix, and the columns of V are the corresponding eigenvectors (you didn’t say whether you’re familiar with eigenstuff). It’s easy to show that A[sup]n[/sup]=VD[sup]n[/sup]V[sup]-1[/sup] (try showing this for n=2). For large n, the element of D with the largest absolute value will come to dominate. As a special case, if 1 is an eigenvalue of the matrix and all the other eigenvalues have absolute value less than 1 (which will always be true for a stochastic matrix, or for the transpose of one), for large n the elements of D[sup]n[/sup] will be 1 where the corresponding element of D is 1, and will approach 0 otherwise. So A[sup]n[/sup] converges.
Does your calculator do eigenvalues and eigenvectors? You can see that the columns of your limit are a (right) eigenvector of your original matrix, and the rows are a left eigenvector, both associated with the eigenvalue 1.
Thanks. I think I can work my way through this, but it will take a while. I’ll have to review eigenstuff. I’ve got a couple of books I can look at. Your reply pointed me in the right direction, and it gave me an idea of the complexity of the problem. I appreciate it.
Glad to be of help. I should also point out that the bit about diagonalizing matrices is fairly profound. It means that the linear transformation associated with the matrix corresponds, with appropriate choice of basis set, to simply multiplying each element by a constant. The effect of applying the the tranformation to a vector–that is, multiplying the vector by the matrix–can be understood as follows: 1. Express the vector as a linear combination of the eigenvalues. 2. Multiply each component by the associated eigenvalue. It shouldn’t be surprising that repeated application of the transformation, which corresponds to multiplying by a power of the matrix, corresponds to repeatedly multiplying the components by the eigenvalues. It’s also easy to see that eventually, if one eigenvalue has absolute value larger than all the others, its component will come to dominate.
You can do lots of cool things with the VDV[sup]-1[/sup] form. Want a square root of the matrix? Just take the square roots of the eigenvalues (for the nonzero ones, you can take either of the two square roots, so there may be many matrix square roots). Other matrix functions can be defined like this. Of particular importance is the matrix exponential, exp(A), which correspondes to taking the exponentials of all the eigenvalues. This comes up in solving differential equations.
If you’re familiar with the convolution theorem, you can think of the Fourier transform as a change of basis into a basis where convolution is diagonal. Convolving with a fixed thing is a linear transformation, and the eigenvectors of all convolutions are exactly the imaginary exponentials that make up the components of the frequency domain.