9

Let $T$ be a linear transformation of an $n$-dimensional vector space $V$ over a field $k$. It's pretty easy to define the minimum polynomial of $T$ and make sure its degree is between $1$ and $n^2$, inclusive.

Observe $ I = \{ p(x) \in k[x] : p(T) =0\}$ is an ideal in $k[x]$. Indeed, $I$ is the kernel of the evaluation homomorphism $\mathrm{eval}_T: k[x] \to \mathrm{End}(V)$. Notice also that:

  • $\mathrm{eval}_T$ is a unital homomorphism, so $I$ is a proper ideal.
  • The $n^2 + 1$ transformations $I, T,T^2,T^3,\ldots, T^{n^2}$ must be linearly dependent, since $\mathrm{dim}(\mathrm{End}(V)) = n^2$, so there exist scalars $a_0,\ldots,a_{n^2}$, not all zero, such that $a_0I + a_1 T + \ldots + a_{n^2}T^{n^2} = 0$, whence the nonzero polynomial $p(x) = a_0 + a_1 x + \ldots + a_{n^2}x^{n^2}$ belongs to $I$.
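(As a side note, this dependence is easy to check numerically. The following is a quick sanity check, not part of the argument; the matrix and variable names are my own.)

```python
import numpy as np

# Sanity check (not a proof): the n^2 + 1 flattened powers of T must be
# linearly dependent, since End(V) has dimension n^2.
rng = np.random.default_rng(0)
n = 3
T = rng.standard_normal((n, n))

# Rows are vec(I), vec(T), ..., vec(T^{n^2}).
M = np.array([np.linalg.matrix_power(T, i).flatten() for i in range(n**2 + 1)])

# n^2 + 1 vectors in an n^2-dimensional space: the rank is at most n^2.
assert np.linalg.matrix_rank(M) <= n**2
```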

Since $k[x]$ is a p.i.d., we may define the minimum polynomial $m(x)$ of $T$ to be the monic generator of the ideal $I$. By the preceding two observations, we have $1 \leq \mathrm{deg}(m(x)) \leq n^2$.

Now, of course, we know that the degree of $m(x)$ actually satisfies $1 \leq \mathrm{deg}(m(x)) \leq n$. One way to see this is to use the Cayley-Hamilton theorem, which shows that the characteristic polynomial $c(x)=\det(xI - T)$, whose degree is $n$, annihilates $T$, whence $m(x)$ divides $c(x)$.
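(For a concrete illustration of the Cayley-Hamilton route, here is a small numerical sketch; the example matrix is my own choice.)

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
n = T.shape[0]

# np.poly returns the coefficients of det(xI - T), highest degree first;
# for this T the characteristic polynomial is x^2 - 5x + 6.
c = np.poly(T)

# Evaluate the characteristic polynomial at T itself.
cT = sum(coef * np.linalg.matrix_power(T, n - i) for i, coef in enumerate(c))

# Cayley-Hamilton: c(T) = 0, so m(x) divides c(x) and deg m(x) <= n.
assert np.allclose(cT, np.zeros((n, n)))
```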

Question: Is there another way to see that $T$ is annihilated by a polynomial of degree $\leq n$ which does not require use of the characteristic polynomial?

mechanodroid
  • 46,490
Mike F
  • 22,196
  • 1
    You could put it in rational canonical form (works over any field). Or, if you are working over the complex numbers, then you can read off the minimal polynomial if you put it in triangular form: if $a_1,\dots,a_n$ are the diagonal entries, then the minimal polynomial is simply $(x-a_1)\cdot\dots\cdot (x-a_n)$. – Amitesh Datta Apr 29 '15 at 21:55
  • @egreg I guess this depends on the specific statement of the Cayley-Hamilton theorem to which you are referring. I guess one formulation is that the characteristic polynomial of an operator $T$ annihilates $T$, and another (weaker) statement is that there is some polynomial of degree $\leq n$ (if $T$ is an operator on $n$-space) which annihilates $T$. The first statement does not follow from the second, so this isn't necessarily reproving the Cayley-Hamilton theorem. – Amitesh Datta Apr 29 '15 at 21:59
  • @AmiteshDatta: I like the idea putting the transformation in triangular form (if working over $\mathbb{C}$) a lot. I don't quite agree with the statement that you can get the minimal polynomial by reading off the diagonal entries of a triangular matrix, though. Did you perhaps mean to say "characteristic polynomial" in your first comment? – Mike F Apr 29 '15 at 22:02
  • Dear @MikeF, thank you for the correction! Yes, I meant to say "characteristic polynomial". (Although it is true that you can read off the minimal polynomial too, since it will be a product of some of the $(x-a_i)$'s.) – Amitesh Datta Apr 29 '15 at 22:05

4 Answers

7

Proposition 1.10 of these middle-brow linear algebra notes reproduces an inductive proof of M.D. Burrow.

Here "middle-brow" means: no modules, no tensor products, no change of ground field or use of an algebraically closed ground field until the aftermath of all the main results*; but it does use arithmetic in the univariate polynomial ring $F[t]$ and also quotient spaces.

*: For the majority of the notes, that a matrix has any eigenvalues at all is a pleasant special case!

user1551
  • 139,064
Pete L. Clark
  • 97,892
7

I like this approach for really explaining why the result is true. For each vector $v$, define the minimal polynomial of $T$ on $v$ to be the monic polynomial $p$ of least degree such that $p(T)v = 0$.

Lemma: If $P$ is the minimal polynomial of $T$ on the vector space $V$, there is some vector $v$ such that the minimal polynomial of $T$ on $v$ equals $P$.

Proof sketch: Decompose $V$ as a direct sum of kernels of powers of the irreducible factors of the minimal polynomial. The lemma follows easily for each summand; then add the resulting vectors to get a single vector $v$ that works for all of $V$.

Then since evaluation at $v$ is a linear map from the polynomial ring to the vector space, it follows that the kernel has codimension $\le \dim(V)$, hence the minimal polynomial of $T$ on $v$ has degree $\le \dim(V)$.
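(To make the final step concrete: the degree of the minimal polynomial of $T$ on $v$ is the first $k$ with $T^kv$ in the span of $v, Tv, \ldots, T^{k-1}v$, which is easy to compute exactly. The helper below is my own illustration, written with SymPy for exact arithmetic.)

```python
from sympy import Matrix

def local_min_poly_degree(T, v):
    """Degree of the minimal polynomial of T on v: the first k such that
    T^k v lies in the span of v, Tv, ..., T^{k-1} v."""
    n = T.shape[0]
    vecs = [v]
    for k in range(1, n + 1):
        vecs.append(T * vecs[-1])
        if Matrix.hstack(*vecs).rank() == k:  # T^k v added no new direction
            return k
    return n  # unreachable: n + 1 vectors in n-space are always dependent

# Rotation by 90 degrees: the minimal polynomial on v = (1, 0) is x^2 + 1,
# so the degree is 2 = dim V.
T = Matrix([[0, 1], [-1, 0]])
assert local_min_poly_degree(T, Matrix([1, 0])) == 2
```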

mechanodroid
  • 46,490
roy smith
  • 1,502
3

Take the tensor product of the vector space with the field of rational functions $k(x_1,x_2,\ldots,x_n)$ and consider the vector $v=(x_1,x_2,\ldots,x_n)$. The vectors $v,Tv,\ldots,T^nv$ are linearly dependent over the field of rational functions, and by clearing denominators we see that there are polynomials $p_0,p_1,\ldots,p_n$, at least one of which is nonzero, such that $$p_0v+p_1Tv+\cdots+p_nT^nv=0$$ Let $m$ be the highest power such that $p_mT^mv$ is nonzero. Let $a$ be some monomial with nonzero coefficient in $p_m$, and consider the Laurent polynomial coefficients of $$a^{-1}(p_mT^mv+\cdots+p_0v)$$ Taking the constant terms gives a nontrivial linear combination of $T^mv,T^{m-1}v,\cdots,v$ with coefficients $a_0,a_1,\ldots,a_m$ in $k$, and by homogeneity it must be equal to $0$. Specializing the variables can yield any vector in the original vector space. It follows that $a_mT^m+a_{m-1}T^{m-1}+\cdots+a_0I$ is identically zero as a linear transformation, proving the result.

Matt Samuel
  • 58,164
  • This answer is very interesting, but I am finding a few problems with it. Issue 1: It looked for a moment as though you were going to work coordinate free (taking the tensor product of $k(x_1,\ldots,x_n)$ with $V$), but then you spoke about the vector $v=(x_1,x_2,\ldots,x_n)$, so I guess you are implicitly identifying $V = k^n$ so that the tensor product under consideration is identified with $k(x_1,\ldots,x_n)^n$. – Mike F Apr 30 '15 at 00:29
  • Issue 2: I think there is a problem with the sentence "let $m$ be the highest power such that $p_m T^m v$ is nonzero", albeit one that is easily fixed. – Mike F Apr 30 '15 at 00:33
  • For instance, if $T=0$, then we can take $p_0 = 0$, $p_1 = \ldots = p_n =1$ and have $p_m T^m v = 0$ for every $m$. However, there is no harm in assuming $T^n \neq 0$ (so $T,T^2,\ldots,T^{n-1} \neq 0$ too) since otherwise $x^n \in k(x)$ annihilates $T$ and we are done. Your choice of $v$ is such that, for a linear transformation $S$ of $V$ (so the matrix coefficients of $S$ are in $k$) one has $Sv=0$ if and only if $S=0$. So under, the assumption that $T^n \neq 0$, we have $p_m T^m v = 0$ if and only if $p_m = 0$ so that the $m$ you refer to is really the largest $m$ such that $p_m \neq 0$. – Mike F Apr 30 '15 at 00:33
  • Issue 3: I was confused about the step where you mapped the Laurent polynomials (normalized to ensure some of them possessed nonzero constant terms) to their constant terms. I was concerned because, unlike for vanilla multivariate polynomials, taking the constant term is not given by evaluating at $x_1=\ldots=x_n=0$, and is not a homomorphism. However, now that I write down this complaint, I think maybe it does not matter since you really only seem to be using the fact that taking the constant term is linear. Perhaps it is OK that this operation it is not multiplicative. – Mike F Apr 30 '15 at 00:42
  • You can use the graded ring structure of Laurent polynomials. The vector has each component as one of the variables in the field of rational functions. – Matt Samuel Apr 30 '15 at 00:52
  • Or rather linear independence of the monomials. – Matt Samuel Apr 30 '15 at 01:04
  • Thanks, I'm OK with "Issue 3" now. Issues 1 and 2 are pretty minor, but it might be worth editing the answer a bit to make it 100% correct. Anyway I am accepting this answer, I think this is a very satisfying (and, in hindsight, natural) way to proceed. In the case $k = \mathbb{C}$, I think @AmiteshDatta's suggestion to put the transformation in upper triangular form is also very good, since the existence of this form is also quite easy to establish (assuming the fund. thm. of algebra), but this is very nice for the general case. Much nicer than using any kind of normal form, in my opinion. – Mike F Apr 30 '15 at 01:20
  • Thank you, extra characters. – Matt Samuel Apr 30 '15 at 01:32
2

This is an elaboration on roy smith's answer.

For any $x\in V$, we define the local minimal polynomial of $T$ at $x$ as the monic polynomial $m_x$ of least degree such that $m_x(T)x=0$. Just like the case for the usual minimal polynomial, it can be shown that $m_x$ divides every polynomial $g$ such that $g(T)x=0$.

Let $p_1^{r_1}\cdots p_m^{r_m}$ be a factorisation of the minimal polynomial of $T$ into powers of distinct irreducible factors. Let $q_i=\prod_{j\ne i}p_j^{r_j}$. Note that $V$ is equal to the sum $\ker p_1^{r_1}(T)+\cdots+\ker p_m^{r_m}(T)$, because $\alpha_1q_1+\cdots+\alpha_mq_m=1$ for some polynomials $\alpha_1,\ldots,\alpha_m$ and $\alpha_i(T)q_i(T)x\in\ker p_i^{r_i}(T)$ for every vector $x$. This sum is also a direct sum, for, if $p_i^{r_i}(T)x=0=p_j^{r_j}(T)x$, then $m_x$ divides both $p_i^{r_i}$ and $p_j^{r_j}$. Hence $m_x=1$ and $x=0$.

Knowing that $V$ is a direct sum of those kernels, it remains to show that $\dim\ker p_i^{r_i}(T)\ge\deg\left(p_i^{r_i}\right)$. Suppose $i=1$. Since $p_1^{r_1}\cdots p_m^{r_m}$ is the minimal polynomial of $T$, there exists some $x\in V$ such that $(p_1^{r_1-1}q_1)(T)x\ne0$. Let $y=q_1(T)x$. Then $p_1^{r_1-1}(T)y\ne0=p_1^{r_1}(T)y$. Hence $m_y=p_1^{r_1}$ and by the minimality of $m_y$, $y,\,Ty,\,T^2y,\,\ldots,\,T^{r_1\deg(p_1)-1}y$ are linearly independent members of $\ker p_1^{r_1}(T)$. The conclusion now follows.
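(To see the decomposition in a concrete case, here is a small SymPy sketch; the example matrix is my own, chosen so that the minimal polynomial is $(x-1)(x-2)^2$.)

```python
from sympy import Matrix, eye

# Minimal polynomial (x - 1)(x - 2)^2, which here equals the
# characteristic polynomial.
T = Matrix([[1, 0, 0],
            [0, 2, 1],
            [0, 0, 2]])

K1 = (T - eye(3)).nullspace()           # ker p_1^{r_1}(T) = ker (T - I)
K2 = ((T - 2 * eye(3))**2).nullspace()  # ker p_2^{r_2}(T) = ker (T - 2I)^2

# The kernel dimensions meet the bound dim ker p_i^{r_i}(T) >= deg p_i^{r_i}
# (1 >= 1 and 2 >= 2) and add up to dim V = 3, witnessing the direct sum.
assert len(K1) == 1 and len(K2) == 2
```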

user1551
  • 139,064