7  Canonical Forms

\(\newcommand{\vlist}[2]{#1_1,#1_2,\ldots,#1_#2}\) \(\newcommand{\vectortwo}[2]{\begin{bmatrix} #1 \\ #2\end{bmatrix}}\) \(\newcommand{\vectorthree}[3]{\begin{bmatrix} #1 \\ #2 \\ #3\end{bmatrix}}\) \(\newcommand{\vectorfour}[4]{\begin{bmatrix} #1 \\ #2 \\ #3 \\ #4\end{bmatrix}}\) \(\newcommand{\vectorfive}[5]{\begin{bmatrix} #1 \\ #2 \\ #3 \\ #4 \\ #5 \end{bmatrix}}\) \(\newcommand{\lincomb}[3]{#1_1 \vec{#2}_1+#1_2 \vec{#2}_2+\cdots + #1_m \vec{#2}_#3}\) \(\newcommand{\norm}[1]{\left|\left |#1\right|\right |}\) \(\newcommand{\ip}[1]{\left \langle #1\right \rangle}\) \(\newcommand{\plim}[2]{\lim_{\footnotesize\begin{array}{c} \\[-10pt] #1 \\[0pt] #2 \end{array}}}\)

7.1 Invariant Subspaces

In this section we let \(V\) and \(W\) denote real or complex vector spaces.

Suppose \(T \in \mathcal{L}(V)\) and \(U\) is a subspace of \(V\). We say \(U\) is an invariant subspace of \(T\) if \(u\in U\) implies \(Tu\in U\).

Lemma 7.1 Suppose \(T \in \mathcal{L}(V)\) and \(U\) is a subspace of \(V\). Then all of the following hold:

  • \(U\) is invariant under \(T\) if and only if \(T|_U\) is an operator on \(U\),
  • \(\text{ker} T\) is invariant under \(T\), and
  • \(\text{im} T\) is invariant under \(T\).

Proof. The proof of each part follows.

  • By definition, \(U\) is invariant under \(T\) if and only if \(u\in U \implies Tu\in U\), which, by the definition of operator, is the same as \(T|_U\) being an operator on \(U\).
  • If \(u\in \text{ker} T\), then \(Tu=0\), and hence \(Tu\in \text{ker} T\).
  • If \(u\in \text{im} T\), then by the definition of range \(Tu\in \text{im} T\).

Theorem 7.1 Suppose \(T\in \mathcal{L}(V)\). Let \(\lambda_1,\ldots,\lambda_m\) denote the distinct eigenvalues of \(T\). Then the following are equivalent:

  • \(T\) has a diagonal matrix with respect to some basis of \(V\);
  • \(V\) has a basis consisting of eigenvectors of \(T\);
  • there exist one-dimensional \(T\)-invariant subspaces \(U_1, \ldots, U_n\) of \(V\) such that \(V=U_1 \oplus \cdots \oplus U_n\);
  • \(V=\text{null}(T-\lambda_1 I) \oplus \cdots \oplus \text{null}(T-\lambda_m I)\);
  • \(\text{dim} V = \text{dim} \text{null}(T-\lambda_1 I) + \cdots + \text{dim} \text{null}(T-\lambda_m I)\).
Proof. (i) \(\Longleftrightarrow\) (ii): Exercise.

(ii) \(\Longleftrightarrow\) (iii): Suppose (ii) holds; thus suppose \(V\) has a basis \((v_1,\ldots,v_n)\) consisting of eigenvectors of \(T\). For each \(j\), let \(U_j=\text{span}(v_j)\). Obviously each \(U_j\) is a one-dimensional subspace of \(V\) that is invariant under \(T\) (because each \(v_j\) is an eigenvector of \(T\)). Because \((v_1,\ldots,v_n)\) is a basis of \(V\), each vector in \(V\) can be written uniquely as a linear combination of \((v_1,\ldots,v_n)\). In other words, each vector in \(V\) can be written uniquely as a sum \(u_1+\cdots +u_n\), where each \(u_j\in U_j\). Thus \(V=U_1 \oplus \cdots \oplus U_n\). Hence (ii) implies (iii). Conversely, suppose now that (iii) holds; thus there are one-dimensional subspaces \(U_1,\ldots,U_n\) of \(V\), each invariant under \(T\), such that \(V=U_1 \oplus \cdots \oplus U_n\). For each \(j\), let \(v_j\) be a nonzero vector in \(U_j\). Then each \(v_j\) is an eigenvector of \(T\). Because each vector in \(V\) can be written uniquely as a sum \(u_1+\cdots + u_n\), where each \(u_j\in U_j\) (so each \(u_j\) is a scalar multiple of \(v_j\)), we see that \((v_1,\ldots,v_n)\) is a basis of \(V\). Thus (iii) implies (ii).

(ii) \(\Longrightarrow\) (iv): Suppose (ii) holds; thus suppose \(V\) has a basis consisting of eigenvectors of \(T\). Thus every vector in \(V\) is a linear combination of eigenvectors of \(T\). Hence \[ V=\text{null}(T-\lambda_1 I) + \cdots + \text{null}(T-\lambda_m I). \] To show that the sum above is direct, suppose that \(0=u_1+\cdots + u_m\), where each \(u_j\in \text{null}(T-\lambda_j I)\). Because nonzero eigenvectors corresponding to distinct eigenvalues are linearly independent, this implies that each \(u_j\) equals 0. This implies that the sum is a direct sum, completing the proof that (ii) implies (iv).

(iv) \(\Longrightarrow\) (v): Exercise.

(v) \(\Longrightarrow\) (ii): Suppose (v) holds; thus \(\text{dim} V=\text{dim} \text{null}(T-\lambda_1I)+\cdots + \text{dim} \text{null}(T-\lambda_m I)\). Choose a basis of each \(\text{null}(T-\lambda_j I)\); put all these bases together to form a list \((v_1,\ldots,v_n)\) of eigenvectors of \(T\), where \(n=\text{dim} V\). To show that this list is linearly independent, suppose \(a_1 v_1+\cdots + a_n v_n=0\), where \(a_1,\ldots,a_n\) are scalars. For each \(j=1, \ldots, m\), let \(u_j\) denote the sum of all the terms \(a_k v_k\) such that \(v_k\in \text{null}(T-\lambda_j I)\). Thus each \(u_j\) is an eigenvector of \(T\) with eigenvalue \(\lambda_j\), and \(u_1+\cdots + u_m=0\). Because nonzero eigenvectors corresponding to distinct eigenvalues are linearly independent, this implies that each \(u_j\) equals 0. Because each \(u_j\) is a sum of terms \(a_k v_k\), where the \(v_k\)'s were chosen to be a basis of \(\text{null}(T-\lambda_j I)\), this implies that all the \(a_k\)'s equal 0. Thus \((v_1,\ldots,v_n)\) is linearly independent and hence a basis of \(V\). Thus (v) implies (ii).

A vector \(v\) is called a generalized eigenvector of \(T\) corresponding to \(\lambda\), where \(\lambda\) is an eigenvalue of \(T\), if \((T-\lambda I)^j v=0\) for some positive integer \(j\).

Lemma 7.2 Suppose \(T\in \mathcal{L}(V)\) and \(m\) is a nonnegative integer such that \(\text{ker} T^m=\text{ker} T^{m+1}\). Then

  • \(\{0\}=\text{ker} T^0 \subseteq \text{ker} T^1 \subseteq \cdots \subseteq \text{ker} T^m = \text{ker} T^{m+1} = \text{ker} T^{m+2} = \cdots\)
  • \(\text{ker} T^{\text{dim} V}=\text{ker} T^{\text{dim} V+1} =\text{ker} T^{\text{dim} V+2} = \cdots\)
  • \(V=\text{im} T^0 \supseteq \text{im} T^1 \supseteq \cdots \supseteq \text{im} T^k \supseteq \text{im} T^{k+1} \supseteq \cdots\)
  • \(\text{im} T^{\text{dim} V}= \text{im} T^{\text{dim} V+1} = \text{im} T^{\text{dim} V+2} = \cdots\)

Lemma 7.3 Suppose \(T\in \mathcal{L}(V)\) and \(\lambda\) is an eigenvalue of \(T\). Then the set of generalized eigenvectors of \(T\) corresponding to \(\lambda\) equals \(\text{ker}(T-\lambda I)^{\text{dim} V}\).

Proof. The proof is left for the reader.

An operator is called nilpotent if some power of it equals 0.

Lemma 7.4 Suppose \(N\in \mathcal{L}(V)\) is nilpotent. Then \(N^{\text{dim} V}=0\).

Proof. The proof is left for the reader.
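As an added numerical sketch (not part of the text), Lemma 7.4 can be illustrated with numpy using an arbitrary strictly upper-triangular matrix, which is nilpotent:

```python
import numpy as np

# A strictly upper-triangular matrix on a 3-dimensional space is nilpotent.
N = np.array([[0, 1, 2],
              [0, 0, 3],
              [0, 0, 0]])

N2 = N @ N        # still nonzero: N is not yet annihilated
N3 = N2 @ N       # N^{dim V} = 0, as the lemma asserts
print(np.count_nonzero(N2), np.count_nonzero(N3))  # → 1 0
```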

Theorem 7.2 Let \(T\in \mathcal{L}(V)\) and \(\lambda\in\mathbb{F}\). Then for every basis of \(V\) with respect to which \(T\) has an upper-triangular matrix, \(\lambda\) appears on the diagonal of the matrix of \(T\) precisely \(\text{dim} \text{ker}(T-\lambda I)^{\text{dim} V}\) times.

Proof. The proof is left for the reader.

The multiplicity of an eigenvalue \(\lambda\) of \(T\) is defined to be the dimension of the subspace of generalized eigenvectors corresponding to \(\lambda\); that is, the multiplicity of \(\lambda\) is equal to \(\text{dim} \text{ker}(T-\lambda I)^{\text{dim} V}\).

Theorem 7.3 If \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\), then the sum of the multiplicities of all the eigenvalues of \(T\) equals \(\text{dim} V\).

Proof. The proof is left for the reader.

Let \(d_j\) denote the multiplicity of \(\lambda_j\) as an eigenvalue of \(T\). The polynomial \[ (z-\lambda_1)^{d_1} \cdots (z-\lambda_m)^{d_m} \] is called the characteristic polynomial of \(T\).

Theorem 7.4 Suppose that \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Let \(q\) denote the characteristic polynomial of \(T\). Then \(q(T)=0\).

Proof. The proof is left for the reader.
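Theorem 7.4 is the Cayley–Hamilton theorem. As an added sketch (my illustration, not the author's proof), it can be checked numerically for a particular matrix; here a \(2\times 2\) upper-triangular matrix with eigenvalues 2 and 3, whose characteristic polynomial is \(q(z)=(z-2)(z-3)=z^2-5z+6\):

```python
import numpy as np

# Illustrative matrix: eigenvalues 2 and 3 on the diagonal,
# so q(z) = (z-2)(z-3) = z^2 - 5z + 6.
A = np.array([[2., 1.],
              [0., 3.]])
I = np.eye(2)

q_of_A = A @ A - 5 * A + 6 * I   # q(A) should be the zero matrix
print(np.allclose(q_of_A, 0))    # → True
```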

Theorem 7.5 If \(T\in \mathcal{L}(V)\) and \(p\in \mathcal{P}(\mathbb{F})\), then \(\text{ker} p(T)\) is invariant under \(T\).

Proof. The proof is left for the reader.

Theorem 7.6 Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Let \(\lambda_1,\ldots,\lambda_m\) be the distinct eigenvalues of \(T\), and let \(U_1,\ldots,U_m\) be the corresponding subspaces of generalized eigenvectors. Then

  • \(V=U_1\oplus \cdots \oplus U_m\);
  • each \(U_j\) is invariant under \(T\);
  • each \(\left.(T-\lambda_j I) \right|_{U_j}\) is nilpotent.

Proof. The proof is left for the reader.

Theorem 7.7 Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Then there is a basis of \(V\) consisting of generalized eigenvectors of \(T\).

Proof. The proof is left for the reader.

Theorem 7.8 Suppose \(N\) is a nilpotent operator on \(V\). Then there is a basis of \(V\) with respect to which the matrix of \(N\) has the form \[ \begin{bmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0\end{bmatrix}; \] here all entries on and below the diagonal are 0’s.

Proof. The proof is left for the reader.

Theorem 7.9 Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Let \(\lambda_1,\ldots,\lambda_m\) be the distinct eigenvalues of \(T\). Then there is a basis of \(V\) with respect to which \(T\) has a block diagonal matrix of the form \[ \begin{bmatrix} A_1 & & 0\\ & \ddots & \\ 0 & & A_m\end{bmatrix}, \] where each \(A_j\) is an upper-triangular matrix of the form \[ \begin{bmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j\end{bmatrix}. \]

Proof. The proof is left for the reader.

Theorem 7.10 Suppose \(N\in\mathcal{L}(V)\) is nilpotent. Then \(I+N\) has a square root.

Proof. The proof is left for the reader.
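Theorem 7.10 can be made concrete: since \(N^{\text{dim} V}=0\), the binomial series for \(\sqrt{1+x}\) applied to \(I+N\) terminates after finitely many terms. A numerical sketch (an added illustration with numpy, using a \(3\times 3\) nilpotent block with \(N^3=0\)):

```python
import numpy as np

# N is nilpotent with N^3 = 0, so the binomial series terminates:
# sqrt(I + N) = I + (1/2) N - (1/8) N^2  (all higher powers of N vanish).
N = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])
I = np.eye(3)

R = I + N / 2 - (N @ N) / 8
print(np.allclose(R @ R, I + N))  # → True
```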

On real vector spaces there exist invertible operators that have no square roots. For example, the operator of multiplication by \(-1\) on \(\mathbb{R}\) has no square root because no real number has its square equal to \(-1\).

Theorem 7.11 Suppose \(V\) is a complex vector space. If \(T\in \mathcal{L}(V)\) is invertible, then \(T\) has a square root.

Proof. The proof is left for the reader.

The minimal polynomial of \(T\) is the monic polynomial \(p\in \mathcal{P}(\mathbb{F})\) of smallest degree such that \(p(T)=0\).

Theorem 7.12 Let \(T\in \mathcal{L}(V)\) and let \(q\in \mathcal{P}(\mathbb{F})\). Then \(q(T)=0\) if and only if the minimal polynomial of \(T\) divides \(q\).

Proof. The proof is left for the reader.

Theorem 7.13 Let \(T\in \mathcal{L}(V)\). Then the roots of the minimal polynomial of \(T\) are precisely the eigenvalues of \(T\).

Proof. The proof is left for the reader.

For every \(T\in \mathcal{L}(V)\), where \(V\) is a complex vector space, there is a basis of \(V\) with respect to which \(T\) has a nice upper-triangular matrix. We can do even better: there is a basis of \(V\) with respect to which the matrix of \(T\) contains zeros everywhere except possibly on the diagonal and the line directly above the diagonal.

Suppose \(N\in \mathcal{L}(V)\) is nilpotent. For each nonzero vector \(v\in V\), let \(m(v)\) denote the largest nonnegative integer such that \(N^{m(v)} v\neq 0\).

Theorem 7.14 If \(N\in \mathcal{L}(V)\) is nilpotent, then there exist vectors \(v_1,\ldots,v_k \in V\) such that

  • \(\left(v_1,N v_1,\ldots,N^{m(v_1)}v_1,\ldots,v_k, N v_k, \ldots, N^{m(v_k)}v_k\right)\) is a basis of \(V\);
  • \(\left(N^{m(v_1)}v_1,\ldots,N^{m(v_k)}v_k\right)\) is a basis of \(\text{ker} N\).

Proof. The proof is left for the reader.

Example 7.1 Suppose \(T\in \mathcal{L}(V)\). Prove that if \(U_1,\ldots,U_m\) are subspaces of \(V\) invariant under \(T\), then \(U_1 + \cdots +U_m\) is invariant under \(T\). Suppose \(v\in U_1+\cdots + U_m\). Then there exist \(u_1,\ldots,u_m\) such that \(v=u_1+\cdots +u_m\) with \(u_j\in U_j\). Then \(Tv=T u_1+\cdots + T u_m\). Since each \(U_j\) is invariant under \(T\), \(T u_j \in U_j\), so \(T v \in U_1+ \cdots + U_m\).

Example 7.2 Suppose \(T\in \mathcal{L}(V)\). Prove that the intersection of any collection of subspaces of \(V\) invariant under \(T\) is invariant under \(T\). Suppose we have subspaces \(\{U_j\}\) with each \(U_j\) invariant under \(T\). Let \(v\in \cap_j U_j\). Then \(v\in U_j\) for each \(j\), so \(Tv\in U_j\) for each \(j\); hence \(Tv\in \cap_j U_j\), and so \(\cap_j U_j\) is invariant under \(T\).

Example 7.3 Prove or give a counterexample: if \(U\) is a subspace of \(V\) that is invariant under every operator on \(V\), then \(U=\{0\}\) or \(U=V\). We will prove the contrapositive: if \(U\) is a subspace of \(V\) and \(U\neq \{0\}\) and \(U\neq V\), then there exists an operator \(T\) on \(V\) such that \(U\) is not invariant under \(T\). Let \((u_1,\ldots,u_m)\) be a basis for \(U\), which we extend to a basis \((u_1,\ldots,u_m, v_1,\ldots,v_n)\) of \(V\). The assumption \(U\neq \{0\}\) and \(U\neq V\) means that \(m\geq 1\) and \(n\geq 1\). Define a linear map \(T\) by \(Tu_1=v_1\) and for \(j>1\), \(T u_j=0\). Since \(v_1\not \in U\), the subspace \(U\) is not invariant under the operator \(T\).

Example 7.4 Suppose that \(S,T\in \mathcal{L}(V)\) are such that \(S T= T S\). Prove that \(\text{null }(T-\lambda I)\) is invariant under \(S\) for every \(\lambda \in F\). Suppose \(v\in \text{ker}(T-\lambda I)\). Then \(Tv = \lambda v\) and, using \(TS=ST\), \[ T(S v)=S(T v)=S(\lambda v)=\lambda (S v). \] Thus \(S v\in \text{ker}(T-\lambda I)\) and so \(\text{ker} (T-\lambda I)\) is invariant under \(S\).

Example 7.5 Define \(T\in \mathcal{L}(F^2)\) by \(T(w,z)=(z,w)\). Find all eigenvalues and eigenvectors of \(T\). Suppose \((w,z)\neq (0,0)\) and \(T(w,z)=(z,w)=\lambda(w,z)\). Then \(z=\lambda w\) and \(w=\lambda z\). Of course this leads to \(w=\lambda z=\lambda^2w\), \(z=\lambda w=\lambda^2 z\). Since \(w\neq 0\) or \(z\neq 0\), we see that \(\lambda^2=1\) so that \(\lambda =\pm 1\). A basis of eigenvectors is \((w_1,z_1)=(1,1)\), \((w_2,z_2)=(-1,1)\) and they have eigenvalues 1 and \(-1\) respectively.
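As an added check (not part of the original solution), the eigenvalues found above can be confirmed numerically with numpy:

```python
import numpy as np

# Matrix of T(w, z) = (z, w) with respect to the standard basis.
T = np.array([[0., 1.],
              [1., 0.]])

# Eigenvalues should be -1 and 1, as computed above.
vals = np.sort(np.linalg.eigvals(T).real)
print(vals)
```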

Example 7.6 Define \(T\in \mathcal{L}(F^3)\) by \(T(z_1,z_2,z_3)=(2z_2,0,5z_3)\). Find all eigenvalues and eigenvectors of \(T\). Suppose \((z_1,z_2,z_3)\neq (0,0,0)\) and \[ T(z_1,z_2,z_3)=(2z_2,0,5z_3)=\lambda (z_1,z_2,z_3). \] If \(\lambda=0\) then \(z_2=z_3=0\), and one checks that \(v_1=(1,0,0)\) is an eigenvector with eigenvalue 0. If \(\lambda\neq 0\), then \(\lambda z_2=0\) forces \(z_2=0\), so \(\lambda z_1=2z_2=0\) forces \(z_1=0\); then \(z_3\neq 0\) and \(\lambda z_3=5z_3\) give \(\lambda=5\). An eigenvector for \(\lambda=5\) is \(v_2=(0,0,1)\). These are the only eigenvalues and each eigenspace is one dimensional.

Example 7.7 Suppose \(n\) is a positive integer and \(T\in \mathcal{L}(\mathbb{F}^n)\) is defined by \[ T(x_1,\ldots,x_n)=(x_1+ \cdots + x_n,\ldots,x_1+\cdots +x_n). \] Find all eigenvalues and eigenvectors of \(T\). First, any vector of the form \(v=(\alpha,\ldots,\alpha)\), for \(\alpha\in \mathbb{F}\), is an eigenvector with eigenvalue \(n\). Any vector \(v=(x_1,\ldots,x_n)\) such that \(x_1+\cdots + x_n=0\) is an eigenvector with eigenvalue 0. A basis of eigenvectors is \(v_1=(1,1,\ldots,1)\) together with \(v_k=E_1-E_k\) for \(k=2,\ldots,n\), where \(E_k\) denotes the \(k\)-th standard basis vector.
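A numerical sketch of this example (an added illustration with numpy, taking \(n=4\)): the matrix of \(T\) is the all-ones matrix, whose eigenvalues should be \(n\) once and 0 with multiplicity \(n-1\).

```python
import numpy as np

n = 4
# Matrix of T(x_1,...,x_n) = (x_1+...+x_n, ..., x_1+...+x_n): every entry is 1.
A = np.ones((n, n))

vals = np.sort(np.linalg.eigvals(A).real)
# Expect eigenvalue 0 with multiplicity n-1 and eigenvalue n once.
print(vals.round(6))
```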

Example 7.8 Suppose \(T\in \mathcal{L}(V)\) is invertible and \(0\neq \lambda \in F\). Prove that \(\lambda\) is an eigenvalue of \(T\) if and only if \(\frac{1}{\lambda}\) is an eigenvalue of \(T^{-1}\). Suppose \(v\neq 0\) and \(T v =\lambda v\). Then \(v=T^{-1}T v=\lambda T^{-1}v\), or \(T^{-1}v=\frac{1}{\lambda}v\), and the other direction is similar.

Example 7.9 Suppose \(S,T\in \mathcal{L}(V)\). Prove that \(S T\) and \(T S\) have the same eigenvalues. Suppose \(v\neq 0\) and \(STv=\lambda v\). Apply \(T\) to get \(T S(Tv)=\lambda T v\). Thus if \(Tv\neq 0\), then \(\lambda\) is also an eigenvalue of \(TS\), with nonzero eigenvector \(Tv\). On the other hand, if \(T v=0\), then \(\lambda v=STv=0\), so \(\lambda =0\) is an eigenvalue of \(ST\). But then \(T\) is not invertible, so \(\text{im} TS \subseteq \text{im} T\) is not equal to \(V\), so \(TS\) has a nontrivial null space, hence 0 is an eigenvalue of \(TS\) as well. By symmetry, every eigenvalue of \(TS\) is also an eigenvalue of \(ST\).
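This can be spot-checked numerically (an added sketch; the matrices below are arbitrary illustrative choices):

```python
import numpy as np

S = np.array([[1., 2.],
              [3., 4.]])
T = np.array([[0., 1.],
              [1., 1.]])

# ST and TS are different matrices, but their eigenvalue multisets agree.
st = np.sort_complex(np.linalg.eigvals(S @ T))
ts = np.sort_complex(np.linalg.eigvals(T @ S))
print(np.allclose(st, ts))  # → True
```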

Example 7.10 Suppose \(T\in \mathcal{L}(V)\) is such that every vector in \(V\) is an eigenvector of \(T\). Prove that \(T\) is a scalar multiple of the identity operator. Pick a basis \((v_1,\ldots,v_N)\) for \(V\). By assumption, \(T v_n=\lambda_n v_n\) for each \(n\). Pick any two distinct indices \(m,n\). Since \(v_m+v_n\) is also an eigenvector, \(T(v_m+v_n)=\lambda(v_m+v_n)\) for some scalar \(\lambda\), while also \(T(v_m+v_n)=\lambda_m v_m + \lambda_n v_n\). Write this as \(0=(\lambda-\lambda_m)v_m+(\lambda-\lambda_n)v_n\). Since \(v_m\) and \(v_n\) are independent, \(\lambda=\lambda_m=\lambda_n\), and hence all the \(\lambda_n\) are equal.

Example 7.11 Suppose \(S, T\in \mathcal{L}(V)\) and \(S\) is invertible. Prove that if \(p \in \mathcal{P}(\mathbb{F})\) is a polynomial, then \(p(S T S^{-1})=S p(T) S^{-1}\). First let’s show that for positive integers \(n\), \((STS^{-1})^n=S T^n S^{-1}\). We may do this by induction, with nothing to show if \(n=1\). Assume it’s true for \(n=k\), and consider \[ (STS^{-1})^{k+1}=(STS^{-1})^k(STS^{-1})=ST^k S^{-1}STS^{-1}=S T^{k+1}S^{-1}. \] Now suppose \(p(z)=a_N z^N+\cdots + a_1 z+a_0.\) Then \[\begin{align*} p(STS^{-1})&=\sum_{n=0}^N a_n (STS^{-1})^n=\sum_{n=0}^N a_n ST^n S^{-1}\\ & =S\left( \sum_{n=0}^N a_n T^n \right) S^{-1}=Sp(T)S^{-1}. \tag*{} \end{align*}\]
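The identity \(p(STS^{-1})=S\,p(T)\,S^{-1}\) is easy to confirm numerically (an added sketch; the matrices and the polynomial \(p(z)=z^2+4z+5\) are illustrative choices, not from the text):

```python
import numpy as np

S = np.array([[2., 1.],
              [1., 1.]])          # invertible: det = 1
T = np.array([[1., 2.],
              [0., 3.]])
S_inv = np.linalg.inv(S)

def p(M):
    """Apply the polynomial p(z) = z^2 + 4z + 5 to a square matrix M."""
    I = np.eye(M.shape[0])
    return M @ M + 4 * M + 5 * I

lhs = p(S @ T @ S_inv)
rhs = S @ p(T) @ S_inv
print(np.allclose(lhs, rhs))  # → True
```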

Example 7.12 Suppose \(F=\mathbb{C}\), \(T\in \mathcal{L}(V)\), \(p\in \mathcal{P}(\mathbb{C})\), and \(\alpha\in \mathbb{C}\). Prove that \(\alpha\) is an eigenvalue of \(p(T)\) if and only if \(\alpha=p(\lambda)\) for some eigenvalue \(\lambda\) of \(T\). Suppose first that \(v\neq 0\) is an eigenvector of \(T\) with eigenvalue \(\lambda\); that is, \(T v = \lambda v\). Then for positive integers \(n\), \(T^n v=T^{n-1} \lambda v = \cdots = \lambda^n v\), and so \(p(T)v=p(\lambda) v\). That is, \(\alpha=p(\lambda)\) is an eigenvalue of \(p(T)\) if \(\lambda\) is an eigenvalue of \(T\). Conversely, suppose now that \(\alpha\) is an eigenvalue of \(p(T)\), so there is a \(v\neq 0\) with \(p(T)v =\alpha v\), or \((p(T)-\alpha I)v=0\). Since \(\mathbb{F}=\mathbb{C}\), we may factor the polynomial \(p(z)-\alpha\) into linear factors \(p(z)-\alpha=c(z-\lambda_1)\cdots (z-\lambda_N)\), so that \[ 0=(p(T)-\alpha I)v=c\prod_n (T-\lambda_n I)v. \] At least one of the factors is not invertible, so at least one of the \(\lambda_n\), say \(\lambda_1\), is an eigenvalue of \(T\). Let \(w\neq 0\) be an eigenvector for \(T\) with eigenvalue \(\lambda_1\). Then \[ 0=c(T-\lambda_N I)\cdots (T-\lambda_1 I)w=(p(T)-\alpha I) w, \] so \(w\) is an eigenvector for \(p(T)\) with eigenvalue \(\alpha\). But by the first part of the argument, \(p(T)w=p(\lambda_1)w=\alpha w\) and \(\alpha=p(\lambda_1)\).

Example 7.13 Show that the previous exercise does not hold with \(F=\mathbb{R}\). Take \(T: \mathbb{R}^2 \rightarrow \mathbb{R}^2\) given by \(T(x,y)=(-y,x)\). We’ve seen previously that \(T\) has no real eigenvalues. On the other hand, \(T^2(x,y)=(-x,-y)=-1(x,y)\).
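The counterexample can be checked numerically (an added sketch with numpy): the rotation matrix has only the nonreal eigenvalues \(\pm i\), yet \(T^2=-I\), so \(-1\) is an eigenvalue of \(T^2\).

```python
import numpy as np

# Matrix of T(x, y) = (-y, x): rotation by 90 degrees.
T = np.array([[0., -1.],
              [1.,  0.]])

vals = np.linalg.eigvals(T)            # ±i: no real eigenvalues
print(np.allclose(T @ T, -np.eye(2)))  # T^2 = -I → True
```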

Example 7.14 Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Prove that \(T\) has an invariant subspace of dimension \(j\) for each \(j=1,\ldots, \text{dim} V\). Let \((v_1,\ldots,v_N)\) be a basis with respect to which \(T\) has an upper triangular matrix. Then by a previous proposition, \(T\) maps \(\text{span}(v_1,\ldots,v_j)\) into itself, so \(\text{span}(v_1,\ldots,v_j)\) is a \(T\)-invariant subspace of dimension \(j\).

Example 7.15 Give an example of an operator whose matrix with respect to some basis contains only 0’s on the diagonal, but the operator is invertible. Consider \(T=\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\), which has determinant \(-1\) and hence is invertible.

Example 7.16 Give an example of an operator whose matrix with respect to some basis contains only nonzero numbers on the diagonal, but the operator is not invertible. Take \(T=\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\). If \(v=(1,-1)\), then \(Tv=0\), so \(T\) is not invertible.

Example 7.17 Give an example of an operator on \(\mathbb{C}^4\) whose characteristic and minimal polynomials both equal \(z (z-1)^2(z-3)\).

Example 7.18 Give an example of an operator on \(\mathbb{C}^4\) whose characteristic polynomial equals \(z (z-1)^2(z-3)\) and whose minimal polynomial equals \(z (z-1)(z-3)\).

Example 7.19 Suppose \(a_0, \ldots, a_{n-1}\in \mathbb{C}\). Find the minimal and characteristic polynomials of the operator on \(\mathbb{C}^n\) whose matrix is \[ \begin{bmatrix} 0 & & & & & -a_0 \\ 1 & 0 & & & & -a_1 \\ & 1 & \ddots & & & -a_2 \\ & & \ddots & \ddots & & \vdots \\ & & & 1 & 0 & -a_{n-2} \\ & & & & 1 & -a_{n-1} \end{bmatrix} \] with respect to the standard basis.

7.2 Jordan Canonical Form

A basis of \(V\) is called a Jordan basis for \(T\) if with respect to this basis \(T\) has block diagonal matrix \[ \begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \\ \end{bmatrix} \] where each \(A_j\) is an upper triangular matrix of the form \[ A_j = \begin{bmatrix} \lambda_j & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \\ \end{bmatrix} \] where the diagonal is filled with some eigenvalue \(\lambda_j\) of \(T\).

Because there exist operators on real vector spaces that have no eigenvalues, there exist operators on real vector spaces for which there is no corresponding Jordan basis.

Theorem 7.15 Suppose \(V\) is a complex vector space. If \(T\in \mathcal{L}(V)\), then there is a basis of \(V\) that is a Jordan basis for \(T\).

Proof. The proof is left for the reader.


An operator \(T\) can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if the vector space is complex.

Theorem 7.16 Suppose \(T\in\mathcal{L}(V)\) has characteristic and minimal polynomials, respectively, \[ c(t)=(t-\lambda_1)^{n_1} \cdots (t-\lambda_r)^{n_r} \qquad \text{and} \qquad m(t)=(t-\lambda_1)^{m_1} \cdots (t-\lambda_r)^{m_r}, \] where the \(\lambda_i\) are distinct scalars. Then \(T\) has a block diagonal matrix representation \(J\) whose diagonal blocks are of the form \[ J_{ij}= \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{bmatrix} \] For each \(\lambda_i\) the corresponding blocks have the following properties:

  • There is at least one \(J_{ij}\) of order \(m_i\); all other \(J_{ij}\) are of order \(\leq m_i\).
  • The sum of the orders of the \(J_{ij}\) is \(n_i\).
  • The number of \(J_{ij}\) equals the geometric multiplicity of \(\lambda_i\).
  • The number of \(J_{ij}\) of each possible order is uniquely determined by \(T\).

Proof. The proof is left for the reader.

The matrix \(J\) in the above proposition is called the Jordan canonical form of the operator \(T\). A diagonal block \(J_{ij}\) is called a Jordan block belonging to the eigenvalue \(\lambda_i\). Observe that \[ \begin{bmatrix} \lambda_i & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{bmatrix} = \begin{bmatrix} \lambda_i & 0 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_i & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_i & 0 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_i \end{bmatrix} + \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix} \] That is, \(J_{ij}=\lambda_i I+N\) where \(N\) is the nilpotent block.

Example 7.20 Suppose the characteristic and minimal polynomials of an operator \(T\) are, respectively, \[ c(t)=(t-2)^4(t-3)^3 \qquad \text{and} \qquad m(t)=(t-2)^2(t-3)^2. \] Then the Jordan canonical form of \(T\) is one of the following matrices: \[ \begin{bmatrix} 2 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 3 \end{bmatrix} \qquad \text{or} \qquad \begin{bmatrix} 2 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 3 \end{bmatrix} \] The first matrix occurs if \(T\) has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if \(T\) has three independent eigenvectors belonging to 2.
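A numerical sketch of this example (an added illustration with numpy; the helper functions are my own): building the first candidate matrix from its Jordan blocks, \(m(t)=(t-2)^2(t-3)^2\) annihilates it, while the smaller polynomial \((t-2)(t-3)^2\) does not, confirming that \(m\) is minimal.

```python
import numpy as np

def jordan_block(lam, size):
    """Jordan block: lam on the diagonal, 1's on the superdiagonal."""
    return lam * np.eye(size) + np.diag(np.ones(size - 1), 1)

def block_diag(*blocks):
    """Assemble square blocks into a block diagonal matrix."""
    n = sum(b.shape[0] for b in blocks)
    M = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        M[i:i+k, i:i+k] = b
        i += k
    return M

# First candidate: blocks J_2(2), J_2(2), J_2(3), J_1(3).
A = block_diag(jordan_block(2, 2), jordan_block(2, 2),
               jordan_block(3, 2), jordan_block(3, 1))
I = np.eye(7)

m_of_A = (A - 2*I) @ (A - 2*I) @ (A - 3*I) @ (A - 3*I)  # m(A) = 0
smaller = (A - 2*I) @ (A - 3*I) @ (A - 3*I)             # does not annihilate A
print(np.allclose(m_of_A, 0), np.allclose(smaller, 0))  # → True False
```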

Example 7.21 Suppose \(N\in \mathcal{L}(V)\) is nilpotent. Prove that the minimal polynomial of \(N\) is \(z^{m+1}\), where \(m\) is the length of the longest consecutive string of \(1'\text{s}\) that appears on the line directly above the diagonal in the matrix of \(N\) with respect to any Jordan basis for \(N\).

Example 7.22 Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Prove that there does not exist a direct sum decomposition of \(V\) into two proper subspaces invariant under \(T\) if and only if the minimal polynomial of \(T\) is of the form \((z-\lambda)^{\text{dim} V}\) for some \(\lambda \in \mathbb{C}\).

Example 7.23 Suppose \(T\in \mathcal{L}(V)\) and \((v_1,\ldots,v_n)\) is a basis of \(V\) that is a Jordan basis for \(T\). Describe the matrix of \(T\) with respect to the basis \((v_n,\ldots,v_1)\) obtained by reversing the order of the \(v\)’s.

Example 7.24 Consider a 2-by-2 matrix of real numbers \[ \begin{bmatrix} a & c \\ b & d \end{bmatrix}. \]

Example 7.25 Suppose \(A\) is a block diagonal matrix \[ A=\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{bmatrix}, \] where each \(A_j\) is a square matrix. Prove that the set of eigenvalues of \(A\) equals the union of the eigenvalues of \(A_1,\ldots,A_m\).

Example 7.26 Suppose \(A\) is a block upper-triangular matrix \[ A= \begin{bmatrix} A_1 & & * \\ & \ddots & \\ 0 & & A_m \end{bmatrix}, \] where each \(A_j\) is a square matrix. Prove that the set of eigenvalues of \(A\) equals the union of the eigenvalues of \(A_1,\ldots,A_m\).

Example 7.27 Suppose \(V\) is a real vector space and \(T\in \mathcal{L}(V)\). Suppose \(\alpha, \beta \in \mathbb{R}\) are such that \(T^2+\alpha T+\beta I=0\). Prove that \(T\) has an eigenvalue if and only if \(\alpha^2 \geq 4 \beta\).

Example 7.28 Suppose \(V\) is a real inner-product space and \(T\in \mathcal{L}(V)\). Prove that there is an orthonormal basis of \(V\) with respect to which \(T\) has a block upper-triangular matrix \[ \begin{bmatrix} A_1 & & * \\ & \ddots & \\ 0 & & A_m \end{bmatrix}. \] where each \(A_j\) is a 1-by-1 matrix or a 2-by-2 matrix with no eigenvalues.

Example 7.29 Prove that if \(T\in \mathcal{L}(V)\) and \(j\) is a positive integer such that \(j \leq \text{dim} V\), then \(T\) has an invariant subspace whose dimension equals \(j-1\) or \(j\).

Example 7.30 Prove that there does not exist an operator \(T\in \mathcal{L}(\mathbb{R}^7)\) such that \(T^2+T+I\) is nilpotent.

Example 7.31 Give an example of an operator \(T\in \mathcal{L}(\mathbb{C}^7)\) such that \(T^2+T+I\) is nilpotent.

Example 7.32 Suppose \(V\) is a real vector space and \(T\in \mathcal{L}(V)\). Suppose \(\alpha, \beta \in \mathbb{R}\) are such that \(\alpha^2< 4\beta\). Prove that null \((T^2+\alpha T + \beta I)^k\) has even dimension for every positive integer \(k\).

Example 7.33 Suppose \(V\) is a real vector space and \(T\in \mathcal{L}(V)\). Suppose \(\alpha, \beta \in \mathbb{R}\) are such that \(\alpha^2< 4\beta\) and \(T^2+\alpha T+\beta I\) is nilpotent. Prove that \(\text{dim} V\) is even and \((T^2+\alpha T+\beta I)^{\text{dim} V/2}=0.\)

Example 7.34 Prove that if \(T\in \mathcal{L}(\mathbb{R}^3)\) and 5, 7 are eigenvalues of \(T\), then \(T\) has no eigenpairs.

Example 7.35 Suppose \(V\) is a real vector space with \(\text{dim} V =n\) and \(T\in \mathcal{L}(V)\) is such that \(\text{null } T^{n-2}\neq \text{null } T^{n-1}\). Prove that \(T\) has at most two distinct eigenvalues and that \(T\) has no eigenpairs.

Example 7.36 Prove that 1 is an eigenvalue of every square matrix with the property that the sum of the entries in each row equals 1.
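The key observation is that when every row sums to 1, the all-ones vector is fixed by the matrix. A numerical sketch (an added illustration; the matrix is an arbitrary row-stochastic example):

```python
import numpy as np

# Each row sums to 1, so A maps the all-ones vector to itself:
# the all-ones vector is an eigenvector with eigenvalue 1.
A = np.array([[0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8],
              [0.6, 0.4, 0.0]])
ones = np.ones(3)

print(np.allclose(A @ ones, ones))             # → True
print(np.isclose(np.linalg.eigvals(A), 1).any())  # → True
```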

Example 7.37 Suppose \(V\) is a real vector space with \(\text{dim} V =2\). Prove that if \[ \begin{bmatrix} a & c \\ b & d \end{bmatrix} \] is the matrix of \(T\) with respect to some basis of \(V\), then the characteristic polynomial of \(T\) equals \((z-a)(z-d)-b c\).

Example 7.38 Suppose \(V\) is a real inner-product space and \(S\in \mathcal{L}(V)\) is an isometry. Prove that if \((\alpha, \beta)\) is an eigenpair of \(S\), then \(\beta=1\).