# 7 Canonical Forms

\(\newcommand{\vlist}[2]{#1_1,#1_2,\ldots,#1_#2}\) \(\newcommand{\vectortwo}[2]{\begin{bmatrix} #1 \\ #2\end{bmatrix}}\) \(\newcommand{\vectorthree}[3]{\begin{bmatrix} #1 \\ #2 \\ #3\end{bmatrix}}\) \(\newcommand{\vectorfour}[4]{\begin{bmatrix} #1 \\ #2 \\ #3 \\ #4\end{bmatrix}}\) \(\newcommand{\vectorfive}[5]{\begin{bmatrix} #1 \\ #2 \\ #3 \\ #4 \\ #5 \end{bmatrix}}\) \(\newcommand{\lincomb}[3]{#1_1 \vec{#2}_1+#1_2 \vec{#2}_2+\cdots + #1_m \vec{#2}_#3}\) \(\newcommand{\norm}[1]{\left|\left |#1\right|\right |}\) \(\newcommand{\ip}[1]{\left \langle #1\right \rangle}\) \(\newcommand{\plim}[2]{\lim_{\footnotesize\begin{array}{c} \\[-10pt] #1 \\[0pt] #2 \end{array}}}\)

## 7.1 Invariant Subspaces

In this section we let \(V\) and \(W\) denote real or complex vector spaces.

Suppose \(T \in \mathcal{L}(V)\) and \(U\) is a subspace of \(V\). We say \(U\) is an **invariant subspace** under \(T\) if \(u\in U\) implies \(Tu\in U\).

**Lemma 7.1 **Suppose \(T \in \mathcal{L}(V)\) and \(U\) is a subspace of \(V\). Then all of the following hold:

- \(U\) is invariant under \(T\) if and only if \(T|_U\) is an operator on \(U\),
- \(\text{ker} T\) is invariant under \(T\), and
- \(\text{im} T\) is invariant under \(T\).

*Proof*. The proof of each part follows.

- By definition, \(U\) is invariant under \(T\) if and only if \(u\in U \implies Tu\in U\), which, by the definition of an operator, is the same as \(T|_U\) being an operator on \(U\).
- If \(u\in \text{ker} T\), then \(Tu=0\), and hence \(Tu\in \text{ker} T\) since \(T(Tu)=T0=0\).
- If \(u\in \text{im} T\), then in particular \(u\in V\), so by the definition of range \(Tu\in\text{im} T\).

**Theorem 7.1 **Suppose \(T\in \mathcal{L}(V)\). Let \(\lambda_1,\ldots,\lambda_m\) denote the distinct eigenvalues of \(T\). Then the following are equivalent:

- (i) \(T\) has a diagonal matrix with respect to some basis of \(V\);
- (ii) \(V\) has a basis consisting of eigenvectors of \(T\);
- (iii) there exist one-dimensional \(T\)-invariant subspaces \(U_1, \ldots, U_n\) of \(V\) such that \(V=U_1 \oplus \cdots \oplus U_n\);
- (iv) \(V=\text{ker} (T-\lambda_1 I) \oplus \cdots \oplus \text{ker} (T-\lambda_m I)\);
- (v) \(\text{dim} V = \text{dim} \text{ker} (T-\lambda_1 I) + \cdots + \text{dim} \text{ker} (T-\lambda_m I)\).

*Proof*. (i) \(\Longleftrightarrow\) (ii): Exercise.

(ii) \(\Longleftrightarrow\) (iii): Suppose (ii) holds; thus \(V\) has a basis \((v_1,\ldots,v_n)\) consisting of eigenvectors of \(T\). For each \(j\), let \(U_j=\text{span}(v_j)\). Each \(U_j\) is a one-dimensional subspace of \(V\) that is invariant under \(T\) (because each \(v_j\) is an eigenvector of \(T\)). Because \((v_1,\ldots,v_n)\) is a basis of \(V\), each vector in \(V\) can be written uniquely as a linear combination of \((v_1,\ldots,v_n)\). In other words, each vector in \(V\) can be written uniquely as a sum \(u_1+\cdots +u_n\), where each \(u_j\in U_j\). Thus \(V=U_1 \oplus \cdots \oplus U_n\). Hence (ii) implies (iii). Conversely, suppose now that (iii) holds; thus there are one-dimensional subspaces \(U_1,\ldots,U_n\) of \(V\), each invariant under \(T\), such that \(V=U_1 \oplus \cdots \oplus U_n\). For each \(j\), let \(v_j\) be a nonzero vector in \(U_j\). Then each \(v_j\) is an eigenvector of \(T\). Because each vector in \(V\) can be written uniquely as a sum \(u_1+\cdots + u_n\), where each \(u_j\in U_j\) (so each \(u_j\) is a scalar multiple of \(v_j\)), we see that \((v_1,\ldots,v_n)\) is a basis of \(V\). Thus (iii) implies (ii).

(ii) \(\Longrightarrow\) (iv): Suppose (ii) holds; thus \(V\) has a basis consisting of eigenvectors of \(T\). Then every vector in \(V\) is a linear combination of eigenvectors of \(T\), and hence \(V=\text{ker} (T-\lambda_1 I) + \cdots + \text{ker} (T-\lambda_m I).\) To show that the sum above is direct, suppose that \(0=u_1+\cdots + u_m\), where each \(u_j\in \text{ker} (T-\lambda_j I)\). Because nonzero eigenvectors corresponding to distinct eigenvalues are linearly independent, this implies that each \(u_j\) equals 0. Thus the sum is a direct sum, completing the proof that (ii) implies (iv).

(iv) \(\Longrightarrow\) (v): Exercise.

(v) \(\Longrightarrow\) (ii): Suppose (v) holds; thus \(\text{dim} V=\text{dim} \text{ker} (T-\lambda_1I)+\cdots + \text{dim} \text{ker} (T-\lambda_m I)\). Choose a basis of each \(\text{ker}(T-\lambda_j I)\); put all these bases together to form a list \((v_1,\ldots,v_n)\) of eigenvectors of \(T\), where \(n=\text{dim} V\). To show that this list is linearly independent, suppose \(a_1 v_1+\cdots + a_n v_n=0\), where \(a_1,\ldots,a_n\) are scalars. For each \(j=1, \ldots, m\), let \(u_j\) denote the sum of all the terms \(a_k v_k\) such that \(v_k\in \text{ker}(T-\lambda_j I)\). Thus each \(u_j\) lies in the eigenspace of \(T\) for the eigenvalue \(\lambda_j\), and \(u_1+\cdots + u_m=0\). Because nonzero eigenvectors corresponding to distinct eigenvalues are linearly independent, this implies that each \(u_j\) equals 0. Because each \(u_j\) is a sum of terms \(a_k v_k\), where the \(v_k\)’s were chosen to be a basis of \(\text{ker} (T-\lambda_j I)\), this implies that all the \(a_k\)’s equal 0. Thus \((v_1,\ldots,v_n)\) is linearly independent and hence a basis of \(V\). Thus (v) implies (ii).
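Condition (v) of Theorem 7.1 is easy to check by machine. Below is a small sketch using sympy for exact arithmetic; the matrices \(A\) and \(P\) are hypothetical examples chosen so that the conjugated operator is diagonalizable but not literally diagonal:

```python
import sympy as sp

# Hypothetical diagonalizable operator: eigenvalue 2 twice, eigenvalue 5 once.
A = sp.Matrix([[2, 0, 0],
               [0, 2, 0],
               [0, 0, 5]])
# Conjugate by an invertible P so the matrix is no longer diagonal
# but remains diagonalizable.
P = sp.Matrix([[1, 1, 0],
               [0, 1, 1],
               [0, 0, 1]])
B = P * A * P.inv()

n = B.shape[0]
# dim ker(B - lambda I) for each distinct eigenvalue lambda
dims = [len((B - lam * sp.eye(n)).nullspace()) for lam in B.eigenvals()]
print(sum(dims) == n)   # condition (v) holds, so B is diagonalizable
```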

A vector \(v\) is called a **generalized eigenvector** of \(T\) corresponding to \(\lambda\), where \(\lambda\) is an eigenvalue of \(T\), if \((T-\lambda I)^j v=0\) for some positive integer \(j\).
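A minimal sketch of this definition, assuming sympy: for a single \(2\times 2\) Jordan block, the vector \((0,1)\) is a generalized eigenvector but not an ordinary one.

```python
import sympy as sp

lam = 4
# A single 2x2 Jordan block (hypothetical eigenvalue 4).
A = sp.Matrix([[lam, 1],
               [0, lam]])
N = A - lam * sp.eye(2)
v = sp.Matrix([0, 1])
print(N * v != sp.zeros(2, 1))      # v is NOT an ordinary eigenvector
print(N**2 * v == sp.zeros(2, 1))   # but (A - lam I)^2 v = 0, so v is a
                                    # generalized eigenvector for lam
```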

**Lemma 7.2 **Suppose \(T\in \mathcal{L}(V)\) and \(m\) is a nonnegative integer such that \(\text{ker} T^m=\text{ker} T^{m+1}\). Then

- \(\{0\}=\text{ker} T^0 \subseteq \text{ker} T^1 \subseteq \cdots \subseteq \text{ker} T^m = \text{ker} T^{m+1} = \text{ker} T^{m+2} = \cdots\)
- \(\text{ker} T^{\text{dim} V}=\text{ker} T^{\text{dim} V+1} =\text{ker} T^{\text{dim} V+2} = \cdots\)
- \(V=\text{im} T^0 \supseteq \text{im} T^1 \supseteq \cdots \supseteq \text{im} T^k \supseteq \text{im} T^{k+1} \supseteq \cdots\)
- \(\text{im} T^{\text{dim} V}= \text{im} T^{\text{dim} V+1} = \text{im} T^{\text{dim} V+2} = \cdots\)
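The first two parts of Lemma 7.2 can be watched numerically: the kernels grow strictly until two consecutive ones agree, and are constant from then on. A sketch with a hypothetical shift operator on \(\mathbb{C}^3\):

```python
import sympy as sp

# Hypothetical shift operator on C^3: T e1 = 0, T e2 = e1, T e3 = e2.
T = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])
# Kernel dimensions of T^0, T^1, ..., T^5: strictly increasing until
# they first repeat, then constant forever.
dims = [len((T**k).nullspace()) for k in range(6)]
print(dims)   # [0, 1, 2, 3, 3, 3]
```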

**Lemma 7.3 **Suppose \(T\in \mathcal{L}(V)\) and \(\lambda\) is an eigenvalue of \(T\). Then the set of generalized eigenvectors of \(T\) corresponding to \(\lambda\) equals \(\text{ker}(T-\lambda I)^{\text{dim} V}\).

*Proof*. The proof is left for the reader.

An operator is called **nilpotent** if some power of it equals 0.

**Lemma 7.4 **Suppose \(N\in \mathcal{L}(V)\) is nilpotent, then \(N^{\text{dim} V}=0\).

*Proof*. The proof is left for the reader.
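A sketch of Lemma 7.4 with a hypothetical rank-one nilpotent matrix that is not triangular in the standard basis:

```python
import sympy as sp

# Hypothetical nilpotent operator on C^3 (rank one, not triangular).
N = sp.Matrix([[1, 1, -1],
               [2, 2, -2],
               [3, 3, -3]])
print(N**3 == sp.zeros(3, 3))   # N^{dim V} = 0, as Lemma 7.4 asserts
```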

**Theorem 7.2 **Let \(T\in \mathcal{L}(V)\) and \(\lambda\in\mathbb{F}\). Then for every basis of \(V\) with respect to which \(T\) has an upper-triangular matrix, \(\lambda\) appears on the diagonal of the matrix of \(T\) precisely \(\text{dim} \text{ker} (T-\lambda I)^{\text{dim} V}\) times.

*Proof*. The proof is left for the reader.

The **multiplicity** of an eigenvalue \(\lambda\) of \(T\) is defined to be the dimension of the subspace of generalized eigenvectors corresponding to \(\lambda\); that is, the multiplicity of \(\lambda\) equals \(\text{dim} \text{ker}(T-\lambda I)^{\text{dim} V}\).

**Theorem 7.3 **If \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\), then the sum of the multiplicities of all the eigenvalues of \(T\) equals \(\text{dim} V\).

*Proof*. The proof is left for the reader.

Let \(\lambda_1,\ldots,\lambda_m\) denote the distinct eigenvalues of \(T\) and let \(d_j\) denote the multiplicity of \(\lambda_j\) as an eigenvalue of \(T\). The polynomial \[
(z-\lambda_1)^{d_1} \cdots (z-\lambda_m)^{d_m}
\] is called the **characteristic polynomial** of \(T\).
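Concretely, the characteristic polynomial can be computed from any matrix of \(T\). A sketch with a hypothetical upper-triangular example, using sympy’s `charpoly`:

```python
import sympy as sp

z = sp.symbols('z')
# Hypothetical operator: eigenvalue 2 with multiplicity 2, eigenvalue 7 once.
T = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 7]])
p = T.charpoly(z).as_expr()
print(sp.expand(p - (z - 2)**2 * (z - 7)) == 0)   # p(z) = (z-2)^2 (z-7)
```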

**Theorem 7.4 **Suppose that \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Let \(q\) denote the characteristic polynomial of \(T\). Then \(q(T)=0\).

*Proof*. The proof is left for the reader.
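Theorem 7.4 (the Cayley–Hamilton theorem for complex spaces) can be verified directly on a small hypothetical matrix by evaluating the characteristic polynomial at \(T\) via Horner’s scheme:

```python
import sympy as sp

z = sp.symbols('z')
T = sp.Matrix([[1, 2],
               [3, 4]])   # hypothetical example
q = T.charpoly(z)          # here q(z) = z^2 - 5z - 2
# Evaluate q at T by Horner's scheme; coefficients come highest degree first.
qT = sp.zeros(2, 2)
for c in q.all_coeffs():
    qT = qT * T + c * sp.eye(2)
print(qT == sp.zeros(2, 2))   # q(T) = 0
```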

**Theorem 7.5 **If \(T\in \mathcal{L}(V)\) and \(p\in \mathcal{P}(\mathbb{F})\), then \(\text{ker} p(T)\) is invariant under \(T\).

*Proof*. The proof is left for the reader.

**Theorem 7.6 **Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Let \(\lambda_1,\ldots,\lambda_m\) be the distinct eigenvalues of \(T\), and let \(U_1,\ldots,U_m\) be the corresponding subspaces of generalized eigenvectors. Then

- \(V=U_1\oplus \cdots \oplus U_m\);
- each \(U_j\) is invariant under \(T\);
- each \(\left.(T-\lambda_j I) \right|_{U_j}\) is nilpotent.

*Proof*. The proof is left for the reader.

**Theorem 7.7 **Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Then there is a basis of \(V\) consisting of generalized eigenvectors of \(T\).

*Proof*. The proof is left for the reader.

**Theorem 7.8 **Suppose \(N\) is a nilpotent operator on \(V\). Then there is a basis of \(V\) with respect to which the matrix of \(N\) has the form \[
\begin{bmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0\end{bmatrix};
\] here all entries on and below the diagonal are 0’s.

*Proof*. The proof is left for the reader.

**Theorem 7.9 **Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Let \(\lambda_1,\ldots,\lambda_m\) be the distinct eigenvalues of \(T\). Then there is a basis of \(V\) with respect to which \(T\) has block diagonal matrix of the form \[
\begin{bmatrix} A_1 & & 0\\ & \ddots & \\ 0 & & A_m\end{bmatrix},
\] where each \(A_j\) is an upper-triangular matrix of the form \[
\begin{bmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j\end{bmatrix}.
\]

*Proof*. The proof is left for the reader.

**Theorem 7.10 **Suppose \(N\in\mathcal{L}(V)\) is nilpotent. Then \(I+N\) has a square root.

*Proof*. The proof is left for the reader.
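The standard proof idea behind Theorem 7.10 is the binomial series for \(\sqrt{1+x}\): since \(N\) is nilpotent, the series terminates and yields an exact square root. A sketch with a hypothetical nilpotent \(N\):

```python
import sympy as sp

# Hypothetical nilpotent N on C^3.
N = sp.Matrix([[0, 2, 5],
               [0, 0, 3],
               [0, 0, 0]])
n = N.shape[0]
# Truncated binomial series for sqrt(1 + x): exact here because N^n = 0.
R = sp.zeros(n, n)
for k in range(n):
    R += sp.binomial(sp.Rational(1, 2), k) * N**k
print(R * R == sp.eye(n) + N)   # R is a square root of I + N
```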

On real vector spaces there exist invertible operators that have no square roots. For example, the operator of multiplication by \(-1\) on \(\mathbb{R}\) has no square root because no real number has its square equal to \(-1\).

**Theorem 7.11 **Suppose \(V\) is a complex vector space. If \(T\in \mathcal{L}(V)\) is invertible, then \(T\) has a square root.

*Proof*. The proof is left for the reader.

The **minimal polynomial** of \(T\) is the monic polynomial \(p\in \mathcal{P}(\mathbb{F})\) of smallest degree such that \(p(T)=0\).

**Theorem 7.12 **Let \(T\in \mathcal{L}(V)\) and let \(q\in \mathcal{P}(\mathbb{F})\). Then \(q(T)=0\) if and only if the minimal polynomial of \(T\) divides \(q\).

*Proof*. The proof is left for the reader.

**Theorem 7.13 **Let \(T\in \mathcal{L}(V)\). Then the roots of the minimal polynomial of \(T\) are precisely the eigenvalues of \(T\).

*Proof*. The proof is left for the reader.
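Theorems 7.12 and 7.13 can be illustrated on a hypothetical diagonalizable matrix with a repeated eigenvalue: the polynomial \((z-2)(z-5)\) annihilates \(T\) while no proper monic factor does, so it is the minimal polynomial, and its roots \(2, 5\) are exactly the eigenvalues.

```python
import sympy as sp

# Hypothetical diagonalizable T with repeated eigenvalue 2.
T = sp.Matrix([[2, 0, 0],
               [0, 2, 0],
               [0, 0, 5]])
I3 = sp.eye(3)
# (z - 2)(z - 5) annihilates T, so the minimal polynomial divides it...
print((T - 2*I3) * (T - 5*I3) == sp.zeros(3, 3))
# ...and neither linear factor annihilates T alone, so it IS minimal.
print(T - 2*I3 != sp.zeros(3, 3) and T - 5*I3 != sp.zeros(3, 3))
```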

For every \(T\in \mathcal{L}(V)\), where \(V\) is a complex vector space, there is a basis of \(V\) with respect to which \(T\) has a nice upper-triangular matrix. We can do even better: there is a basis of \(V\) with respect to which the matrix of \(T\) contains zeros everywhere except possibly on the diagonal and on the line directly above the diagonal.

Suppose \(N\in \mathcal{L}(V)\) is nilpotent. For each nonzero vector \(v\in V\), let \(m(v)\) denote the largest nonnegative integer such that \(N^{m(v)}v\neq 0\).

**Theorem 7.14 **If \(N\in \mathcal{L}(V)\) is nilpotent, then there exist vectors \(v_1,\ldots,v_k \in V\) such that

- \(\left(v_1,N v_1,\ldots,N^{m(v_1)}v_1,\ldots,v_k, N v_k, \ldots, N^{m(v_k)}v_k\right)\) is a basis of \(V\);
- \(\left(N^{m(v_1)}v_1,\ldots,N^{m(v_k)}v_k\right)\) is a basis of \(\text{ker} N\).

*Proof*. The proof is left for the reader.

**Example 7.1 **Suppose \(T\in \mathcal{L}(V)\). Prove that if \(U_1,\ldots,U_m\) are subspaces of \(V\) invariant under \(T\), then \(U_1 + \cdots +U_m\) is invariant under \(T\). Suppose \(v\in U_1+\cdots + U_m\). Then there exist \(u_1,\ldots,u_m\) such that \(v=u_1+\cdots +u_m\) with \(u_j\in U_j\). Then \(Tv=T u_1+\cdots + T u_m\). Since each \(U_j\) is invariant under \(T\), \(T u_j \in U_j\), so \(T v \in U_1+ \cdots + U_m\).

**Example 7.2 **Suppose \(T\in \mathcal{L}(V)\). Prove that the intersection of any collection of subspaces of \(V\) invariant under \(T\) is invariant under \(T\). Suppose we have subspaces \(\{U_j\}\) with each \(U_j\) invariant under \(T\). Let \(v\in \cap_j U_j\). Then \(v\in U_j\) for each \(j\), so \(Tv\in U_j\) for each \(j\), and hence \(Tv\in \cap_j U_j\). Thus \(\cap_j U_j\) is invariant under \(T\).

**Example 7.3 **Prove or give a counterexample: if \(U\) is a subspace of \(V\) that is invariant under every operator on \(V\), then \(U=\{0\}\) or \(U=V\). We will prove the contrapositive: if \(U\) is a subspace of \(V\) and \(U\neq \{0\}\) and \(U\neq V\), then there exists an operator \(T\) on \(V\) such that \(U\) is not invariant under \(T\). Let \((u_1,\ldots,u_m)\) be a basis for \(U\), which we extend to a basis \((u_1,\ldots,u_m, v_1,\ldots,v_n)\) of \(V\). The assumption \(U\neq \{0\}\) and \(U\neq V\) means that \(m\geq 1\) and \(n\geq 1\). Define a linear map \(T\) by \(Tu_1=v_1\) and for \(j>1\), \(T u_j=0\). Since \(v_1\not \in U\), the subspace \(U\) is not invariant under the operator \(T\).

**Example 7.4 **Suppose that \(S,T\in \mathcal{L}(V)\) are such that \(S T= T S\). Prove that \(\text{ker}(T-\lambda I)\) is invariant under \(S\) for every \(\lambda \in F\). Suppose \(v\in \text{ker}(T-\lambda I)\). Then \(Tv = \lambda v\), and using \(TS=ST\), \(Sv\) satisfies \(T(S v)=S(T v)=S(\lambda v)=\lambda (S v)\). Thus \(S v\in \text{ker}(T-\lambda I)\), and so \(\text{ker} (T-\lambda I)\) is invariant under \(S\).

**Example 7.5 **Define \(T\in \mathcal{L}(F^2)\) by \(T(w,z)=(z,w)\). Find all eigenvalues and eigenvectors of \(T\). Suppose \((w,z)\neq (0,0)\) and \(T(w,z)=(z,w)=\lambda(w,z)\). Then \(z=\lambda w\) and \(w=\lambda z\). Of course this leads to \(w=\lambda z=\lambda^2w\), \(z=\lambda w=\lambda^2 z\). Since \(w\neq 0\) or \(z\neq 0\), we see that \(\lambda^2=1\) so that \(\lambda =\pm 1\). A basis of eigenvectors is \((w_1,z_1)=(1,1)\), \((w_2,z_2)=(-1,1)\) and they have eigenvalues 1 and \(-1\) respectively.

**Example 7.6 **Define \(T\in \mathcal{L}(F^3)\) by \(T(z_1,z_2,z_3)=(2z_2,0,5z_3)\). Find all eigenvalues and eigenvectors of \(T\). Suppose \((z_1,z_2,z_3)\neq (0,0,0)\) and \[
T(z_1,z_2,z_3)=(2z_2,0,5z_3)=\lambda (z_1,z_2,z_3).
\] If \(\lambda=0\), then \(z_2=z_3=0\), and one checks that \(v_1=(1,0,0)\) is an eigenvector with eigenvalue 0. If \(\lambda\neq 0\), then \(0=\lambda z_2\) forces \(z_2=0\), hence \(\lambda z_1=2z_2=0\) forces \(z_1=0\); then \(z_3\neq 0\) and \(5z_3=\lambda z_3\) gives \(\lambda =5\). An eigenvector for \(\lambda=5\) is \(v_2=(0,0,1)\). These are the only eigenvalues and each eigenspace is one-dimensional.

**Example 7.7 **Suppose \(n\) is a positive integer and \(T\in \mathcal{L}(\mathbb{F}^n)\) is defined by \[
T(x_1,\ldots,x_n)=(x_1+ \cdots + x_n,\ldots,x_1+\cdots +x_n).
\] Find all eigenvalues and eigenvectors of \(T\). First, any vector of the form \(v_1=(\alpha,\ldots,\alpha)\), for \(\alpha\in \mathbb{F}\), is an eigenvector with eigenvalue \(n\). If \(v_2=(x_1,\ldots,x_n)\) is any vector such that \(x_1+\cdots + x_n=0\), then \(v_2\) is an eigenvector with eigenvalue 0. Here are \(n\) independent eigenvectors: \(v_1=(1,1,\ldots,1)\) and \(E_1-E_k\) for \(k=2,\ldots,n\), where \(E_k\) denotes the \(k\)-th standard basis vector.

**Example 7.8 **Suppose \(T\in \mathcal{L}(V)\) is invertible and \(0\neq \lambda \in F\). Prove that \(\lambda\) is an eigenvalue of \(T\) if and only if \(\frac{1}{\lambda}\) is an eigenvalue of \(T^{-1}\). Suppose \(v\neq 0\) and \(T v =\lambda v\). Then \(v=T^{-1}T v=\lambda T^{-1}v\), or \(T^{-1}v=\frac{1}{\lambda}v\), and the other direction is similar.

**Example 7.9 **Suppose \(S,T\in \mathcal{L}(V)\). Prove that \(S T\) and \(T S\) have the same eigenvalues. Suppose \(v\neq 0\) and \(STv=\lambda v\). Applying \(T\) gives \(T S(Tv)=\lambda T v\). Thus if \(Tv\neq 0\), then \(\lambda\) is also an eigenvalue of \(TS\), with nonzero eigenvector \(Tv\). On the other hand, if \(T v=0\), then \(\lambda =0\) is an eigenvalue of \(ST\) and \(T\) is not invertible; then \(\text{im} TS \subseteq \text{im} T\) is not equal to \(V\), so \(TS\) has a nontrivial kernel, and hence 0 is an eigenvalue of \(TS\) as well.

**Example 7.10 **Suppose \(T\in \mathcal{L}(V)\) is such that every vector in \(V\) is an eigenvector of \(T\). Prove that \(T\) is a scalar multiple of the identity operator. Pick a basis \((v_1,\ldots,v_N)\) for \(V\). By assumption, \(T v_n=\lambda_n v_n\) for each \(n\). Pick any two distinct indices \(m,n\). By assumption \(T(v_m+v_n)=\lambda(v_m+v_n)\) for some scalar \(\lambda\), and also \(T(v_m+v_n)=\lambda_m v_m + \lambda_n v_n\). Write this as \(0=(\lambda-\lambda_m)v_m+(\lambda-\lambda_n)v_n\). Since \(v_m\) and \(v_n\) are independent, \(\lambda=\lambda_m=\lambda_n\), and hence all the \(\lambda_n\) are equal.

**Example 7.11 **Suppose \(S, T\in \mathcal{L}(V)\) and \(S\) is invertible. Prove that if \(p \in \mathcal{P}(\mathbb{F})\) is a polynomial, then \(p(S T S^{-1})=S p(T) S^{-1}\). First let’s show that for positive integers \(n\), \((STS^{-1})^n=S T^n S^{-1}\). We may do this by induction, with nothing to show if \(n=1\). Assume it’s true for \(n=k\), and consider \[
(STS^{-1})^{k+1}=(STS^{-1})^k(STS^{-1})=ST^k S^{-1}STS^{-1}=S T^{k+1}S^{-1}.
\] Now suppose \(p(z)=a_N z^N+\cdots + a_1 z+a_0.\) Then \[\begin{align*}
p(STS^{-1})&=\sum_{n=0}^N a_n (STS^{-1})^n=\sum_{n=0}^N a_n ST^n S^{-1}\\ & =S\left( \sum_{n=0}^N a_n T^n \right) S^{-1}=Sp(T)S^{-1}.
\tag*{}
\end{align*}\]

**Example 7.12 **Suppose \(F=\mathbb{C}\), \(T\in \mathcal{L}(V)\), \(p\in \mathcal{P}(\mathbb{C})\), and \(\alpha\in \mathbb{C}\). Prove that \(\alpha\) is an eigenvalue of \(p(T)\) if and only if \(\alpha=p(\lambda)\) for some eigenvalue \(\lambda\) of \(T\). Suppose first that \(v\neq 0\) is an eigenvector of \(T\) with eigenvalue \(\lambda\); that is, \(T v = \lambda v\). Then for positive integers \(n\), \(T^n v=T^{n-1} \lambda v = \cdots = \lambda^n v\), and so \(p(T)v=p(\lambda) v\). That is, \(\alpha=p(\lambda)\) is an eigenvalue of \(p(T)\) whenever \(\lambda\) is an eigenvalue of \(T\). Conversely, suppose now that \(\alpha\) is an eigenvalue of \(p(T)\), so there is a \(v\neq 0\) with \(p(T)v =\alpha v\), or \((p(T)-\alpha I)v=0\). Since \(\mathbb{F}=\mathbb{C}\), we may factor the polynomial \(p(z)-\alpha\) into linear factors, so that \[
0=(p(T)-\alpha I)v=\prod_n (T-\lambda_n I)v.
\] At least one of the factors is not invertible, so at least one of the \(\lambda_n\), say \(\lambda_1\), is an eigenvalue of \(T\). Let \(w\neq 0\) be an eigenvector for \(T\) with eigenvalue \(\lambda_1\). Then \[
0=(T-\lambda_N I)\cdots (T-\lambda_1 I)w=(p(T)-\alpha I) w,
\] so \(w\) is an eigenvector for \(p(T)\) with eigenvalue \(\alpha\). By the first part of the argument, \(p(T)w=p(\lambda_1)w=\alpha w\), and hence \(\alpha=p(\lambda_1)\).

**Example 7.13 **Show that the previous exercise does not hold with \(F=\mathbb{R}\). Take \(T: \mathbb{R}^2 \rightarrow \mathbb{R}^2\) given by \(T(x,y)=(-y,x)\). We’ve seen previously that \(T\) has no real eigenvalues. On the other hand, \(T^2(x,y)=(-x,-y)=-1(x,y)\).

**Example 7.14 **Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Prove that \(T\) has an invariant subspace of dimension \(j\) for each \(j=1,\ldots, \text{dim} V\). Let \((v_1,\ldots,v_N)\) be a basis with respect to which \(T\) has an upper-triangular matrix. Then by a previous proposition, \(T\) maps \(\text{span}(v_1,\ldots,v_j)\) into itself, so \(\text{span}(v_1,\ldots,v_j)\) is a \(j\)-dimensional invariant subspace.

**Example 7.15 **Give an example of an operator whose matrix with respect to some basis contains only 0’s on the diagonal, but the operator is invertible. Consider \(T=\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\), which swaps the two basis vectors and is its own inverse.

**Example 7.16 **Give an example of an operator whose matrix with respect to some basis contains only nonzero numbers on the diagonal, but the operator is not invertible. Take \(T=\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\). If \(v=(1,-1)\), then \(Tv=0\), so \(T\) is not invertible.

**Example 7.17 **Give an example of an operator on \(\mathbb{C}^4\) whose characteristic and minimal polynomials both equal \(z (z-1)^2(z-3)\).

**Example 7.18 **Give an example of an operator on \(\mathbb{C}^4\) whose characteristic polynomial equals \(z (z-1)^2(z-3)\) and whose minimal polynomial equals \(z (z-1)(z-3)\).
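One way to check candidate answers for Examples 7.17 and 7.18 (the matrices \(A\) and \(B\) below are assumptions of mine, not given in the text): the size of the largest Jordan block for an eigenvalue controls that eigenvalue’s power in the minimal polynomial.

```python
import sympy as sp

z = sp.symbols('z')
I4 = sp.eye(4)

# Candidate for Example 7.17: a 2x2 Jordan block for eigenvalue 1
# forces the factor (z - 1)^2 into the minimal polynomial.
A = sp.diag(0, sp.Matrix([[1, 1], [0, 1]]), 3)
# Candidate for Example 7.18: diagonal, so each eigenvalue appears
# only to the first power in the minimal polynomial.
B = sp.diag(0, 1, 1, 3)

target = sp.expand(z * (z - 1)**2 * (z - 3))
print(sp.expand(A.charpoly(z).as_expr()) == target)    # True
print(sp.expand(B.charpoly(z).as_expr()) == target)    # True

# Minimal-polynomial checks: A needs the square on (z - 1); B does not.
print(A * (A - I4)**2 * (A - 3*I4) == sp.zeros(4, 4))  # True
print(B * (B - I4) * (B - 3*I4) == sp.zeros(4, 4))     # True
print(A * (A - I4) * (A - 3*I4) != sp.zeros(4, 4))     # True
```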

**Example 7.19 **Suppose \(a_0, \ldots, a_{n-1}\in \mathbb{C}\). Find the minimal and characteristic polynomials of the operator on \(\mathbb{C}^n\) whose matrix is \[
\begin{bmatrix}
0 & & & & & -a_0 \\
1 & 0 & & & & -a_1 \\
& 1 & \ddots & & & -a_2 \\
& & \ddots & & & \vdots \\
& 1 & & & 0 & -a_{n-2} \\
& & & & 1 & -a_{n-1}
\end{bmatrix}
\] with respect to the standard bases.
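The matrix in Example 7.19 is the companion matrix of \(z^n+a_{n-1}z^{n-1}+\cdots+a_1 z+a_0\), and its characteristic polynomial equals that polynomial. A sketch for \(n=3\) with hypothetical coefficients:

```python
import sympy as sp

z = sp.symbols('z')
a0, a1, a2 = 6, 11, 6   # hypothetical coefficients: p(z) = (z+1)(z+2)(z+3)
C = sp.Matrix([[0, 0, -a0],
               [1, 0, -a1],
               [0, 1, -a2]])
p = C.charpoly(z).as_expr()
print(sp.expand(p - (z**3 + a2*z**2 + a1*z + a0)) == 0)
```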

## 7.2 Jordan Canonical Form

A basis of \(V\) is called a **Jordan basis** for \(T\) if with respect to this basis \(T\) has block diagonal matrix \[
\begin{bmatrix}
A_1 & & 0 \\
& \ddots & \\
0 & & A_m \\
\end{bmatrix}
\] where each \(A_j\) is an upper triangular matrix of the form \[
A_j =
\begin{bmatrix}
\lambda_j & 1 & & 0 \\
& \ddots & \ddots & \\
& & \ddots & 1 \\
0 & & & \lambda_j \\
\end{bmatrix}
\] where the diagonal is filled with some eigenvalue \(\lambda_j\) of \(T\).

Because there exist operators on real vector spaces that have no eigenvalues, there exist operators on real vector spaces for which there is no corresponding Jordan basis.

**Theorem 7.15 **Suppose \(V\) is a complex vector space. If \(T\in \mathcal{L}(V)\), then there is a basis of \(V\) that is a Jordan basis for \(T\).

*Proof*. The proof is left for the reader.


An operator \(T\) can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always the case if the vector space is complex.

**Theorem 7.16 **Let \(T\in\mathcal{L}(V)\) whose characteristic and minimal polynomials are, respectively, \[
c(t)=(t-\lambda_1)^{n_1} \cdots (t-\lambda_r)^{n_r}
\qquad \text{and} \qquad
m(t)=(t-\lambda_1)^{m_1} \cdots (t-\lambda_r)^{m_r}
\] where the \(\lambda_i\) are distinct scalars. Then \(T\) has block diagonal matrix representation \(J\) whose diagonal entries are of the form \[
J_{ij}=
\begin{bmatrix}
\lambda_i & 1 & 0 & \cdots & 0 & 0 \\
0 & \lambda_i & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \lambda_i & 1 \\
0 & 0 & 0 & \cdots & 0 & \lambda_i
\end{bmatrix}
\] For each \(\lambda_i\) the corresponding blocks have the following properties:

- There is at least one \(J_{ij}\) of order \(m_i\); all other \(J_{ij}\) are of order \(\leq m_i\).
- The sum of the orders of the \(J_{ij}\) is \(n_i\).
- The number of \(J_{ij}\) equals the geometric multiplicity of \(\lambda_i\).
- The number of \(J_{ij}\) of each possible order is uniquely determined by \(T\).

*Proof*. The proof is left for the reader.

The matrix \(J\) in the above proposition is called the **Jordan canonical form** of the operator \(T\). A diagonal block \(J_{ij}\) is called a **Jordan block** belonging to the eigenvalue \(\lambda_i\). Observe that \[
\begin{bmatrix}
\lambda_i & 1 & 0 & \cdots & 0 & 0 \\
0 & \lambda_i & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \lambda_i & 1 \\
0 & 0 & 0 & \cdots & 0 & \lambda_i
\end{bmatrix}
=
\begin{bmatrix}
\lambda_i & 0 & 0 & \cdots & 0 & 0 \\
0 & \lambda_i & 0 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & \lambda_i & 0 \\
0 & 0 & 0 & \cdots & 0 & \lambda_i
\end{bmatrix}
+
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & \cdots & 0 & 0
\end{bmatrix}
\] That is, \(J_{ij}=\lambda_i I+N\) where \(N\) is the nilpotent block.
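sympy can produce a Jordan basis change directly. A sketch on a hypothetical matrix whose Jordan form has a \(2\times 2\) block:

```python
import sympy as sp

# Hypothetical matrix whose Jordan form has a 2x2 block for eigenvalue 2.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [1, 1, 3]])
P, J = A.jordan_form()        # change-of-basis P and Jordan form J
print(P * J * P.inv() == A)   # A = P J P^{-1}
print(J.is_upper)             # each block has the form lambda_i I + N
```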

**Example 7.20 **Suppose the characteristic and minimum polynomials of an operator \(T\) are, respectively, \[
c(t)=(t-2)^4(t-3)^3 \qquad \text{and} \qquad m(t)=(t-2)^2(t-3)^2.
\] Then the Jordan canonical form of \(T\) is one of the following matrices: \[
\begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3
\end{bmatrix}
\qquad \text{or} \qquad
\begin{bmatrix}
2 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 3 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 3
\end{bmatrix}
\] The first matrix occurs if \(T\) has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if \(T\) has three independent eigenvectors belonging to 2.

**Example 7.21 **Suppose \(N\in \mathcal{L}(V)\) is nilpotent. Prove that the minimal polynomial of \(N\) is \(z^{m+1}\), where \(m\) is the length of the longest consecutive string of \(1'\text{s}\) that appears on the line directly above the diagonal in the matrix of \(N\) with respect to any Jordan basis for \(N\).

**Example 7.22 **Suppose \(V\) is a complex vector space and \(T\in \mathcal{L}(V)\). Prove that there does not exist a direct sum decomposition of \(V\) into two proper subspaces invariant under \(T\) if and only if the minimal polynomial of \(T\) is of the form \((z-\lambda)^{\text{dim} V}\) for some \(\lambda \in \mathbb{C}\).

**Example 7.23 **Suppose \(T\in \mathcal{L}(V)\) and \((v_1,\ldots,v_n)\) is a basis of \(V\) that is a Jordan basis for \(T\). Describe the matrix of \(T\) with respect to the basis \((v_n,\ldots,v_1)\) obtained by reversing the order of the \(v\)’s.

**Example 7.24 **Consider a 2-by-2 matrix of real numbers \[
\begin{bmatrix}
a & c \\
b & d
\end{bmatrix}.
\]

**Example 7.25 **Suppose \(A\) is a block diagonal matrix \[
A=\begin{bmatrix}
A_1 & & 0 \\
& \ddots & \\
0 & & A_m
\end{bmatrix},
\] where each \(A_j\) is a square matrix. Prove that the set of eigenvalues of \(A\) equals the union of the eigenvalues of \(A_1,\ldots,A_m\).

**Example 7.26 **Suppose \(A\) is a block upper-triangular matrix \[
A=
\begin{bmatrix}
A_1 & & * \\
& \ddots & \\
0 & & A_m
\end{bmatrix},
\] where each \(A_j\) is a square matrix. Prove that the set of eigenvalues of \(A\) equals the union of the eigenvalues of \(A_1,\ldots,A_m\).

**Example 7.27 **Suppose \(V\) is a real vector space and \(T\in \mathcal{L}(V)\). Suppose \(\alpha, \beta \in \mathbb{R}\) are such that \(T^2+\alpha T+\beta I=0\). Prove that \(T\) has an eigenvalue if and only if \(\alpha^2 \geq 4 \beta\).

**Example 7.28 **Suppose \(V\) is a real inner-product space and \(T\in \mathcal{L}(V)\). Prove that there is an orthonormal basis of \(V\) with respect to which \(T\) has a block upper-triangular matrix \[
\begin{bmatrix}
A_1 & & * \\
& \ddots & \\
0 & & A_m
\end{bmatrix}.
\] where each \(A_j\) is a 1-by-1 matrix or a 2-by-2 matrix with no eigenvalues.

**Example 7.29 **Prove that if \(T\in \mathcal{L}(V)\) and \(j\) is a positive integer such that \(j \leq \text{dim} V\), then \(T\) has an invariant subspace whose dimension equals \(j-1\) or \(j\).

**Example 7.30 **Prove that there does not exist an operator \(T\in \mathcal{L}(\mathbb{R}^7)\) such that \(T^2+T+I\) is nilpotent.

**Example 7.31 **Give an example of an operator \(T\in \mathcal{L}(\mathbb{C}^7)\) such that \(T^2+T+I\) is nilpotent.

**Example 7.32 **Suppose \(V\) is a real vector space and \(T\in \mathcal{L}(V)\). Suppose \(\alpha, \beta \in \mathbb{R}\) are such that \(\alpha^2< 4\beta\). Prove that null \((T^2+\alpha T + \beta I)^k\) has even dimension for every positive integer \(k\).

**Example 7.33 **Suppose \(V\) is a real vector space and \(T\in \mathcal{L}(V)\). Suppose \(\alpha, \beta \in \mathbb{R}\) are such that \(\alpha^2< 4\beta\) and \(T^2+\alpha T+\beta I\) is nilpotent. Prove that \(\text{dim} V\) is even and \((T^2+\alpha T+\beta I)^{\text{dim} V/2}=0.\)

**Example 7.34 **Prove that if \(T\in \mathcal{L}(\mathbb{R}^3)\) and 5, 7 are eigenvalues of \(T\), then \(T\) has no eigenpairs.

**Example 7.35 **Suppose \(V\) is a real vector space with \(\text{dim} V = n\) and \(T\in \mathcal{L}(V)\) is such that \(\text{null}\, T^{n-2}\neq \text{null}\, T^{n-1}\). Prove that \(T\) has at most two distinct eigenvalues and that \(T\) has no eigenpairs.

**Example 7.36 **Prove that 1 is an eigenvalue of every square matrix with the property that the sum of the entries in each row equals 1.
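The key observation for Example 7.36 is that the all-ones vector is fixed by any such matrix. A sketch with a hypothetical row-sum-one matrix:

```python
import sympy as sp

# Hypothetical matrix in which every row sums to 1.
A = sp.Matrix([[sp.Rational(1, 2), sp.Rational(1, 2), 0],
               [0, sp.Rational(1, 4), sp.Rational(3, 4)],
               [1, 0, 0]])
ones = sp.Matrix([1, 1, 1])
print(A * ones == ones)   # (1,1,1) is an eigenvector with eigenvalue 1
```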

**Example 7.37 **Suppose \(V\) is a real vector space with \(\text{dim} V = 2\). Prove that if \[
\begin{bmatrix}
a & c \\
b & d
\end{bmatrix}
\] is the matrix of \(T\) with respect to some basis of \(V\), then the characteristic polynomial of \(T\) equals \((z-a)(z-d)-b c\).

**Example 7.38 **Suppose \(V\) is a real inner-product space and \(S\in \mathcal{L}(V)\) is an isometry. Prove that if \((\alpha, \beta)\) is an eigenpair of \(S\), then \(\beta=1\).