Sylvester’s Criterion for Positive-Semidefinite Matrices II

We are now ready to prove the sufficiency of Sylvester’s criterion, which will complete the work we started in my last post. Set k:=\text{rank}(A). If k=0, then there is nothing to prove, so suppose that k\geq1. Denote the sum of the principal minors of size r by E_{r}(A). By our hypothesis that the principal minors are nonnegative, E_{r}(A)\geq 0. Since there exists a nonsingular principal k\times k minor, we see that E_{k}(A)>0.

To complete the proof, we will show that A has only nonnegative eigenvalues \lambda_{1},\ldots,\lambda_{n} by examining the roots of the characteristic polynomial p_{A}.

Lemma 4. The characteristic polynomial p_{T} of a linear operator T:V\rightarrow V between finite-dimensional vector spaces can be expressed as

\displaystyle p_{T}(z)=\sum_{k=0}^{n}(-1)^{k}E_{k}(T)z^{n-k}

where E_{k}, for k=1,\ldots,n, is the sum of all the j\times j principal minors.

Proof. Recall that a linear operator S: V\rightarrow V on a finite-dimensional vector space induces a linear transformation on the k^{th} exterior algebra

\displaystyle\Lambda^{k}(S): \Lambda^{k}(V)\rightarrow\Lambda^{k}(V),\indent\Lambda^{k}(S)\left(v_{1}\wedge\cdots\wedge v_{k}\right)=Sv_{1}\wedge\cdots\wedge Sv_{k}

Since the n^{th} exterior algebra is a 1-dimensional vector space, the linear transformation \Lambda^{n}(S) is a scalar multiple of the identity. We define the determinant to be the scalar. Taking S=zI-T, for z\in\mathbb{C},

\begin{array}{lcl}\displaystyle\Lambda^{n}(zI-T)\left(e_{1}\wedge\cdots\wedge e_{n}\right)&=&\displaystyle(zI-T)e_{1}\wedge\cdots\wedge(zI-T)e_{n}\\[2 em]&=&\displaystyle\sum_{k=0}^{n}(-1)^{k}z^{n-k}\sum_{i_{1}<\cdots<i_{k}}(-1)^{i_{1}+\cdots+i_{k}-1-\cdots-k}Te_{i_{1}}\wedge\cdots\wedge Te_{i_{k}}\wedge\bigwedge_{{j=1}\atop{j\neq i_{1},\ldots,i_{k}}}^{n}e_{j}\\[2 em]&=&\displaystyle\sum_{k=0}^{n}(-1)^{k}z^{n-k}\sum_{i_{1}<\cdots<i_{k}}(-1)^{i_{1}+\cdots+i_{k}-1-\cdots-k}\bigwedge_{j=1}^{k}\left(a_{1i_{j}}e_{1}+\cdots+a_{ni_{j}}e_{n}\right)\wedge\bigwedge_{{j=1}\atop{j\neq i_{1},\ldots,i_{k}}}^{n}e_{j}\\[2 em]&=&\displaystyle\sum_{k=0}^{n}(-1)^{k}z^{n-k}\sum_{i_{1}<\cdots<i_{k}}(-1)^{i_{1}+\cdots+i_{k}-1-\cdots-k}\bigwedge_{j=1}^{k}\left(a_{i_{1}i_{j}}e_{i_{1}}+\cdots+a_{i_{k}i_{j}}e_{i_{k}}\right)\wedge\bigwedge_{{j=1}\atop{j\neq i_{1},\ldots,i_{k}}}^{n}e_{j}\end{array}

Define a k-dimensional vector space W:=\text{span}\left\{e_{i_{1}},\ldots,e_{i_{k}}\right\}and an operator S_{i_{1}\cdots i_{k}}:W\rightarrow W defined by

\displaystyle S_{i_{1}\cdots i_{k}}e_{i_{j}}=a_{i_{1}i_{j}}e_{i_{1}}+\cdots+a_{i_{k}i_{j}}e_{i_{k}},\indent\forall j=1,\ldots,k

In other words, S_{i_{1}\cdots i_{k}} is the operator whose matrix with respect to the basis \left\{e_{i_{1}},\ldots,e_{i_{k}}\right\} is the k\times k principal submatrix with index set I=\left\{i_{1},\ldots,i_{k}\right\}. The determinant of S_{i_{1}\cdots i_{k}} is the scalar of the rank-1 operator

\begin{array}{lcl}\displaystyle\Lambda^{k}(S_{i_{1}\cdots i_{k}})\left(e_{i_{1}}\wedge\cdots\wedge e_{i_{k}}\right)=S_{i_{1}\cdots i_{k}}e_{i_{1}}\wedge\cdots\wedge S_{i_{1}\cdots i_{k}}e_{i_{k}}&=&\displaystyle\bigwedge_{j=1}^{k}\left(a_{i_{1}i_{j}}e_{i_{1}}+\cdots+a_{i_{k}i_{j}}e_{i_{k}}\right)\\[2 em]&=&\displaystyle\det(S_{i_{1}\cdots i_{k}})e_{i_{1}}\wedge\cdots\wedge e_{i_{k}}\end{array}

Equivalently, \det(S_{i_{1}\cdots i_{k}}) is the k\times k principal minor with index set I=\left\{i_{1},\ldots,i_{k}\right\}. Substituting this result into our last expression for \Lambda^{k}(zI-T)(e_{1}\wedge\cdots\wedge e_{n}), we see that

\displaystyle\Lambda^{n}(zI-T)\left(e_{1}\wedge\cdots\wedge e_{n}\right)=\sum_{k=0}^{n}(-1)^{k}z^{n-k}\sum_{i_{1}<\cdots<i_{k}}\det(S_{i_{1}\cdots i_{k}})e_{1}\wedge\cdots\wedge e_{n},

which shows that

\displaystyle\det(zI-T)=\sum_{k=0}^{n}(-1)^{k}z^{n-k}\sum_{i_{1}<\cdots<i_{k}}\det(S_{i_{1}\cdots i_{k}})

Note that \sum_{i_{1}<\cdots<i_{k}}\det(S_{i_{1}\cdots i_{k}}) is the sum of all the k\times k principal minors, so

\displaystyle p_{T}(z)=z^{n}-E_{1}(A)z^{n-1}+\cdots+(-1)^{n-1}E_{n-1}(A)z+(-1)^{n}E_{n}(A)


Since the matrix A has rank k, E_{r}(A)=0, for r>k. Hence, the characteristic polynomial simplifies to

\displaystyle p_{A}(z)=z^{n}-E_{1}(A)z^{n-1}+\cdots+(-1)^{k}E_{k}(A)z^{n-k}=z^{n-k}\left(z^{k}-E_{1}(A)z^{k-1}+\cdots+(-1)^{k}E_{k}(A)\right)

Define a polynomial q_{A} by q_{A}(z):=z^{k-n}p_{A}(z). Since A is Hermitian, all the eigenvalues of A are real; equivalently, all the roots of the polynomials p_{A} and q_{A} are real. Since E_{k}(A)>0, the polynomial q_{A} only has nonzero roots. By Descartes’ rule of signs, the number of negative roots of q_{A} is equal to the number of sign changes of the  nonzero coefficients of the polynomial


But \tilde{q}_{A} has no sign changes, since all the E_{j}(A) are nonnegative. We conclude that all the roots of q_{A} are positive, which implies that all roots of the characteristic polynomial p_{a} are nonnegative. Equivalently, all the eigenvalues of A are nonnegative, which implies that A is positive-semidefinite.

This entry was posted in math.AC and tagged , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s