Sylvester’s Criterion for Positive-Semidefinite Matrices I

A few weeks ago, we proved that an n\times n Hermitian matrix is positive-definite if and only if its n leading principal minors are positive. This result is known as Sylvester’s criterion, after the nineteenth-century English mathematician J.J. Sylvester. Disappointingly, no one asked whether we can relax the hypothesis to positive-semidefinite. The answer is no, not if we only expect to check the leading principal minors. Instead, we have to check whether all of the principal minors are nonnegative. Since an n\times n matrix has 2^{n}-1 nonempty principal minors, this added burden is probably an indication that Sylvester’s criterion is not computationally useful when n is large. Perhaps someone more knowledgeable about numerical analysis will weigh in on this matter.
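For instance, the Hermitian matrix

\displaystyle A=\begin{bmatrix}0&0\\0&-1\end{bmatrix}

has leading principal minors 0 and 0, both nonnegative, yet \langle{Ae_{2},e_{2}}\rangle=-1<0, so A is not positive-semidefinite. The failure is caught by the 1\times 1 principal minor formed from the second row and column, which equals -1.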

Here’s the statement of the generalized Sylvester’s criterion:

Theorem 1. An n\times n Hermitian matrix A=(a_{ij}) is positive-semidefinite if and only if all of its principal minors are nonnegative.
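Before diving into the proof, here is a minimal brute-force sketch of the criterion in Python (the function name and round-off tolerance are my own illustrative choices, not anything from a standard library): it enumerates all 2^{n}-1 nonempty principal minors, which is exactly why this test is impractical for large n.

```python
import itertools

import numpy as np

def is_psd_by_minors(A, tol=1e-10):
    """Brute-force Sylvester test: a Hermitian A is positive-semidefinite
    iff every principal minor is nonnegative (Theorem 1)."""
    n = A.shape[0]
    for k in range(1, n + 1):
        for rows in itertools.combinations(range(n), k):
            # A principal submatrix uses the same index set for rows and columns.
            minor = np.linalg.det(A[np.ix_(rows, rows)])
            # Principal minors of a Hermitian matrix are real; tol absorbs round-off.
            if minor.real < -tol:
                return False
    return True

# The counterexample above: both leading principal minors vanish, but the
# principal minor from the second row and column equals -1.
A = np.array([[0.0, 0.0], [0.0, -1.0]])
print(is_psd_by_minors(A))  # False
```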

Unfortunately, the proof is not as elegant as the one for Sylvester’s criterion for positive-definiteness. We’ll break it up into a few lemmas to simplify the exposition, beginning with the matrix product representation of a k\times k principal submatrix.

In case you’re human and you forget things, a k\times k submatrix A_{IJ} is constructed from A as follows: take a subset of k column indices J=\left\{j_{1},\ldots,j_{k}\right\} and a subset of k row indices I=\left\{i_{1},\ldots,i_{k}\right\} and define A_{IJ} by

\displaystyle A_{IJ}=\begin{bmatrix}a_{i_{1}j_{1}}&\cdots&a_{i_{1}j_{k}}\\ \vdots&\vdots&\vdots\\a_{i_{k}j_{1}}&\cdots&a_{i_{k}j_{k}}\end{bmatrix}

If I=J, then we say that submatrix is principal. The associated minor is the determinant of the matrix A_{IJ}.

A useful way to think about minors is from the perspective of linear transformations. Let I and J be as above, let e_{1},\ldots,e_{n} denote the standard basis for \mathbb{F}^{n}, let \tilde{e}_{1},\ldots,\tilde{e}_{k} denote the standard basis for \mathbb{F}^{k}, and let \langle{\cdot,\cdot}\rangle denote the Euclidean inner product. Define operators P_{I}, P_{J}:\mathbb{F}^{n}\rightarrow\mathbb{F}^{k} by

\displaystyle P_{I}v=\langle{v,e_{i_{1}}}\rangle e_{i_{1}}+\cdots+\langle{v,e_{i_{k}}}\rangle e_{i_{k}},\indent\forall v\in\mathbb{F}^{n}

and

\displaystyle P_{J}v=\langle{v,e_{j_{1}}}\rangle e_{j_{1}}+\cdots+\langle{v,e_{j_{k}}}\rangle e_{j_{k}},\indent\forall v\in\mathbb{F}^{n}

We’re abusing notation slightly above by identifying \mathbb{F}^{k} with an isomorphic subspace of \mathbb{F}^{n}. For r=1,\ldots,k, the adjoint P_{J}^{*} maps \tilde{e}_{r} to e_{j_{r}}. So,

\begin{array}{lcl}\displaystyle P_{I}AP_{J}^{*}\tilde{e}_{r}=P_{I}Ae_{j_{r}}&=&P_{I}\left(a_{1j_{r}}e_{1}+\cdots+a_{nj_{r}}e_{n}\right)\\[.9 em]&=&\displaystyle a_{1j_{r}}P_{I}e_{1}+\cdots+a_{nj_{r}}P_{I}e_{n}\\[.9 em]&=&\displaystyle a_{i_{1}j_{r}}e_{i_{1}}+\cdots+a_{i_{k}j_{r}}e_{i_{k}}\end{array}

We can therefore write A_{IJ}=P_{I}AP_{J}^{*}.
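If we drop the abuse of notation for a moment, P_{I} is simply the k\times n matrix whose r-th row is e_{i_{r}}^{*}. For example, with n=3 and I=J=\left\{1,3\right\},

\displaystyle P_{I}=\begin{bmatrix}1&0&0\\0&0&1\end{bmatrix},\indent P_{I}AP_{I}^{*}=\begin{bmatrix}a_{11}&a_{13}\\a_{31}&a_{33}\end{bmatrix},

so conjugating by P_{I} strikes the second row and column of A.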

The necessity of Sylvester’s criterion is the easy part of the proof. In what follows, A_{I}:=A_{II} denotes the principal submatrix with row and column index set I. If A is positive-semidefinite, then for every w\in\mathbb{C}^{k},

\displaystyle\langle{A_{I}w,w}\rangle=\langle{P_{I}AP_{I}^{*}w,w}\rangle=\langle{AP_{I}^{*}w,P_{I}^{*}w}\rangle=\langle{Av,v}\rangle\geq0,

where v:=P_{I}^{*}w\in\mathbb{C}^{n}. Hence, the principal submatrix A_{I} is positive-semidefinite, so its eigenvalues are all nonnegative, and its determinant, being the product of those eigenvalues, is nonnegative.

For sufficiency, we’ll need the existence of a nonsingular principal submatrix of a positive-semidefinite matrix.

Lemma 2. Let A=(a_{ij}) be an n\times m complex matrix with rank k>0. If v_{1},\ldots,v_{k} are k linearly independent rows of A and w_{1},\ldots,w_{k} are k linearly independent columns of A, then the k\times k matrix formed by striking the remaining n-k rows and m-k columns of A is nonsingular.

Proof. By hypothesis, the n-tuples

\displaystyle w_{1}=\begin{bmatrix}a_{1j_{1}}\\ \vdots\\a_{nj_{1}}\end{bmatrix},\ldots,w_{k}=\begin{bmatrix}a_{1j_{k}}\\ \vdots\\a_{nj_{k}}\end{bmatrix}

where j_{1}<\cdots<j_{k}, and the m-tuples

\displaystyle v_{1}=\begin{bmatrix} a_{i_{1}1}&\cdots&a_{i_{1}m}\end{bmatrix},\ldots,v_{k}=\begin{bmatrix}a_{i_{k}1}&\cdots&a_{i_{k}m}\end{bmatrix}

where i_{1}<\cdots<i_{k}, are linearly independent in \mathbb{C}^{n} and \mathbb{C}^{m}, respectively. We need to show that the k\times k matrix

\displaystyle A_{k}=\begin{bmatrix}a_{i_{1}j_{1}}&\cdots&a_{i_{1}j_{k}}\\ \vdots&\vdots&\vdots\\a_{i_{k}j_{1}}&\cdots&a_{i_{k}j_{k}}\end{bmatrix}

is nonsingular.

Consider the n\times k matrix U consisting of the column vectors w_{1},\ldots,w_{k}, which has rank k since its columns were hypothesized to be linearly independent.

\displaystyle U=\begin{bmatrix}a_{1j_{1}}&\cdots&a_{1j_{k}}\\a_{2j_{1}}&\cdots&a_{2j_{k}}\\ \vdots&\vdots&\vdots\\a_{nj_{1}}&\cdots&a_{nj_{k}}\end{bmatrix}

Since A has rank k, the k linearly independent rows v_{1},\ldots,v_{k} form a basis for the row space of A. Hence, for every row index i, there exist scalars b_{ir} such that the i-th row of A equals b_{i1}v_{1}+\cdots+b_{ik}v_{k}. Keeping only the entries in columns j_{1},\ldots,j_{k}, the i-th row of U satisfies

\begin{array}{lcl}\displaystyle\begin{bmatrix}a_{ij_{1}}&\cdots&a_{ij_{k}}\end{bmatrix}&=&\displaystyle b_{i1}\begin{bmatrix}a_{i_{1}j_{1}}&\cdots&a_{i_{1}j_{k}}\end{bmatrix}+\cdots+b_{ik}\begin{bmatrix}a_{i_{k}j_{1}}&\cdots&a_{i_{k}j_{k}}\end{bmatrix}\\[2 em]&=&\displaystyle\begin{bmatrix}{\sum_{r=1}^{k}b_{ir}a_{i_{r}j_{1}}}&\cdots&{\sum_{r=1}^{k}b_{ir}a_{i_{r}j_{k}}}\end{bmatrix}\end{array}

The scalars b_{ir}, where 1\leq i\leq n and 1\leq r\leq k, form an n\times k matrix B, and the identity above says precisely that U=BA_{k}. Observe that we have the inequalities

\displaystyle k=\text{rank}(U)\leq\min\left\{\text{rank}(B),\text{rank}(A_{k})\right\}\text{ and }\text{rank}(A_{k})\leq k

which imply that \text{rank}(A_{k})=k. Equivalently, A_{k} is nonsingular. \Box
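As a quick sanity check, take the rank-one matrix

\displaystyle A=\begin{bmatrix}1&2\\2&4\\3&6\end{bmatrix}

with the first row and the first column as the chosen linearly independent row and column. Striking the remaining rows and columns leaves the 1\times 1 matrix \begin{bmatrix}1\end{bmatrix}, which is indeed nonsingular.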

Returning to the general setting, any k'\times k' submatrix A_{IJ} of a rank-k matrix A, where k'>k, is singular. To see this, we appeal to the product decomposition A_{IJ}=P_{I}AP_{J}^{*}, where I and J are subsets of k' row and column indices, respectively. The matrices P_{I} and P_{J}^{*} each have rank k', while A has rank k. Hence,

\displaystyle\text{rank}(A_{IJ})\leq\min\left\{\text{rank}(P_{I}),\text{rank}(AP_{J}^{*})\right\}\leq\min\left\{\text{rank}(A),\text{rank}(P_{J}^{*})\right\}=k,

so \text{rank}(A_{IJ})\leq k<k' and A_{IJ} is singular.
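For instance, every 2\times 2 submatrix of the rank-one example above has proportional rows, so its determinant vanishes.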

Corollary 3. Suppose an n\times n Hermitian matrix A has rank k>0. Then there exists a nonsingular k\times k principal submatrix of A.

Proof. Suppose v_{1},\ldots,v_{k} are linearly independent column vectors of A. I claim that the adjoint vectors v_{1}^{*},\ldots,v_{k}^{*} are also linearly independent. Indeed, suppose there are scalars \alpha_{1},\ldots,\alpha_{k} such that

\begin{bmatrix} 0\\ \vdots\\ 0\end{bmatrix}=\alpha_{1}\begin{bmatrix}\overline{v}_{11}\\ \vdots\\ \overline{v}_{n1}\end{bmatrix}+\cdots+\alpha_{k}\begin{bmatrix}\overline{v}_{1k}\\\vdots\\ \overline{v}_{nk}\end{bmatrix}=\begin{bmatrix}\overline{\overline{\alpha}_{1}v_{11}+\cdots+\overline{\alpha}_{k}v_{1k}}\\ \vdots\\ \overline{\overline{\alpha}_{1}v_{n1}+\cdots+\overline{\alpha}_{k}v_{nk}}\end{bmatrix}

Conjugating each entry of the equation above gives \overline{\alpha}_{1}v_{r1}+\cdots+\overline{\alpha}_{k}v_{rk}=0 for r=1,\ldots,n. Since for any scalars \beta_{1},\ldots,\beta_{k},

\beta_{1}\begin{bmatrix} v_{11}\\ \vdots\\ v_{n1}\end{bmatrix}+\cdots+\beta_{k}\begin{bmatrix}v_{1k}\\ \vdots\\ v_{nk}\end{bmatrix}=\begin{bmatrix} {\beta_{1}v_{11}+\cdots+\beta_{k}v_{1k}}\\ \vdots\\ {\beta_{1}v_{n1}+\cdots+\beta_{k}v_{nk}}\end{bmatrix},

taking \beta_{1}=\overline{\alpha}_{1},\ldots,\beta_{k}=\overline{\alpha}_{k} shows that \overline{\alpha}_{1}v_{1}+\cdots+\overline{\alpha}_{k}v_{k}=0. The linear independence of v_{1},\ldots,v_{k} forces \overline{\alpha}_{1}=\cdots=\overline{\alpha}_{k}=0, hence \alpha_{1}=\cdots=\alpha_{k}=0, and we conclude that v_{1}^{*},\ldots,v_{k}^{*} are linearly independent.

By the definition of the rank of a matrix, there exist k columns with indices j_{1},\ldots,j_{k} that are linearly independent. Since A is Hermitian, the rows with indices i_{1}:=j_{1},\ldots,i_{k}:=j_{k} are the adjoints of these columns, and therefore they are linearly independent by the claim above. By the preceding lemma, the resulting k\times k principal submatrix is nonsingular. \Box
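For example, the Hermitian matrix A=\begin{bmatrix}1&1\\1&1\end{bmatrix} has rank 1, and each of its 1\times 1 principal submatrices is \begin{bmatrix}1\end{bmatrix}, which is nonsingular, even though the full 2\times 2 determinant vanishes.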



3 Responses to Sylvester’s Criterion for Positive-Semidefinite Matrices I

  1. Pingback: Sylvester’s Criterion for Positive-Semidefinite Matrices II | Math by Matt

  2. Pierluigi says:

    If you want another criterion that includes both cases and is computationally much simpler, just send me an e-mail (it is not public, but I guess you can see it).
    P.

  3. Anonymous says:

    Finally I remembered, the theorem is:
    Theorem 3.3.12, p. 96, in
    Mokhtar S. Bazaraa, Hanif D. Sherali, and C. M. Shetty,
    Nonlinear Programming: Theory and Algorithms,
    Second Edition,
    John Wiley and Sons, 2004, 638 pages.
