## Sylvester’s Criterion for Positive Definiteness

Suppose $A=(a_{ij})$ is an $n\times n$ complex matrix. For $1\leq k\leq n$, we define the $k^{th}$ leading principal submatrix of $A$ to be the $k\times k$ matrix

$\displaystyle A_{k}:=\begin{bmatrix}a_{11}&\cdots&a_{1k}\\ \vdots&\ddots&\vdots\\a_{k1}&\cdots&a_{kk}\end{bmatrix}$

We define the $k^{th}$ leading principal minor of $A$ to be the determinant of $A_{k}$.

I prefer to think of $n\times n$ matrices as operators on $\mathbb{R}^{n}$ or $\mathbb{C}^{n}$, so what is the operator interpretation of the $k^{th}$ principal submatrix? Let $P:\mathbb{R}^{n}\rightarrow\mathbb{R}^{k}$ be the projection map of a vector onto its first $k$ coordinates with respect to the standard basis $e_{1},\ldots,e_{n}$. More precisely,

$\displaystyle Pv=P\left(\langle{v,e_{1}}\rangle e_{1}+\cdots+\langle{v,e_{n}}\rangle e_{n}\right)=\langle{v,e_{1}}\rangle e_{1}+\cdots+\langle{v,e_{k}}\rangle e_{k},\indent\forall v\in\mathbb{R}^{n}$

We can write $P$ as a matrix with respect to the standard bases of $\mathbb{R}^{n}$ and $\mathbb{R}^{k}$:

$P=\begin{bmatrix}1&0&\cdots&0&0&\cdots&0\\ 0&1&\cdots&0&0&\cdots&0\\ 0&0&\ddots&0&0&\cdots&0\\ 0&0&\cdots&1&0&\cdots&0\end{bmatrix},\indent P^{T}=\begin{bmatrix}1&0&\cdots&0\\ 0&1&\cdots&0\\ 0&0&\ddots&0\\ 0&0&\cdots&1\\ 0&0&\cdots&0\\ \vdots&\vdots&\vdots&\vdots\\ 0&0&\cdots&0\end{bmatrix}$

Then $A_{k}=PAP^{T}$.
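As a quick sanity check in code (the matrix `A` below is just an example I made up for illustration), the $k^{th}$ leading principal submatrix is the top-left $k\times k$ block, and it agrees with $PAP^{T}$:

```python
import numpy as np

# An arbitrary 4x4 symmetric matrix, purely for illustration.
A = np.array([[2., 1., 0., 0.],
              [1., 3., 1., 0.],
              [0., 1., 4., 1.],
              [0., 0., 1., 5.]])

k = 2
# The projection onto the first k coordinates, as a k x n matrix.
P = np.eye(k, A.shape[0])
A_k = P @ A @ P.T

# The same submatrix, obtained by slicing the top-left k x k block.
assert np.allclose(A_k, A[:k, :k])
```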

Why are leading principal minors useful? Well, suppose that you have located a critical point $\mathbf{x}=(x_{1},\ldots,x_{n})\in\mathbb{R}^{n}$ of a smooth function $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$. You want to determine whether this critical point is a local minimum. So, you compute the Hessian matrix $H(f)(\mathbf{x})$ at the point $\mathbf{x}$. The second derivative test says that $\mathbf{x}$ is a strict local minimum if the matrix $H(f)(\mathbf{x})$ is positive definite:

$\displaystyle\mathbf{y}H(f)(\mathbf{x})\mathbf{y}^{T}>0,\indent\forall\mathbf{y}\neq0\in\mathbb{R}^{n}$

Second-Derivative Test. Suppose $f:U\rightarrow\mathbb{R}$ is a $C^{2}$ function on an open set $U\subset\mathbb{R}^{n}$. If $x$ is a critical point of $f$ and $Hf(x)$ is positive definite, then $x$ is a strict local minimizer of $f$ on $U$.

Proof. Set $A:=Hf(x)$. Since $A$ is positive definite, the function $g(d):=\langle{Ad,d}\rangle$ satisfies $g(d)>0$ for all $d\neq 0$. Restricting the domain of $g$ to the compact unit sphere $S$, we see from Weierstrass’ theorem that $g$ attains its minimum $\alpha$ at some point $d_{0}\in S$, so $\alpha>0$. Normalizing an arbitrary nonzero vector $d\in\mathbb{R}^{n}$, we see that

$\displaystyle g(d)=\left\|d\right\|^{2}\left\langle{A\frac{d}{\left\|d\right\|},\frac{d}{\left\|d\right\|}}\right\rangle\geq\alpha\left\|d\right\|^{2},\indent\forall d\neq0$

Of course, the inequality also holds, with equality, when $d=0$.

If $\left\|d\right\|$ is sufficiently small, then Taylor’s theorem tells us that

$\begin{array}{lcl}\displaystyle f(x+d)=f(x)+\left\langle{\nabla f(x),d}\right\rangle+\dfrac{1}{2}\left\langle{Ad,d}\right\rangle+o\left(\left\|d\right\|^{2}\right)&=&\displaystyle f(x)+\dfrac{1}{2}g(d)+o\left(\left\|d\right\|^{2}\right)\\&\geq&\displaystyle f(x)+\left\|d\right\|^{2}\left(\dfrac{\alpha}{2}+\dfrac{o(\left\|d\right\|^{2})}{\left\|d\right\|^{2}}\right)\\&>&\displaystyle f(x),\end{array}$

if $\frac{\alpha}{2}+\frac{o(\left\|d\right\|^{2})}{\left\|d\right\|^{2}}>0$, which holds for all sufficiently small $\left\|d\right\|$ since $o(\left\|d\right\|^{2})/\left\|d\right\|^{2}\rightarrow 0$ as $d\rightarrow 0$. $\Box$
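As a concrete sanity check (the function here is one I chose for illustration, not from the post), the Hessian of $f(x,y)=x^{2}+xy+y^{2}$ at its critical point $(0,0)$ is constant, and its eigenvalues are positive, so the second derivative test certifies a strict local minimum:

```python
import numpy as np

# Hessian of f(x, y) = x^2 + x*y + y^2 (constant, since f is quadratic).
H = np.array([[2., 1.],
              [1., 2.]])

# For a symmetric matrix, positive definiteness is equivalent to all
# eigenvalues being positive; eigvalsh returns them in ascending order.
eigenvalues = np.linalg.eigvalsh(H)
print(eigenvalues)              # [1. 3.]
assert np.all(eigenvalues > 0)  # so (0, 0) is a strict local minimum
```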

This is where the leading principal minors come in.

Theorem 1. (Sylvester’s Criterion) A real, symmetric matrix $A$ is positive definite if and only if all of its leading principal minors are positive.

The proof presented below is adapted from George T. Gilbert’s American Mathematical Monthly article “Positive Definite Matrices and Sylvester’s Criterion.”

For necessity, it suffices to show that the $k^{th}$ leading principal submatrix $A_{k}$ is positive definite for $1\leq k\leq n$, since the determinant of a positive definite matrix, being the product of its (positive) eigenvalues, is positive. Using the decomposition $A_{k}=PAP^{T}$ obtained above and the hypothesis that $A$ is a positive definite matrix, we see that

$\displaystyle\langle{A_{k}v,v}\rangle=\langle{PAP^{T}v,v}\rangle=\langle{AP^{T}v,P^{T}v}\rangle>0,\indent\forall v\neq0\in\mathbb{R}^{k}$

For sufficiency, we first prove two lemmas.

Lemma 2. Let $v_{1},\ldots,v_{n}$ be a basis of a finite-dimensional vector space $V$. Suppose $W$ is a $k$-dimensional subspace of $V$. If $m<k$, then there exists a nonzero vector in $W$ which belongs to the span of $v_{m+1},\ldots,v_{n}$.

Proof. Let $w_{1},\ldots,w_{k}$ be a basis for $W$. If the statement of the lemma is false, then $W\cap\operatorname{span}(v_{m+1},\ldots,v_{n})=\{0\}$, so the $k+(n-m)$ vectors $w_{1},\ldots,w_{k},v_{m+1},\ldots,v_{n}$ are linearly independent. Since $m<k$, we have $k+(n-m)>n$, which contradicts $\dim V=n$. $\Box$

Lemma 3. Let $A$ be an $n\times n$ real, symmetric matrix. Suppose $W$ is a $k$-dimensional subspace of $\mathbb{R}^{n}$ such that

$\displaystyle w^{T}Aw>0,\indent\forall w\neq0\in W$

Then $A$ has at least $k$ (counting multiplicity) positive eigenvalues.

Proof. By the real spectral theorem, there is an orthonormal basis $v_{1},\ldots,v_{n}$ for $\mathbb{R}^{n}$ consisting of eigenvectors of $A$ with corresponding eigenvalues $\lambda_{1},\ldots,\lambda_{n}$, ordered so that $\lambda_{1},\ldots,\lambda_{m}>0$ and $\lambda_{m+1},\ldots,\lambda_{n}\leq 0$, where $m$ denotes the number of positive eigenvalues of $A$. Assume for the sake of a contradiction that $m<k$. By Lemma 2, there exists a nonzero vector $w\in W$ of the form

$\displaystyle w=a_{m+1}v_{m+1}+\cdots+a_{n}v_{n}$

Applying our hypothesis that the quadratic form defined by $A$ is positive on $W$, we obtain

$\begin{array}{lcl}\displaystyle w^{T}Aw=Aw\cdot w&=&\displaystyle\left(a_{m+1}\lambda_{m+1}v_{m+1}+\cdots+a_{n}\lambda_{n}v_{n}\right)\cdot\left(a_{m+1}v_{m+1}+\cdots+a_{n}v_{n}\right)\\&=&\displaystyle\lambda_{m+1}a_{m+1}^{2}+\cdots+\lambda_{n}a_{n}^{2}\\&>&\displaystyle0\end{array}$

However, we assumed that $\lambda_{m+1},\ldots,\lambda_{n}$ are all nonpositive, so that $\sum_{i=m+1}^{n}\lambda_{i}a_{i}^{2}\leq 0$, which is a contradiction. $\Box$

We prove the sufficiency of Sylvester’s criterion by induction on $n$. For the base case $n=1$, there is nothing to prove. Suppose that the sufficiency of Sylvester’s criterion holds for all $(n-1)\times(n-1)$ real, symmetric matrices. Let $W$ denote the subspace of $\mathbb{R}^{n}$ consisting of all vectors with zero $n^{th}$ coordinate, and, as above, let $P$ denote the projection of $\mathbb{R}^{n}$ onto the subspace spanned by the first $n-1$ standard basis vectors. Then

$\displaystyle w^{T}Aw=\left(Pw\right)^{T}\left(PAP^{T}\right)\left(Pw\right)=\left(Pw\right)^{T}A_{n-1}\left(Pw\right)>0,\indent\forall w\neq0\in W$

by our induction hypothesis, since the leading principal minors of $A_{n-1}$ are the first $n-1$ leading principal minors of $A$, and $Pw\neq0$ whenever $0\neq w\in W$. By the preceding lemma, $A$ has at least $n-1$ positive eigenvalues (counting multiplicity). The $n^{th}$ eigenvalue must also be positive: otherwise $\det A$, being the product of the eigenvalues, would satisfy $\det A\leq 0$, contradicting the positivity of the $n^{th}$ leading principal minor $\det A_{n}=\det A$. Hence all eigenvalues of $A$ are positive, and $A$ is positive definite. $\Box$
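Sylvester’s criterion translates directly into code. Here is a minimal NumPy sketch (the helper name `is_positive_definite` is mine, not standard):

```python
import numpy as np

def is_positive_definite(A: np.ndarray) -> bool:
    """Sylvester's criterion: a real symmetric matrix is positive
    definite iff every leading principal minor is positive."""
    assert np.allclose(A, A.T), "the criterion applies to symmetric matrices"
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

# A diagonally dominant symmetric matrix (positive definite) ...
assert is_positive_definite(np.array([[2., 1.], [1., 2.]]))
# ... and an indefinite one: minors are 1 > 0 but det = -3 < 0.
assert not is_positive_definite(np.array([[1., 2.], [2., 1.]]))
```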

I’m not sure how useful Sylvester’s criterion is for determining whether an $n\times n$ matrix $A$ is positive definite when $n$ is large; I didn’t study much numerical analysis as an undergraduate student. To apply Sylvester’s criterion, we need to know the signs of $n$ determinants, and as far as I know, there is no algorithm that computes the sign of a determinant faster than its value. So, we would need to compute $n$ determinants. Laplace expansion, which has arithmetic complexity $O(k!)$ for a $k\times k$ matrix, is infeasible when $n$ is large. So, we would probably use Gaussian elimination, which has arithmetic complexity $O(k^{3})$, to put each leading principal submatrix in upper-triangular form and read off the determinant. In fact, a single Gaussian elimination without row exchanges suffices: the $k^{th}$ leading principal minor is the product of the first $k$ pivots, so all $n$ minors come from one $O(n^{3})$ factorization.
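In practice, a common numerical test (a sketch of standard practice, not something taken from the proof above) is simply to attempt a Cholesky factorization, which exists precisely when the real symmetric matrix is positive definite and costs $O(n^{3})$ overall:

```python
import numpy as np

def is_positive_definite_cholesky(A: np.ndarray) -> bool:
    """Attempt a Cholesky factorization; np.linalg.cholesky raises
    LinAlgError exactly when the symmetric matrix A is not positive
    definite."""
    try:
        np.linalg.cholesky(A)
        return True
    except np.linalg.LinAlgError:
        return False

assert is_positive_definite_cholesky(np.array([[2., 1.], [1., 2.]]))
assert not is_positive_definite_cholesky(np.array([[1., 2.], [2., 1.]]))
```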

If we had additional information about the full matrix $A$, then the aforementioned approaches would be unnecessary. For instance, if we could easily find a basis with respect to which $A$ is upper triangular, then we could check positive definiteness by examining the diagonal entries. Recall that a matrix is positive definite if and only if all its eigenvalues are positive, and the diagonal entries of an upper triangular matrix are precisely its eigenvalues, repeated with multiplicity. Therefore we only need to check whether the diagonal entries are all positive.
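A quick illustration of that last fact (with an arbitrary triangular matrix of my choosing): the eigenvalues of a triangular matrix are exactly its diagonal entries.

```python
import numpy as np

# An arbitrary upper triangular matrix; its eigenvalues should be
# its diagonal entries 2, 3, 7, each with multiplicity one.
T = np.array([[2., 5., 1.],
              [0., 3., 4.],
              [0., 0., 7.]])

assert np.allclose(sorted(np.linalg.eigvals(T).real),
                   sorted(np.diag(T)))
```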


### 6 Responses to Sylvester’s Criterion for Positive Definiteness

1. dinesh says:

From a numerical perspective, won’t the power method help to determine if it is positive?

• matthewhr says:

If my understanding is correct, the power method only gives you the eigenvalue with the largest magnitude; however, a matrix is positive definite if and only if all its eigenvalues are positive. So, I’m not sure how knowing just one eigenvalue will help.

There may be a multi-step adaptation of the power method, which perhaps is closer to what you have in mind, but I am not familiar with it. If you find out more, please do let me know.

2. Dileep says:

Oh, I was thinking that if you have a numerical technique to estimate the lowest eigenvalue (or its corresponding eigenvector), then that would certainly tell you whether the matrix is positive definite or not. I thought the power method did that. Guess I was wrong.

3. Anonymous says:

The spectral decomposition of a real symmetric matrix can be done relatively easily using the Jacobi iteration algorithm, at least for moderately sized $n\times n$ matrices. The non-linear optimization objective functions I have encountered typically involve fewer than 20-30 variables, which is well within the capability of the Jacobi algorithm.