## Self-Adjoint Operators: Monotone Convergence Theorem, Existence of Square Roots

One of the many irons in the fire that is my ongoing mathematical learning is a review of functional analysis motivated by my study of probability theory and stochastic processes. As I’ve mentioned before, I’ve been using the great text by Adam Bobrowski. In his discussion of discrete-time martingales, Bobrowski introduces a convergence theorem for monotone sequences of nonnegative (sometimes called positive semi-definite) operators. I would like to give an exposition of the theorem, including a full proof. I would then like to give an application of this theorem, proving the existence of a nonnegative square root for a nonnegative self-adjoint operator.

We say that a self-adjoint operator $A: \mathbb{H}\rightarrow\mathbb{H}$ is nonnegative if $\langle{Ax,x}\rangle \geq 0$ for all $x\in\mathbb{H}$; we write $A \geq 0$. If $A,B$ are two self-adjoint operators such that $A-B\geq 0$, then we write $A\geq B$.

Lemma 1. If $A$ is nonnegative, then $A^{n}$ is nonnegative for any $n\in\mathbb{Z}^{\geq 0}$.

Proof. If $n$ is even, then

$\displaystyle\langle{A^{n}x,x}\rangle=\langle{A^{\frac{n}{2}}x,A^{\frac{n}{2}}x}\rangle\geq 0,\indent\forall x\in\mathbb{H}$

Similarly, if $n$ is odd and $n > 1$ (the case $n=1$ being the hypothesis), then

$\displaystyle\langle{A^{n}x,x}\rangle=\langle{A^{\frac{n-1}{2}}A^{\frac{n-1}{2}+1}x,x}\rangle=\langle{A(A^{\frac{n-1}{2}}x),A^{\frac{n-1}{2}}x}\rangle\geq 0,\indent\forall x\in\mathbb{H}$

$\Box$

Lemma 2. (Nonnegative Operator Inequality) Let $A$ be a nonnegative, self-adjoint operator on a Hilbert space $\mathbb{H}$. Then $\langle{A^{2}x,x}\rangle\leq\left\|A\right\|\langle{Ax,x}\rangle$ for all $x\in\mathbb{H}$.

Proof. If $\left\|A\right\|=0$, then there is nothing to prove, so assume otherwise. Observe that proving the inequality above is equivalent to showing $\langle{B^{2}x,x}\rangle\leq\langle{Bx,x}\rangle$, where $B:=\frac{1}{\left\|A\right\|}A$, so that $\left\|B\right\|=1$. Since $A$ is nonnegative, $\langle{Bx,x}\rangle\geq 0$. Observe that $I-B$ is self-adjoint, being the difference of self-adjoint operators. By Cauchy-Schwarz, $\langle{Bx,x}\rangle\leq\left\|B\right\|\left\|x\right\|^{2}=\left\|x\right\|^{2}$, so

$\displaystyle\langle{(I-B)x,x}\rangle=\left\|x\right\|^{2}-\langle{Bx,x}\rangle \geq 0,\indent\forall x\in\mathbb{H}$

Since $B-B^{2}=B(I-B)B+(I-B)B(I-B)$, we see that

$\begin{array}{lcl}\displaystyle\langle{(B-B^{2})x,x}\rangle&=&\displaystyle\langle{B(I-B)Bx,x}\rangle+\langle{(I-B)B(I-B)x,x}\rangle\\&=&\displaystyle\langle{(I-B)Bx,Bx}\rangle+\langle{B(I-B)x,(I-B)x}\rangle\\&\geq&\displaystyle 0\end{array}$,

which completes the proof. $\Box$
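The inequality of Lemma 2 is easy to sanity-check numerically in finite dimensions. The following is only a sketch: the $2\times 2$ matrix `A`, the helper names, and the random sampling are illustrative assumptions of mine, not part of the argument above.

```python
import math
import random

def matvec(A, x):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def inner(x, y):
    """Real inner product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm_sym_2x2(A):
    """Operator norm of a symmetric nonnegative 2x2 matrix (its largest eigenvalue)."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    return (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)

# Hypothetical example: a symmetric matrix with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
norm_A = norm_sym_2x2(A)

random.seed(0)
worst_gap = float("inf")
for _ in range(1000):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    Ax = matvec(A, x)
    lhs = inner(matvec(A, Ax), x)   # <A^2 x, x>
    rhs = norm_A * inner(Ax, x)     # ||A|| <A x, x>
    worst_gap = min(worst_gap, rhs - lhs)

# Lemma 2 predicts a nonnegative gap, up to floating-point rounding.
print(norm_A, worst_gap)
```

For this particular matrix the gap is $\langle{(3A-A^{2})x,x}\rangle=(x_{1}-x_{2})^{2}$, so the inequality is sharp along the direction $x_{1}=x_{2}$.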

We now prove a convergence theorem for monotone sequences of self-adjoint operators.

Theorem 3. Suppose $(A_{n})_{n=1}^{\infty}$ is a sequence of self-adjoint operators on a Hilbert space $\mathbb{H}$ such that $A_{n}\leq A_{n+1}\leq MI$, for all $n \in \mathbb{Z}^{\geq 1}$, where $M \in\mathbb{R}$ is some constant. Then there exists a self-adjoint operator $A: \mathbb{H} \rightarrow\mathbb{H}$ such that

$\displaystyle Ax=\lim_{n\rightarrow\infty}A_{n}x,\indent\forall x\in\mathbb{H}$

In other words, $A_{n}$ converges to $A$ strongly (but not necessarily in the operator norm).

Proof. For any $x\in \mathbb{H}$, the real sequence $(\langle{A_{n}x,x}\rangle)_{n=1}^{\infty}$ is nondecreasing and bounded from above by $M\left\|x\right\|^{2}$, hence converges to a real number $F(x)$. For any $x,y\in\mathbb{H}$, the polarization identity lets us write

$\displaystyle\langle{A_{n}x,y}\rangle=\dfrac{1}{4}\left[\langle{A_{n}(x+y),x+y}\rangle-\langle{A_{n}(x-y),x-y}\rangle+i\langle{A_{n}(x+iy),x+iy}\rangle-i\langle{A_{n}(x-iy),x-iy}\rangle\right]$

so the limit $G(x,y) := \lim_{n \rightarrow\infty}\langle{A_{n}x,y}\rangle$ exists.

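For any $n\in\mathbb{Z}^{\geq 1}$, monotonicity gives $\langle{A_{1}x,x}\rangle\leq\langle{A_{n}x,x}\rangle$, so by Cauchy-Schwarz,

$\displaystyle -\left\|A_{1}\right\|\left\|x\right\|^{2}\leq\langle{A_{n}x,x}\rangle\leq M\left\|x\right\|^{2},\indent\forall x\in\mathbb{H}$

Since the norm of a self-adjoint operator $T$ equals $\sup_{\left\|x\right\|=1}\left|\langle{Tx,x}\rangle\right|$, it follows that $\left\|A_{n}\right\|\leq M':=\max\left\{M,\left\|A_{1}\right\|\right\}$. By another application of Cauchy-Schwarz, we see that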

$\displaystyle\left|G(x,y)\right|\leq M'\left\|x\right\|\left\|y\right\|,\indent\forall x,y\in\mathbb{H}$

Hence, for $x$ fixed, $\overline{G(x,y)}$ is a bounded complex-linear functional with respect to $y\in\mathbb{H}$. By the Riesz representation theorem, there exists a unique element $Ax\in\mathbb{H}$ such that

$\displaystyle\langle{y,Ax}\rangle=\overline{G(x,y)} \Longleftrightarrow G(x,y)=\overline{\langle{y,Ax}\rangle}=\langle{Ax,y}\rangle,\indent\forall y\in\mathbb{H}$

$A$ is complex-linear, since

$\begin{array}{lcl}\displaystyle\langle{A(\alpha x+\beta x'),y}\rangle=G(\alpha x+\beta x',y)=\lim_{n\rightarrow\infty}\langle{A_{n}(\alpha x+\beta x'),y}\rangle&=&\displaystyle\lim_{n\rightarrow\infty}\alpha\langle{A_{n}x,y}\rangle+\beta\langle{A_{n}x',y}\rangle\\&=&\displaystyle\alpha G(x,y)+\beta G(x',y)\\&=&\displaystyle\langle{\alpha Ax+\beta Ax',y}\rangle\end{array}$

for all $y\in\mathbb{H}$. Taking $y=Ax$ in the bound $\left|G(x,y)\right|\leq M'\left\|x\right\|\left\|y\right\|$ gives $\left\|Ax\right\|\leq M'\left\|x\right\|$, hence $\left\|A\right\|\leq M'$. $A$ is self-adjoint, since $A_{n}$ is self-adjoint for all $n$ and

$\displaystyle\langle{Ax,y}\rangle=\lim_{n\rightarrow\infty}\langle{A_{n}x,y}\rangle=\lim_{n\rightarrow\infty}\langle{x,A_{n}y}\rangle=\langle{x,Ay}\rangle$

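for all $x,y\in\mathbb{H}$. Being the difference of two self-adjoint operators, $A-A_{n}$ is self-adjoint, and it is nonnegative because $\langle{A_{m}x,x}\rangle$ increases to $\langle{Ax,x}\rangle$ as $m\rightarrow\infty$. By Lemma 2,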

$\displaystyle\left\|(A-A_{n})x\right\|^{2}=\langle{(A-A_{n})^{2}x,x}\rangle\leq\left\|A-A_{n}\right\|\langle{(A-A_{n})x,x}\rangle\leq 2M'\langle{(A-A_{n})x,x}\rangle$

This last quantity converges to $0$ as $n\rightarrow\infty$, for $x$ fixed, which completes the proof. $\Box$
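The polarization identity used in the proof depends on the convention that the inner product is linear in its first argument and conjugate-linear in its second. Here is a small numerical check under that convention; the matrix `A` and the vectors `x`, `y` are arbitrary choices of mine, made only for illustration.

```python
def inner(u, v):
    """Inner product on C^n: linear in the first slot, conjugate-linear in the second."""
    return sum(ui * vi.conjugate() for ui, vi in zip(u, v))

def matvec(A, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def quad(A, z):
    """Quadratic form <A z, z>."""
    return inner(matvec(A, z), z)

def add(x, y, c=1):
    """Return the vector x + c*y."""
    return [xi + c * yi for xi, yi in zip(x, y)]

# Hypothetical Hermitian matrix and test vectors.
A = [[2, 1 - 1j], [1 + 1j, 3]]
x = [1 + 2j, -1j]
y = [0.5, 2 - 1j]

lhs = inner(matvec(A, x), y)
rhs = 0.25 * (quad(A, add(x, y)) - quad(A, add(x, y, -1))
              + 1j * quad(A, add(x, y, 1j)) - 1j * quad(A, add(x, y, -1j)))

print(abs(lhs - rhs))       # should vanish up to rounding
print(quad(A, x).imag)      # the quadratic form of a Hermitian matrix is real
```

With the opposite convention (linear in the second argument), the signs of the two imaginary terms would swap.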

The preceding convergence theorem also applies to decreasing sequences of self-adjoint operators. If $(A_{n})_{n=1}^{\infty}$ is a sequence of self-adjoint operators such that $\langle{A_{n}x,x}\rangle \geq \langle{A_{n+1}x,x}\rangle \geq M\left\|x\right\|^{2}$, for some constant $M$, then the $A_{n}$ converge strongly to a self-adjoint operator $A$. This result follows immediately from applying the preceding theorem to the sequence of operators $B_{n}:=-A_{n}$.

We use the preceding convergence theorem to prove the existence of a square root of a nonnegative self-adjoint operator $A$.

Theorem 4. If $A$ is a self-adjoint operator such that $\langle{Ax,x}\rangle \geq 0$ for all $x\in\mathbb{H}$, then there exists a nonnegative self-adjoint operator $B$ such that $B^{2}=A$, and $B$ commutes with all operators that commute with $A$.

Proof. Without loss of generality, we may assume that $0 \leq A \leq I$. Indeed, if $A=0$, the theorem is obvious; otherwise, the operator $A':=\frac{1}{\left\|A\right\|}A$ satisfies $0\leq A'\leq I$, and if ${B'}^{2}=A'$, then $B:=\sqrt{\left\|A\right\|}B'$ satisfies $B^{2}=A$. Set $C:=I-A$, and observe that $0\leq C\leq I$. We will find $B$ by an iterative argument.

Define a sequence of operators inductively by $A_{0}:=0$ and $A_{n+1}:=\dfrac{1}{2}(C+A_{n}^{2})$, for $n \geq 0$. The intuition behind this recursive definition is that if $A_{\infty}=\lim_{n\rightarrow\infty}A_{n}$ exists (in the strong operator topology), then for any $x\in\mathbb{H}$,

$\displaystyle A_{\infty}x=\lim_{n\rightarrow\infty}\dfrac{1}{2}\left(C+A_{n}^{2}\right)x=\dfrac{1}{2}\left(C+A_{\infty}^{2}\right)x$,

so that

$\displaystyle Ax=A_{\infty}^{2}x-2A_{\infty}x+Ix=(I-A_{\infty})^{2}x$

Thus, $B=I-A_{\infty}$ is a square root of $A$, and it is nonnegative provided $A_{\infty}\leq I$.

Induction shows that each $A_{n}$ is a polynomial in $C$ with nonnegative coefficients; in particular, each $A_{n}$ is self-adjoint, nonnegative (by Lemma 1), and commutes with $A$. The same is then true of $A_{n}+A_{n-1}$. Since the $A_{n}$ commute with one another, we also have

$\begin{array}{lcl}\displaystyle A_{n+1}-A_{n}=\dfrac{1}{2}\left(C+A_{n}^{2}\right)-\dfrac{1}{2}\left(C+A_{n-1}^{2}\right)&=&\displaystyle\dfrac{1}{2}\left(A_{n}^{2}-A_{n-1}^{2}\right)\\&=&\displaystyle\dfrac{1}{2}\left(A_{n}-A_{n-1}\right)\left(A_{n}+A_{n-1}\right)\end{array}$,

so that, by induction starting from $A_{1}-A_{0}=\frac{1}{2}C$, $A_{n+1}-A_{n}$ is a polynomial in $C$ with nonnegative coefficients. Since $C^{k}\geq 0$ for every $k$ by Lemma 1, $A_{n+1}\geq A_{n}$ for every $n\geq 0$. I claim that $A_{n}\leq I$, for each $n\geq 0$. Clearly, $A_{0}=0\leq I$. If $0\leq A_{n}\leq I$, then, since $A_{n}$ is self-adjoint, $\left\|A_{n}\right\|\leq 1$, so that by Lemma 2, $A_{n}^{2}\leq \left\|A_{n}\right\|A_{n}\leq A_{n}$. Hence,

$\displaystyle A_{n+1}=\dfrac{1}{2}\left(C+A_{n}^{2}\right)\leq\dfrac{1}{2}\left(C+A_{n}\right)\leq I$

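By the monotone convergence theorem, there exists a strong limit $A_{\infty}$, and $A_{\infty}\leq I$ since $A_{n}\leq I$ for each $n$. As shown above, $B^{2}=A$, where $B=I-A_{\infty}\geq 0$. Finally, since each $A_{n}$ is a polynomial in $C=I-A$, it commutes with every bounded operator $T$ that commutes with $A$; taking the strong limit shows that $A_{\infty}$, and therefore $B$, commutes with every such $T$ as well. $\Box$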
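The recursion $A_{n+1}=\frac{1}{2}(C+A_{n}^{2})$ is concrete enough to run on a matrix. The sketch below is a finite-dimensional illustration only, under assumptions of my own: the function name `sqrt_psd`, the fixed iteration count, and the choice to rescale by the trace rather than by $\left\|A\right\|$ (for a nonnegative symmetric matrix the trace is at least the largest eigenvalue, so the rescaled matrix still lies between $0$ and $I$).

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def sqrt_psd(A, iterations=200):
    """Approximate the nonnegative square root of a symmetric nonnegative
    matrix A via A_{n+1} = (C + A_n^2) / 2 with C = I - A/c."""
    n = len(A)
    # Assumption: rescale by the trace c >= ||A|| instead of the operator norm.
    c = sum(A[i][i] for i in range(n))
    if c == 0:
        return [[0.0] * n for _ in range(n)]      # A = 0, so B = 0
    I = identity(n)
    C = [[I[i][j] - A[i][j] / c for j in range(n)] for i in range(n)]
    An = [[0.0] * n for _ in range(n)]            # A_0 = 0
    for _ in range(iterations):
        sq = matmul(An, An)
        An = [[0.5 * (C[i][j] + sq[i][j]) for j in range(n)] for i in range(n)]
    # I - A_n approximates the root of A/c; multiply by sqrt(c) to undo the scaling.
    s = c ** 0.5
    return [[s * (I[i][j] - An[i][j]) for j in range(n)] for i in range(n)]

# Hypothetical example with eigenvalues 1 and 3.
A = [[2.0, 1.0], [1.0, 2.0]]
B = sqrt_psd(A)
B2 = matmul(B, B)
print(B)
print(B2)   # should be close to A
```

On each eigenvalue $\lambda$ of the rescaled matrix, the scalar recursion converges to $1-\sqrt{\lambda}$ at a linear rate equal to that same number, so convergence slows down as $\lambda$ approaches $0$; a fixed iteration count is therefore a simplification.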

We now derive some useful corollaries from Theorem 3, which have applications to the convergence of discrete-time martingales.

Corollary 5. Suppose $\left\{\mathbb{H}_{n}\right\}_{n=1}^{\infty}$ is an increasing sequence of closed subspaces of a Hilbert space $\mathbb{H}$. Then the projections $P_{n}$ onto $\mathbb{H}_{n}$ converge strongly to the projection $P_{\infty}$ onto the closed subspace $\mathbb{H}_{\infty}:=\overline{\bigcup_{n\geq 1}\mathbb{H}_{n}}$.

If $\mathbb{H}=L^{2}(\Omega,\mathcal{F},\mathbb{P})$ and $\mathbb{H}_{n}=L^{2}(\Omega,\mathcal{F}_{n},\mathbb{P})$, where $\left\{\mathcal{F}_{n}\right\}_{n=1}^{\infty}$ is a filtration, then $\mathbb{H}_{\infty}=L^{2}(\Omega,\mathcal{F}_{\infty},\mathbb{P})$, where $\mathcal{F}_{\infty}=\sigma\left(\bigcup_{n\geq 1}\mathcal{F}_{n}\right)$.

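Proof. I first claim that $0\leq P_{n}\leq P_{n+1}\leq I$. Indeed, since $x-P_{n}x$ is orthogonal to $P_{n}x$,

$\displaystyle\langle{P_{n}x,x}\rangle=\langle{P_{n}x,(x-P_{n}x)+P_{n}x}\rangle=\left\|P_{n}x\right\|^{2}\leq\left\|P_{n}x\right\|^{2}+\left\|x-P_{n}x\right\|^{2}=\left\|x\right\|^{2}$

so $0\leq P_{n}\leq I$. Moreover, $\mathbb{H}_{n}\subset\mathbb{H}_{n+1}$ implies $P_{n}=P_{n}P_{n+1}$, so $\langle{P_{n}x,x}\rangle=\left\|P_{n}P_{n+1}x\right\|^{2}\leq\left\|P_{n+1}x\right\|^{2}=\langle{P_{n+1}x,x}\rangle$. By the monotone convergence theorem for self-adjoint operators, $P_{n}$ converges strongly to an operator $A_{\infty}:\mathbb{H}\rightarrow\mathbb{H}$. We now show that $A_{\infty}$ is the projection operator $P_{\infty}$.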

Since $P_{n}^{2}=P_{n}$ and $\left\|P_{n}\right\|\leq 1$, we have

$\displaystyle\left\|P_{n}^{2}x-P_{n}A_{\infty}x\right\|\leq\left\|P_{n}x-A_{\infty}x\right\|\rightarrow 0,\indent n\rightarrow\infty$

Hence $A_{\infty}x=\lim_{n\rightarrow\infty}P_{n}x=\lim_{n\rightarrow\infty}P_{n}^{2}x=\lim_{n\rightarrow\infty}P_{n}A_{\infty}x=A_{\infty}^{2}x$. Denote the range of $A_{\infty}$ by $\tilde{\mathbb{H}}_{\infty}$. Since $A_{\infty}^{2}=A_{\infty}$, we see that

$\displaystyle\lim_{n\rightarrow\infty}A_{\infty}x_{n}=y \Longrightarrow A_{\infty}y=\lim_{n\rightarrow\infty}A_{\infty}^{2}x_{n}=\lim_{n\rightarrow\infty}A_{\infty}x_{n}=y$

and therefore $\tilde{\mathbb{H}}_{\infty}$ is closed. Since $A_{\infty}$ is self-adjoint, $A_{\infty}$ is the orthogonal projection onto the subspace $\tilde{\mathbb{H}}_{\infty}$. We need to show that $\tilde{\mathbb{H}}_{\infty}=\mathbb{H}_{\infty}$. I claim that $\mathbb{H}_{n}\subset\tilde{\mathbb{H}}_{\infty}$ for each $n\geq 1$. Indeed, for $x\in \mathbb{H}_{n}$ and $m \geq n$, $P_{m}x=x$, so $A_{\infty}x=x\in\tilde{\mathbb{H}}_{\infty}$. Since $\tilde{\mathbb{H}}_{\infty}$ is closed, it follows that $\mathbb{H}_{\infty}\subset\tilde{\mathbb{H}}_{\infty}$. The reverse inclusion follows from observing that, for $x\in\tilde{\mathbb{H}}_{\infty}$, $x=A_{\infty}x=\lim_{n\rightarrow\infty}P_{n}x$, where $P_{n}x\in\mathbb{H}_{n}$.

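We know that $\mathbb{H}_{\infty}\subset L^{2}(\Omega,\mathcal{F}_{\infty},\mathbb{P})$, since every element of $\mathbb{H}_{\infty}$ is an $L^{2}$-limit, and hence an almost sure limit along a subsequence, of functions that are $\mathcal{F}_{n}$-measurable for some $n$, and the pointwise limit of $\mathcal{F}_{\infty}$-measurable functions is $\mathcal{F}_{\infty}$-measurable. For the reverse inclusion, I claim that $\mathcal{G}:=\left\{A\in\mathcal{F}:\mathbf{1}_{A}\in\mathbb{H}_{\infty}\right\}$ is a Dynkin system. Indeed, $\mathbf{1}_{\Omega}\in\mathbb{H}_{1}\subset\mathbb{H}_{\infty}$. If $A\in\mathcal{G}$, then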

$\displaystyle\mathbf{1}_{A^{c}}=\mathbf{1}_{\Omega}-\mathbf{1}_{A}\in\mathbb{H}_{\infty}\Longrightarrow A^{c}\in\mathcal{G}$

If $\left\{A_{n}\right\}_{n=1}^{\infty}\subset\mathcal{G}$ is a countable collection of pairwise disjoint sets, then by the dominated convergence theorem the partial sums of $\sum_{n=1}^{\infty}\mathbf{1}_{A_{n}}$ converge in $L^{2}$ to $\mathbf{1}_{A}$, where $A:=\bigcup_{n=1}^{\infty}A_{n}$; since $\mathbb{H}_{\infty}$ is closed, $\mathbf{1}_{A}\in \mathbb{H}_{\infty}$. Moreover, $\mathbb{H}_{n}\subset \mathbb{H}_{\infty}$, for each $n\geq 1$, so that $\bigcup_{n=1}^{\infty}\mathcal{F}_{n}\subset\mathcal{G}$. Since $\bigcup_{n=1}^{\infty}\mathcal{F}_{n}$ is closed under finite intersection, by Dynkin’s $\pi$-$\lambda$ lemma (see my notes), $\mathcal{F}_{\infty}\subset\mathcal{G}$. Since simple functions are dense in $L^{2}(\Omega,\mathcal{F}_{\infty},\mathbb{P})$ and $\mathbb{H}_{\infty}$ is a closed subspace, this implies that $L^{2}(\Omega,\mathcal{F}_{\infty},\mathbb{P})\subset\mathbb{H}_{\infty}$. $\Box$
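A toy version of Corollary 5 also shows why the convergence is strong rather than in operator norm. Take $\mathbb{H}_{n}$ to be the span of the first $n$ coordinate vectors in $\ell^{2}$. The sketch below truncates $\ell^{2}$ to `N` coordinates, which is an assumption made purely so the example can run; the vector `x` and the helper names are likewise my own.

```python
import math

N = 2000                                  # truncation level (assumption of this sketch)
x = [1.0 / (k + 1) for k in range(N)]     # x_k = 1/k, a square-summable sequence

def project(v, n):
    """Orthogonal projection onto the span of the first n coordinate vectors."""
    return [vk if k < n else 0.0 for k, vk in enumerate(v)]

def norm(v):
    return math.sqrt(sum(vk * vk for vk in v))

def residual(v, n):
    """Compute ||v - P_n v||."""
    return norm([a - b for a, b in zip(v, project(v, n))])

# For a fixed vector, the residuals shrink as n grows (strong convergence).
residuals = [residual(x, n) for n in (1, 10, 100, 1000)]
print(residuals)

# But a unit vector just outside H_n is annihilated by P_n, so the
# operator norm of I - P_n stays equal to 1 for every n < N.
e = [0.0] * N
e[1000] = 1.0
gap_on_e = residual(e, 1000)
print(gap_on_e)
```

In genuinely infinite dimensions the second observation holds for every $n$, which is exactly the gap between strong and norm convergence.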

Corollary 6. Let $(\mathcal{F}_{n})_{n=1}^{\infty}$ be a filtration of a probability space $(\Omega,\mathcal{F},\mathbb{P})$. For $X\in L^{1}(\Omega,\mathcal{F},\mathbb{P})$, the sequence defined by $X_{n}:=\mathbb{E}[X\mid\mathcal{F}_{n}]$ converges in $L^{1}$-norm to $X_{\infty}:=\mathbb{E}[X\mid\mathcal{F}_{\infty}]$, where $\mathcal{F}_{\infty}:=\sigma\left(\bigcup_{n=1}^{\infty}\mathcal{F}_{n}\right)$.

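Proof. Suppose first that $X\in L^{2}(\Omega,\mathcal{F},\mathbb{P})$. On $L^{2}$, conditional expectation given $\mathcal{F}_{n}$ is the orthogonal projection onto $L^{2}(\Omega,\mathcal{F}_{n},\mathbb{P})$, so by Hölder’s inequality,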

$\displaystyle\left\|X_{n}-X_{\infty}\right\|_{L^{1}}\leq\left\|X_{n}-X_{\infty}\right\|_{L^{2}} \rightarrow 0$,

by Corollary 5. Since $L^{2}(\Omega,\mathcal{F},\mathbb{P})$ is dense in $L^{1}(\Omega,\mathcal{F},\mathbb{P})$ and conditional expectation is a Markov operator with norm $1$, it follows that convergence holds on the entirety of $L^{1}(\Omega,\mathcal{F},\mathbb{P})$ (this is a standard $3\epsilon$-type argument, which I encourage the confused reader to work out). $\Box$
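To see Corollary 6 in action, here is a sketch on a finite probability space, where everything can be computed exactly. All specifics are assumptions for illustration: $\Omega=\{0,\dots,2^{K}-1\}$ with the uniform measure, $\mathcal{F}_{n}$ generated by the $2^{n}$ dyadic blocks, and an arbitrary random variable `X`. In this setting $\mathcal{F}_{K}$ already contains every subset, so $X_{\infty}=X$.

```python
import math

K = 10
size = 2 ** K      # Omega = {0, ..., 2^K - 1} with uniform probability

# Hypothetical random variable on Omega.
X = [math.sin(7.0 * w / size) + (w % 3) for w in range(size)]

def cond_exp(X, n):
    """E[X | F_n], where F_n is generated by the 2^n dyadic blocks of Omega."""
    block = len(X) >> n                 # block length 2^(K - n)
    out = []
    for start in range(0, len(X), block):
        avg = sum(X[start:start + block]) / block
        out.extend([avg] * block)       # constant on each block
    return out

def lp_dist(U, V, p):
    """L^p distance under the uniform probability measure."""
    return (sum(abs(u - v) ** p for u, v in zip(U, V)) / len(U)) ** (1.0 / p)

l1 = [lp_dist(cond_exp(X, n), X, 1) for n in range(K + 1)]
l2 = [lp_dist(cond_exp(X, n), X, 2) for n in range(K + 1)]

for n in range(K + 1):
    print(n, l1[n], l2[n])
```

The $L^{1}$ error is dominated by the $L^{2}$ error at every level, as in the displayed inequality, and the $L^{2}$ error is nonincreasing in $n$ because the $X_{n}$ are projections onto an increasing family of subspaces.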