Martingales, Optional Stopping, and An Inequality of Khintchine and Kolmogorov

In today’s post, I want to show how we can prove inequalities for sums of i.i.d. random variables (i.e. random walks) using martingale arguments. The inequality I (ultimately) want to consider was first proven by Khintchine and Kolmogorov in 1925. I don’t know their original argument, but given that their work predates Doob’s monumental work on martingales, I expect it differs from the one presented here. Here is their result:

If \left\{X_{j}\right\}_{j=1}^{\infty} are mean-zero i.i.d. random variables essentially bounded by a constant B (i.e. \left|X_{j}\right|\leq B a.s.), and S_{j}:=X_{1}+\cdots+X_{j}, then

\displaystyle\mathbb{P}\left(\max_{1\leq j\leq n}\left|S_{j}\right|\leq\lambda\right)\leq\dfrac{(B+\lambda)^{2}}{\text{Var}(S_{n})},\quad \forall \lambda > 0,\ \forall n \geq 1
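Before developing the proof machinery, here is a minimal Monte Carlo sanity check of the statement (a sketch assuming NumPy; the step distribution, \lambda, n, and the trial count are arbitrary choices of mine). With Rademacher steps we have B = 1 and \text{Var}(S_{n}) = n, so the bound is explicit.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials, lam, B = 20, 100_000, 3.0, 1.0  # arbitrary choices

# Rademacher steps: mean zero, |X_j| <= B = 1, Var(X_1) = 1, Var(S_n) = n.
X = rng.choice([-1.0, 1.0], size=(trials, n))
S = np.cumsum(X, axis=1)

# Empirical P(max_{1 <= j <= n} |S_j| <= lambda).
p_hat = np.mean(np.abs(S).max(axis=1) <= lam)

# Khintchine-Kolmogorov bound: (B + lambda)^2 / Var(S_n).
bound = (B + lam) ** 2 / n

print(f"empirical probability: {p_hat:.4f}")  # should land well below the bound
print(f"upper bound:           {bound:.4f}")  # (1 + 3)^2 / 20 = 0.8
```

Note that the bound tends to 0 as n\rightarrow\infty, reflecting the fact that a random walk with nondegenerate steps cannot remain in a bounded window forever.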

We first review a few general results about martingales and stopping times.

I assume that the reader is familiar with Doob’s optional stopping theorem for martingales:

Theorem 1. Suppose \left\{X_{n}\right\}_{n=1}^{\infty} is a martingale adapted to a filtration \left\{\mathcal{F}_{n}\right\}_{n=0}^{\infty}, and S,T are bounded stopping times adapted to this filtration. If \mathbb{P}(S\leq T) = 1, then

\displaystyle\mathbb{E}[X_{T}\mid\mathcal{F}_{S}]=X_{S}\text{ a.s.}

where \mathcal{F}_{S} is the \sigma-algebra defined by

\displaystyle A \in \mathcal{F}_{S}\Longleftrightarrow A\in\mathcal{F}\text{ and }A\cap\left\{S\leq k\right\}\in\mathcal{F}_{k} \ \forall k\geq 1
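As a quick sanity check on this definition: if S\equiv m is the constant stopping time, then A\cap\left\{S\leq k\right\} equals \emptyset for k<m and A for k\geq m, so the condition reduces to A\in\mathcal{F}_{m}; that is, \mathcal{F}_{S}=\mathcal{F}_{m}, as one would hope.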

I will include a proof of this result in a subsequent post on martingales, but for now, we use it without proof. In what follows, all stopping times are assumed to be adapted to the filtration generated by the given martingale.

Recall the definition of a random walk: if \left\{X_{j}\right\}_{j=1}^{\infty} is a collection of i.i.d. random variables with values in \mathbb{R}^{m}, then \left\{S_{n}\right\}_{n=1}^{\infty} is a random walk, where S_{n}:=\sum_{j=1}^{n}X_{j}. We can use random walks to define martingales that allow us to compute moments of stopped random walks, as the next lemma shows.

Lemma 2. Suppose \left\{X_{j}\right\}_{j=1}^{\infty} is a collection of i.i.d. real-valued random variables so that S_{n} := \sum_{j=1}^{n}X_{j} is a random walk. If X_{1} \in L^{1}, then \left\{S_{n}-n\mathbb{E}[X_{1}]\right\}_{n=1}^{\infty} is a martingale. If X_{1}\in L^{2} and \mathbb{E}[X_{1}]=0, then \left\{S_{n}^{2}-n\text{Var}(X_{1})\right\}_{n=1}^{\infty} is a martingale.

Proof. For n \geq 1, we have by the independence of the X_{j} that

\begin{array}{lcl}\displaystyle\mathbb{E}\left[S_{n+1}-(n+1)\mathbb{E}[X_{1}]\mid S_{1},\cdots,S_{n}\right]&=&\mathbb{E}[X_{n+1}]-(n+1)\mathbb{E}[X_{1}]+S_{n}\\&=&S_{n}-n\mathbb{E}[X_{1}]\\\end{array}

since \mathbb{E}[X_{n+1}]=\mathbb{E}[X_{1}]. Analogously,

\begin{array}{lcl}\displaystyle\mathbb{E}\left[S_{n+1}^{2}-(n+1)\text{Var}(X_{1})\mid S_{1},\cdots,S_{n}\right]&=&\mathbb{E}\left[(X_{n+1}+S_{n})^{2}-(n+1)\text{Var}(X_{1})\mid S_{1},\cdots,S_{n}\right]\\&=&\text{Var}(X_{n+1})+2\mathbb{E}[X_{n+1}]S_{n}+S_{n}^{2}-(n+1)\text{Var}(X_{1})\\&=&S_{n}^{2}-n\text{Var}(X_{1})\end{array}

since \mathbb{E}[X_{n+1}]=\mathbb{E}[X_{1}]=0 and \text{Var}(X_{n+1})=\text{Var}(X_{1}). \Box
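As a quick numerical illustration (a sketch assuming NumPy, and only a sanity check: it verifies the necessary condition that each process has constant expectation zero, not the full conditional-expectation property; the Exponential(1) step distribution is an arbitrary choice with mean 1 and variance 1):

```python
import numpy as np

rng = np.random.default_rng(1)
trials, checkpoints = 100_000, [1, 4, 16, 64]

# Exponential(1) steps: E[X_1] = 1, Var(X_1) = 1 (an arbitrary test distribution).
X = rng.exponential(1.0, size=(trials, max(checkpoints)))
S = np.cumsum(X, axis=1)          # random walk with drift E[X_1] = 1
S0 = np.cumsum(X - 1.0, axis=1)   # centered walk: mean zero, Var(X_1) = 1

for n in checkpoints:
    m1 = np.mean(S[:, n - 1] - n * 1.0)        # E[S_n - n E[X_1]], should be ~ 0
    m2 = np.mean(S0[:, n - 1] ** 2 - n * 1.0)  # E[S_n^2 - n Var(X_1)], should be ~ 0
    print(f"n = {n:3d}:  {m1:+.4f}  {m2:+.4f}")
```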

We now use Doob’s optional stopping theorem to prove Wald’s first and second identities.

Lemma 3. Suppose \left\{S_{n}\right\}_{n=1}^{\infty} is a random walk with X_{1}\in L^{1}. If \mathcal{F}_{n}:=\sigma(X_{1},\cdots,X_{n}) and T is a stopping time with respect to \left\{\mathcal{F}_{n}\right\}_{n=1}^{\infty} with \mathbb{E}[T]<\infty, then

\displaystyle\mathbb{E}[S_{T}]=\mathbb{E}[X_{1}]\mathbb{E}[T]

If, moreover, \mathbb{E}[X_{1}] = 0 and \text{Var}(X_{1})=\sigma^{2}<\infty, then

\displaystyle \mathbb{E}[S_{T}^{2}]=\text{Var}(X_{1})\mathbb{E}[T]

Proof. It is an easy exercise for the reader to verify that if \left\{Z_{j}\right\}_{j=1}^{\infty} is a martingale and T is a stopping time, then T \wedge n is also a stopping time and \left\{Z_{T\wedge j}\right\}_{j=1}^{\infty} is also a martingale. By Doob’s optional stopping theorem with S \equiv 1,

\displaystyle\mathbb{E}\left[S_{T\wedge n}\right]-\mathbb{E}[T\wedge n]\mathbb{E}[X_{1}]=\mathbb{E}\left[\mathbb{E}\left[S_{T\wedge n}-(T\wedge n)\mathbb{E}[X_{1}]\mid\mathcal{F}_{1}\right]\right]=\mathbb{E}\left[S_{1}-\mathbb{E}[X_{1}]\right]=0

By the monotone convergence theorem, \mathbb{E}[T \wedge n] \uparrow \mathbb{E}[T] as n\rightarrow\infty. For all n\geq 1, observe that

\displaystyle\left|S_{T\wedge n}\right|=\left|\sum_{k=1}^{\infty}\mathbf{1}_{T\wedge n\geq k}X_{k}\right|\leq\sum_{k=1}^{\infty}\left|X_{k}\right|\mathbf{1}_{T \geq k}

Interchanging summation and expectation (justified since the summands are nonnegative), we see that

\begin{array}{lcl}\displaystyle\mathbb{E}\left[\sum_{k=1}^{\infty}\left|X_{k}\right|\mathbf{1}_{T\geq k}\right]=\sum_{k=1}^{\infty}\mathbb{E}\left[\left|X_{k}\right|\mathbf{1}_{T\geq k}\right]&=&\displaystyle\sum_{k=1}^{\infty}\mathbb{E}\left[\mathbb{E}\left[\left|X_{k}\right|\mathbf{1}_{T\geq k}\mid\mathcal{F}_{k-1}\right]\right]\\&=&\displaystyle\sum_{k=1}^{\infty}\mathbb{E}\left[\mathbf{1}_{T\geq k}\right]\mathbb{E}\left[\left|X_{k}\right|\right]\\&=&\displaystyle\sum_{k=1}^{\infty}\mathbb{E}[\left|X_{1}\right|]\mathbb{P}(T\geq k)\\&=&\displaystyle\mathbb{E}[\left|X_{1}\right|]\mathbb{E}[T]<\infty,\end{array}

where we use the fact that \left\{T \geq k\right\} = \left\{T \leq k-1\right\}^{c}\in\mathcal{F}_{k-1}, while X_{k} is independent of \mathcal{F}_{k-1} (here \mathcal{F}_{0} denotes the trivial \sigma-algebra). Thus the dominating random variable \sum_{k=1}^{\infty}\left|X_{k}\right|\mathbf{1}_{T\geq k} is integrable. Since \mathbb{E}[T]<\infty implies T<\infty a.s., we have S_{T\wedge n}\rightarrow S_{T} a.s., so letting n\rightarrow\infty in the identity above and applying the dominated convergence theorem on the left-hand side completes the proof of the first identity.

To prove Wald’s second identity, we first note that by Lemma 2 and Doob’s optional stopping theorem (applied, as above, to the bounded stopping time T\wedge n),

\displaystyle\mathbb{E}[(S_{T\wedge n})^{2}] = \mathbb{E}[T\wedge n]\text{Var}(X_{1})

To complete the proof, it suffices by the dominated convergence theorem (with dominating function \sup_{n}(S_{T\wedge n})^{2}) to show that \mathbb{E}\left[\sup_{n}(S_{T\wedge n})^{2}\right]<\infty. Since \left\{S_{T\wedge n}\right\}_{n=1}^{\infty} is a martingale, Doob’s L^{2} maximal inequality gives

\displaystyle\mathbb{E}\left[\sup_{n}(S_{T\wedge n})^{2}\right]\leq 4\sup_{n}\mathbb{E}\left[(S_{T\wedge n})^{2}\right],

so it is enough to establish the uniform bound \mathbb{E}\left[(S_{T\wedge n})^{2}\right] \leq \mathbb{E}[T]\text{Var}(X_{1}) for every n.

For any n \geq 1,

\begin{array}{lcl}\displaystyle(S_{T\wedge n})^{2}=\left(\sum_{k=1}^{\infty}X_{k}\mathbf{1}_{\left\{T\wedge n\geq k\right\}}\right)^{2}&=&\displaystyle\sum_{k=1}^{\infty}\left|X_{k}\right|^{2}\mathbf{1}_{\left\{T\wedge n\geq k\right\}}+\sum_{j\neq k}X_{j}X_{k}\mathbf{1}_{\left\{T\wedge n\geq j\vee k\right\}}\\&\leq&\displaystyle\sum_{k=1}^{\infty}\left|X_{k}\right|^{2}\mathbf{1}_{\left\{T\geq k\right\}}+\sum_{j\neq k}X_{j}X_{k}\mathbf{1}_{\left\{T\wedge n \geq j\vee k\right\}}\end{array}

Taking expectations, the cross terms vanish: for j<k, both X_{j} and \mathbf{1}_{\left\{T\wedge n\geq j\vee k\right\}}=\mathbf{1}_{\left\{T\wedge n\geq k\right\}} are \mathcal{F}_{k-1}-measurable, while \mathbb{E}[X_{k}\mid\mathcal{F}_{k-1}]=\mathbb{E}[X_{k}]=0 by independence, so \mathbb{E}\left[X_{j}X_{k}\mathbf{1}_{\left\{T\wedge n\geq j\vee k\right\}}\right]=0. Interchanging summation and expectation in the remaining sum (justified since its terms are nonnegative), we obtain

\begin{array}{lcl}\displaystyle\mathbb{E}\left[(S_{T\wedge n})^{2}\right]\leq\sum_{k=1}^{\infty}\mathbb{E}\left[X_{k}^{2}\mathbf{1}_{\left\{T\geq k\right\}}\right]&=&\displaystyle\sum_{k=1}^{\infty}\mathbb{E}\left[\mathbf{1}_{\left\{T\geq k\right\}}\mathbb{E}\left[X_{k}^{2}\mid\mathcal{F}_{k-1}\right]\right]\\&=&\displaystyle\sum_{k=1}^{\infty}\mathbb{E}[X_{k}^{2}]\mathbb{P}(T\geq k)\\&=&\displaystyle\text{Var}(X_{1})\sum_{k=1}^{\infty}\mathbb{P}(T\geq k)\\&=&\displaystyle\text{Var}(X_{1})\mathbb{E}[T]\end{array}

\Box
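Before putting Wald’s identities to work, here is a quick Monte Carlo sanity check of both (a sketch assuming NumPy; the level a, trial count, and truncation horizon are arbitrary choices). For the symmetric simple random walk with T the first time \left|S_{j}\right| = a, we have \left|S_{T}\right| = a exactly, so Wald’s second identity forces \mathbb{E}[T] = a^{2}/\text{Var}(X_{1}) = a^{2}.

```python
import numpy as np

rng = np.random.default_rng(2)
a, trials, horizon = 5, 20_000, 500  # arbitrary; horizon truncates the simulation

# Symmetric +/-1 walk: E[X_1] = 0, Var(X_1) = 1. T = first j with |S_j| = a.
X = rng.choice([-1, 1], size=(trials, horizon))
S = np.cumsum(X, axis=1)
hit = np.abs(S) >= a
assert hit.any(axis=1).all()      # at this horizon, every sampled path has exited
T = hit.argmax(axis=1) + 1        # first hitting time (1-based index)
S_T = S[np.arange(trials), T - 1]

print(f"E[S_T]   ~ {S_T.mean():+.4f}  (Wald I predicts E[X_1] E[T] = 0)")
print(f"E[T]     ~ {T.mean():.4f}  (Wald II predicts a^2 = {a**2})")
print(f"E[S_T^2] ~ {(S_T ** 2).mean():.4f}  (= Var(X_1) E[T])")
```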

We can now use Wald’s second identity to prove the inequality of Khintchine and Kolmogorov stated at the beginning.

Proof. Fix \lambda > 0 and define a random time by

\displaystyle T:=\inf\left\{j\geq 1: \left|S_{j}\right|>\lambda\right\},

where \inf\emptyset := \infty. It is clear that \left\{T \leq k\right\} \in \mathcal{F}_{k} for every k \geq 1, so T is indeed a stopping time. Note that

\displaystyle\left\{T \geq n+1\right\}=\left\{\max_{1\leq j\leq n}\left|S_{j}\right|\leq\lambda\right\}

Since T\wedge(n+1) is a bounded stopping time, we have by Wald’s second identity that

\displaystyle\mathbb{E}\left[\left(S_{T\wedge(n+1)}\right)^{2}\right]=\mathbb{E}\left[T\wedge (n+1)\right]\text{Var}(X_{1})\geq (n+1)\mathbb{P}\left(T\geq n+1\right)\text{Var}(X_{1}),

where the inequality holds because \mathbb{E}[T\wedge(n+1)]\geq\mathbb{E}\left[(T\wedge(n+1))\mathbf{1}_{\left\{T\geq n+1\right\}}\right]=(n+1)\mathbb{P}(T\geq n+1).

For the LHS, we can write

\begin{array}{lcl}\displaystyle\mathbb{E}\left[(S_{T\wedge(n+1)})^{2}\right]&=&\displaystyle\mathbb{E}\left[\sum_{j=1}^{n}(S_{T\wedge(n+1)})^{2}\mathbf{1}_{\left\{T=j\right\}}+(S_{T\wedge(n+1)})^{2}\mathbf{1}_{\left\{T\geq n+1\right\}}\right]\\&=&\displaystyle\sum_{j=1}^{n}\mathbb{E}\left[(S_{j})^{2}\mathbf{1}_{\left\{T=j\right\}}\right]+\mathbb{E}\left[(S_{n+1})^{2}\mathbf{1}_{\left\{T\geq n+1\right\}}\right]\end{array}

Observe that if (T\wedge(n+1))(\omega) = j for some j>1, then \left|S_{j-1}(\omega)\right|\leq \lambda: either T(\omega)=j, in which case the walk has not exceeded \lambda in absolute value before time j, or j=n+1\leq T(\omega), in which case it has not done so up to time n. Hence

\displaystyle\left|S_{j}(\omega)\right|\leq\left|S_{j-1}(\omega)\right|+\left|X_{j}(\omega)\right|\leq\lambda+B

If j=1, then \left|S_{1}(\omega)\right|=\left|X_{1}(\omega)\right|\leq B\leq\lambda+B trivially. Squaring, we have in every case that \left(S_{T\wedge(n+1)}(\omega)\right)^{2}\leq(\lambda+B)^{2}. Substituting this upper bound into the decomposition above and using \sum_{j=1}^{n}\mathbb{P}(T=j)+\mathbb{P}(T\geq n+1)\leq 1, we conclude that

\begin{array}{lcl}\displaystyle\mathbb{P}\left(\max_{1\leq j\leq n}\left|S_{j}\right|\leq\lambda\right)=\displaystyle\mathbb{P}\left(T\geq n+1\right)&\leq&\displaystyle\dfrac{(\lambda+B)^{2}}{(n+1)\text{Var}(X_{1})}\\&\leq&\displaystyle\dfrac{(\lambda+B)^{2}}{n\text{Var}(X_{1})}\\&=&\displaystyle\dfrac{(\lambda+B)^{2}}{\text{Var}(S_{n})}\end{array}

since \text{Var}(S_{n}) = \text{Var}(X_{1})+\cdots+\text{Var}(X_{n})=n\text{Var}(X_{1}) by independence of the X_{j}. \Box
