Markov Operators

I’ve been brushing up on my functional analysis as part of my broader study of probability theory by reading Adam Bobrowski’s text Functional Analysis for Probability and Stochastic Processes. Over the past couple days, I’ve been focused on the topic of Markov operators, which I would like to discuss at some length in today’s post. We will later use an extension theorem for Markov operators to prove the existence of conditional expectation for absolutely integrable random variables.

Let (\Omega,\mathcal{F},\mu) be a measure space, and let \mathbb{Y} be a (not necessarily closed) subspace of L^{1}(\Omega,\mathcal{F},\mu) which is dense in L^{1}(\Omega,\mathcal{F},\mu) and has the following property: x\in\mathbb{Y} implies that x^{+}=\max\left\{x,0\right\}\in\mathbb{Y}. Note that this requirement is the same as x\in\mathbb{Y} implies \left|x\right|\in\mathbb{Y}, since x^{+}=\frac{1}{2}[x+\left|x\right|].

Suppose P:\mathbb{Y}\rightarrow L^{1}(\Omega,\mathcal{F},\mu) is a linear operator such that Px \geq 0 and \int_{\Omega}Pxd\mu=\int_{\Omega}xd\mu, for all x\geq 0 in \mathbb{Y}. Then there exists a unique extension of P to a bounded linear operator on L^{1}(\Omega,\mathcal{F},\mu) such that \left\|P\right\|\leq 1 and Px \geq 0 for all x\geq 0 in L^{1}(\Omega,\mathcal{F},\mu).

Proof. For any x, we write x=x^{+}-x^{-}, where x^{+}:=\max\left\{x,0\right\} and x^{-}:=\max\left\{-x,0\right\}. For any x\in\mathbb{Y}, x^{+}-x=x^{-}\geq 0, and x^{-}\in\mathbb{Y} since x^{+}\in\mathbb{Y}; hence P(x^{+})-P(x)=P(x^{-})\geq 0, i.e. Px\leq P(x^{+}). Since P(x^{+}) \geq 0 by hypothesis,

\displaystyle(Px)^{+}\leq P(x^{+}),\indent\forall x\in\mathbb{Y}

Noting that x^{-}=(-x)^{+}, we have

\displaystyle(Px)^{-}=(-Px)^{+}=[P(-x)]^{+}\leq P\left((-x)^{+}\right)=P(x^{-})

Hence, for any x\in\mathbb{Y},

\begin{array}{lcl}\displaystyle\int_{\Omega}\left|Px\right|d\mu=\int_{\Omega}\left[(Px)^{+}+(Px)^{-}\right]d\mu&\leq&\displaystyle\int_{\Omega}\left[P(x^{+})+P(x^{-})\right]d\mu\\[.9 em]&=&\displaystyle\int_{\Omega}\left[x^{+}+x^{-}\right]d\mu\\[.9 em]&=&\displaystyle\int_{\Omega}\left|x\right|d\mu\end{array}

We see that P is a bounded linear operator on \mathbb{Y} with \left\|P\right\|\leq 1. Since \mathbb{Y} is dense in L^{1}(\Omega,\mathcal{F},\mu), P has a unique extension to a bounded linear operator defined on the entire space L^{1}(\Omega,\mathcal{F},\mu). We abuse notation and denote the extension also by P.

To see that P maps nonnegative x\in L^{1}(\Omega,\mathcal{F},\mu) to Px \geq 0, fix x\geq 0 and choose a sequence (x_{n})_{n=1}^{\infty} in \mathbb{Y} such that \left\|x-x_{n}\right\|_{L^{1}}\rightarrow 0 as n \rightarrow \infty. Since

\displaystyle x-x_{n}=(x-x_{n}^{+})\mathbf{1}_{\left\{x_{n}\geq 0\right\}}+(x+x_{n}^{-})\mathbf{1}_{\left\{x_{n}<0\right\}},

we see that \left|x-x_{n}^{+}\right|\leq\left|x-x_{n}\right| pointwise: the two agree on \left\{x_{n}\geq 0\right\}, while on \left\{x_{n}<0\right\} we have \left|x-x_{n}^{+}\right|=x\leq x+x_{n}^{-}=\left|x-x_{n}\right|. Hence \left\|x-x_{n}^{+}\right\|_{L^{1}}\leq\left\|x-x_{n}\right\|_{L^{1}}\rightarrow 0, and since P(x_{n}^{+})\geq 0 for each n, the L^{1} limit satisfies Px=\lim_{n\rightarrow\infty}P(x_{n}^{+})\geq 0. \Box

Such a subspace \mathbb{Y} exists: by the construction of the Lebesgue integral, we can always take \mathbb{Y} to be the subspace of integrable simple functions. If \mu is a finite measure, then we can also take \mathbb{Y} to be the space L^{2}(\Omega,\mathcal{F},\mu).

The above result leads us to define a class of operators on L^{1}(\Omega,\mathcal{F},\mu), for a given measure space (\Omega,\mathcal{F},\mu). A linear operator P: L^{1}(\Omega,\mathcal{F},\mu)\rightarrow L^{1}(\Omega,\mathcal{F},\mu) is said to be a Markov operator if

  1. Px \geq 0 for all x \geq 0;
  2. \int_{\Omega}Pxd\mu=\int_{\Omega}xd\mu for all x \geq 0.

Note that the second condition, together with linearity, implies that P preserves the integral of x, for all x \in L^{1}(\Omega,\mathcal{F},\mu), since

\begin{array}{lcl}\displaystyle\int_{\Omega}Pxd\mu=\int_{\Omega}P(x^{+}-x^{-})d\mu=\int_{\Omega}\left[P(x^{+})-P(x^{-})\right]d\mu&=&\displaystyle\int_{\Omega}\left[x^{+}-x^{-}\right]d\mu\\[.9 em]&=&\displaystyle\int_{\Omega}xd\mu\end{array}

Repeating the estimate from the proof above shows that \left\|P\right\|\leq 1; on the other hand, for nonnegative x we have \left\|Px\right\|_{L^{1}}=\int_{\Omega}Pxd\mu=\int_{\Omega}xd\mu=\left\|x\right\|_{L^{1}}. Thus, a Markov operator has operator norm \left\|P\right\|=1.
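To make the definition concrete, here is a small finite sketch (my own illustration, not from the text): on a three-point space with counting measure, L^{1} functions are just vectors, the integral is the sum of entries, and a Markov operator is given by a nonnegative matrix whose rows sum to 1.

```python
import numpy as np

# Hypothetical finite example: Omega = {0, 1, 2} with counting measure.
# A Markov operator is then a matrix K with nonnegative entries whose
# rows sum to 1, acting by (Px)_j = sum_i x_i K_{ij}.
K = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.2, 0.8]])

def P(x):
    return x @ K  # (Px)_j = sum_i x_i K_{ij}

x = np.array([2.0, 0.0, 1.0])  # a nonnegative "integrable function"
Px = P(x)

print(np.all(Px >= 0))                # positivity preserved
print(np.isclose(Px.sum(), x.sum()))  # integral (here: sum) preserved
```

Both conditions of the definition reduce here to elementary facts about row-stochastic matrices.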

We now give some examples of Markov operators. Let our measure space be \mathbb{R} with Lebesgue measure on the Borel \sigma-algebra, and for nonnegative y\in L^{1}(\mathbb{R}) such that \left\|y\right\|_{L^{1}}=1, define

\displaystyle P_{y}:L^{1}(\mathbb{R})\rightarrow L^{1}(\mathbb{R}),\indent P_{y}x:=y\ast x

It is evident that P_{y} is linear. If x\geq 0, then

\displaystyle(P_{y}x)(t)=\int_{\mathbb{R}}y(s)x(t-s)ds \geq 0,\indent\forall t\in\mathbb{R}

since y(s)x(t-s)\geq 0 for all s\in\mathbb{R} and t fixed. Lastly, by Fubini’s theorem and translation invariance,

\begin{array}{lcl}\displaystyle\int_{\mathbb{R}}(P_{y}x)(t)dt&=&\displaystyle\int_{\mathbb{R}}\left(\int_{\mathbb{R}}y(s)x(t-s)ds\right)dt\\[.9 em]&=&\displaystyle\int_{\mathbb{R}}y(s)\left(\int_{\mathbb{R}}x(t-s)dt\right)ds\\[.9 em]&=&\displaystyle\int_{\mathbb{R}}y(s)\left(\int_{\mathbb{R}}x(t)dt\right)ds\\[.9 em]&=&\left(\int_{\mathbb{R}}y(s)ds\right)\left(\int_{\mathbb{R}}x(t)dt\right)\\[.9 em]&=&\displaystyle\int_{\mathbb{R}}x(t)dt\end{array}

Thus, P_{y} is a Markov operator.
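As a numerical sanity check (my own sketch, on a discretized grid rather than the continuum), we can verify that convolution against a normalized nonnegative kernel preserves both positivity and total mass:

```python
import numpy as np

# Discretize densities on a grid of spacing h; the convolution integral
# (y * x)(t) = ∫ y(s) x(t-s) ds becomes h * np.convolve(y, x).
h = 0.01
t = np.arange(0, 5, h)
y = np.exp(-t)
y /= y.sum() * h                      # normalize so ||y||_1 = 1
x = np.where(t < 1, 1.0, 0.0)
x /= x.sum() * h                      # some nonnegative x with ||x||_1 = 1

Pyx = h * np.convolve(y, x)           # (P_y x)(t) ≈ (y * x)(t)

print(np.all(Pyx >= 0))                        # nonnegativity
print(np.isclose(Pyx.sum() * h, x.sum() * h))  # integral preserved
```

The full (untruncated) discrete convolution satisfies sum(y*x) = sum(y)·sum(x) exactly, which is the discrete shadow of the Fubini computation above.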

Now consider a measure space (\Omega,\mathcal{F},\mu) and a measurable space (\Omega',\mathcal{F}'), and let f: \Omega \rightarrow\Omega' be a measurable map. We seek a measure \mu' on (\Omega',\mathcal{F}') such that the operator

\displaystyle P:L^{1}(\Omega',\mathcal{F}',\mu')\rightarrow L^{1}(\Omega,\mathcal{F},\mu),\indent (Px)(\omega):=x(f(\omega)),\indent\forall\omega\in\Omega

is a Markov operator. I claim the desired measure \mu' is given by the pushforward measure f_{*}\mu. Indeed, for x\geq 0,

\displaystyle\int_{\Omega}Pxd\mu=\int_{\Omega}x\circ f\,d\mu=\int_{\Omega'}xd\mu'

by the change-of-variables theorem.
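This change-of-variables identity is easy to check in a hypothetical finite setting (my own illustration), where the pushforward measure can be computed explicitly by summing weights over fibers of f:

```python
import numpy as np

# Omega = {0,...,4} with weights mu; f maps Omega into Omega' = {0,1,2}.
mu = np.array([0.1, 0.3, 0.2, 0.25, 0.15])
f = np.array([0, 2, 1, 2, 0])        # f(omega) for each omega

# Pushforward measure: mu'(j) = mu(f^{-1}({j})).
mu_prime = np.zeros(3)
np.add.at(mu_prime, f, mu)

x = np.array([1.0, 4.0, 2.5])        # a function on Omega'
Px = x[f]                            # (Px)(omega) = x(f(omega))

# Change of variables: ∫_Omega x∘f dmu = ∫_Omega' x d(f_* mu).
print(np.isclose((Px * mu).sum(), (x * mu_prime).sum()))
```

The check is exact here because every integral is a finite sum.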

We now extend the definition of Markov operators to the case where P maps (equivalence classes of) integrable functions on the measure space (\Omega,\mathcal{F},\mu) to (equivalence classes of) integrable functions on a possibly distinct measure space (\Omega',\mathcal{F}',\mu'): we require that Px\geq 0 for x\geq 0 and that

\displaystyle\int_{\Omega'}Pxd\mu'=\int_{\Omega}xd\mu

for all nonnegative x\in L^{1}(\Omega,\mathcal{F},\mu).

Let (\Omega,\mathcal{F},\mathbb{P}) be a probability space on which a countable collection \left\{X_{n}\right\}_{n=1}^{\infty} of i.i.d. \text{Exp}(a) random variables is defined. For each n\in\mathbb{Z}^{\geq 1}, define a random variable S_{n} := \sum_{j=1}^{n}X_{j}. The reader can verify that S_{n}\sim\text{Gamma}(n,a). For any x\in L^{1}(\mathbb{R}^{+}), define

\displaystyle (Px)(\omega):=\sum_{n=1}^{\infty}x(S_{n}(\omega)),\indent\forall\omega\in\Omega

Let \lambda denote the Lebesgue measure. I claim that P is a Markov operator with domain L^{1}(\mathbb{R}^{+},\mathcal{B}(\mathbb{R}^{+}),a\lambda). Suppose x,x' are nonnegative and x=x' except on a set E of measure zero. Let E' denote the set of \omega\in\Omega such that (Px)(\omega)\neq (Px')(\omega). Since the law of S_{n} is absolutely continuous with respect to Lebesgue measure, \mathbb{P}(S_{n}\in E)=0 for each n. Hence, \bigcup_{n=1}^{\infty}\left\{S_{n}\in E\right\} is an event with probability zero. For any \omega \in \Omega\setminus \bigcup_{n=1}^{\infty}\left\{S_{n}\in E\right\}, (Px)(\omega)=(Px')(\omega). Hence, E' is an event of probability zero.

For nonnegative x, we have by the monotone convergence theorem and the change-of-variables theorem that

\begin{array}{lcl}\displaystyle\int_{\Omega}Pxd\mathbb{P}=\int_{\Omega}\sum_{n=1}^{\infty}x(S_{n})d\mathbb{P}&=&\displaystyle\sum_{n=1}^{\infty}\int_{\Omega}x(S_{n})d\mathbb{P}\\[.9 em]&=&\displaystyle\sum_{n=1}^{\infty}\int_{0}^{\infty}x(t)\frac{a^{n}t^{n-1}e^{-at}}{(n-1)!}dt\\[.9 em]&=&\displaystyle\int_{0}^{\infty}x(t)ae^{-at}\left(\sum_{n=1}^{\infty}\frac{(at)^{n-1}}{(n-1)!}\right)dt\\[.9 em]&=&\displaystyle\int_{0}^{\infty}x(t)a\,dt=\int_{\mathbb{R}^{+}}xd(a\lambda)\end{array}

Thus, P preserves the integral, and P is a Markov operator.
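A quick Monte Carlo sanity check of this example (my own sketch): taking x=\mathbf{1}_{[0,1]}, the renewal sum Px=\sum_{n\geq 1}x(S_{n}) counts the points S_{n} falling in [0,1], and the identity \int_{\Omega}Pxd\mathbb{P}=\int_{\mathbb{R}^{+}}xd(a\lambda) predicts that its mean is a\cdot\lambda([0,1])=a.

```python
import numpy as np

# Simulate S_n = X_1 + ... + X_n with X_j i.i.d. Exp(a) (rate a, so
# scale 1/a), and count how many S_n land in [0, 1] in each trial.
rng = np.random.default_rng(0)
a = 2.0
trials = 20_000
# 20 exponentials per trial is more than enough: P(S_20 <= 1) is negligible.
s = np.cumsum(rng.exponential(1.0 / a, size=(trials, 20)), axis=1)
counts = (s <= 1.0).sum(axis=1)      # (Px)(omega) with x = indicator of [0,1]

print(counts.mean())                  # should be close to a = 2.0
```

The reader may recognize the underlying structure: the S_{n} are the arrival times of a Poisson process with rate a, so the count in [0,1] is Poisson(a)-distributed with mean a.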
This entry was posted in math.FA. Bookmark the permalink.
