## Probability Mixtures and the Pareto Distribution

Suppose $(\Omega, \mathcal{F})$ and $(\Theta,\mathcal{G})$ are two measurable spaces. Assume that $\nu$ is a probability measure on $\mathcal{G}$ and that, for each $\theta \in \Theta$, $\mathbb{P}_{\theta}$ is a probability measure on $\mathcal{F}$. If the function

$\displaystyle\Theta \rightarrow [0,1], \ \theta \mapsto \mathbb{P}_{\theta}(A)$

is $\mathcal{G}$-measurable for each $A \in \mathcal{F}$, then

$\mu(A) := \displaystyle\int_{\Theta}\mathbb{P}_{\theta}(A)\nu(d\theta), \quad A \in \mathcal{F}$

defines a probability measure $\mu$ on $(\Omega,\mathcal{F})$.

Proof. Since $0 \leq \mathbb{P}_{\theta} \leq 1$ for all $\theta \in \Theta$, it is immediate that $0 \leq \mu \leq 1$. Moreover,

$\displaystyle\mu(\emptyset)=\int_{\Theta}\mathbb{P}_{\theta}(\emptyset)\nu(d\theta)=\int_{\Theta}0\nu(d\theta) = 0$

and

$\displaystyle\mu(\Omega)=\int_{\Theta}\mathbb{P}_{\theta}(\Omega)\nu(d\theta)=\int_{\Theta}1\nu(d\theta)=\nu(\Theta) = 1$

Let $\left\{A_{k}\right\}_{k=1}^{\infty}$ be a countable collection of pairwise disjoint sets in $\mathcal{F}$. Set $A := \bigcup_{k=1}^{\infty}A_{k}$. By the $\sigma$-additivity of $\mathbb{P}_{\theta}$ for each $\theta \in \Theta$ and the dominated convergence theorem (the partial sums $\sum_{k=1}^{n}\mathbb{P}_{\theta}(A_{k})$ are bounded by $1$, which is $\nu$-integrable),

$\begin{array}{lcl}\mu(A) = \displaystyle\int_{\Theta}\mathbb{P}_{\theta}(A)\nu(d\theta)=\int_{\Theta}\sum_{k=1}^{\infty}\mathbb{P}_{\theta}(A_{k})\nu(d\theta)&=&\displaystyle\sum_{k=1}^{\infty}\int_{\Theta}\mathbb{P}_{\theta}(A_{k})\nu(d\theta)\\&=&\displaystyle\sum_{k=1}^{\infty}\mu(A_{k})\end{array}$

$\Box$

We say that the probability measure $\mu$ is a mixture of the measures $\mathbb{P}_{\theta}$ with mixing distribution $\nu$.
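As a simple illustration (the finite index set and the weights $p_{i}$ below are introduced only for this example), take $\Theta = \{1,\ldots,n\}$ with $\mathcal{G}$ the power set of $\Theta$, and let $\nu(\{i\}) = p_{i}$, where $p_{i} \geq 0$ and $\sum_{i=1}^{n}p_{i} = 1$. Every function on $\Theta$ is then $\mathcal{G}$-measurable, so the measurability hypothesis holds automatically, and the integral reduces to a finite sum:

$\displaystyle\mu(A) = \int_{\Theta}\mathbb{P}_{\theta}(A)\nu(d\theta) = \sum_{i=1}^{n}p_{i}\mathbb{P}_{i}(A), \quad A \in \mathcal{F}$

In this case the mixture is the familiar convex combination of the probability measures $\mathbb{P}_{1},\ldots,\mathbb{P}_{n}$.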

We now show how a well-known distribution can be obtained as a mixture of other well-known distributions. Recall that a random variable $X$ is said to have an exponential distribution with parameter $\lambda >0$ if it has probability density function (pdf)

$\displaystyle f_{X}(x)=\begin{cases} \lambda e^{-\lambda x} & {x \geq 0}\\ 0 & {x < 0} \end{cases}$

A random variable $X$ is said to have a gamma distribution with parameters $\alpha > 0, \beta > 0$, if it has pdf

$\displaystyle f_{X}(x)=\begin{cases}\frac{1}{\Gamma(\alpha)\beta^{\alpha}}x^{\alpha-1}e^{-\frac{x}{\beta}} & {x \geq 0} \\ 0 & {x < 0}\end{cases}$

Lastly, a random variable $X$ has a Pareto distribution with parameters $\alpha > 0, \gamma > 0$ if it has pdf

$\displaystyle f_{X}(x)=\begin{cases}\frac{\alpha \gamma^{\alpha}}{(x+\gamma)^{\alpha+1}} & {x\geq 0} \\ 0 & {x < 0}\end{cases}$

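We can obtain the Pareto distribution as a mixture of exponential distributions with gamma mixing measure. Let $\mathbb{P}_{\lambda}$ denote the $\text{Exp}(\lambda)$ distribution for $\lambda > 0$, and let the mixing measure $\nu$ be the $\text{Gamma}(\alpha,\beta)$ distribution. Then for any $x \geq 0$, the definition of the mixture $\mu$ and two changes of variables ($\lambda = \beta y$, followed by $z = (x\beta+1)y$) give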

$\begin{array}{lcl}\mu([0,x])= \displaystyle\frac{1}{\beta^{\alpha}\Gamma(\alpha)}\int_{0}^{\infty}\left(\int_{0}^{x}\lambda e^{-\lambda t}dt\right)\lambda^{\alpha-1}e^{-\frac{\lambda}{\beta}}d\lambda &=&\displaystyle\frac{1}{\beta^{\alpha}\Gamma(\alpha)}\int_{0}^{\infty}\left(1-e^{-\lambda x}\right)\lambda^{\alpha-1}e^{-\frac{\lambda}{\beta}}d\lambda\\&=&\displaystyle 1 \displaystyle-\frac{1}{\Gamma(\alpha)}\int_{0}^{\infty}e^{-x \beta y}y^{\alpha-1}e^{-y}dy\\&=&1-\displaystyle\frac{1}{\Gamma(\alpha)}\int_{0}^{\infty}y^{\alpha-1}e^{-(x\beta+1)y}dy\\&=&1 -\displaystyle\frac{1}{(x\beta+1)^{\alpha}\Gamma(\alpha)}\int_{0}^{\infty}z^{\alpha-1}e^{-z}dz\\&=&1-\displaystyle\frac{1}{(x\beta+1)^{\alpha}}\end{array}$
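The closed form above can be checked numerically. The following is a minimal sketch, not part of the original argument: it approximates the mixture integral with a midpoint rule and compares it with $1-(x\beta+1)^{-\alpha}$. The function names, the truncation point of the integral, the number of steps, and the parameter values $\alpha = 2.5$, $\beta = 0.7$ are all assumptions chosen for illustration.

```python
import math


def mixture_cdf_numeric(x, alpha, beta, steps=100_000):
    """Midpoint-rule approximation of mu([0, x]) for the exponential-gamma mixture.

    The integral over lambda in (0, infinity) is truncated at an assumed
    upper limit far into the tail of the Gamma(alpha, beta) density.
    """
    upper = 60.0 * beta * max(alpha, 1.0)  # assumed truncation point
    h = upper / steps
    norm = 1.0 / (beta ** alpha * math.gamma(alpha))
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        exp_cdf = 1.0 - math.exp(-lam * x)  # P_lambda([0, x])
        gamma_weight = lam ** (alpha - 1.0) * math.exp(-lam / beta)
        total += exp_cdf * gamma_weight
    return norm * total * h


def mixture_cdf_closed(x, alpha, beta):
    """Closed form derived in the text: 1 - (x * beta + 1) ** (-alpha)."""
    return 1.0 - (x * beta + 1.0) ** (-alpha)


if __name__ == "__main__":
    alpha, beta = 2.5, 0.7  # illustrative parameter values
    for x in (0.5, 1.0, 3.0):
        numeric = mixture_cdf_numeric(x, alpha, beta)
        closed = mixture_cdf_closed(x, alpha, beta)
        print(x, numeric, closed)
```

The two columns should agree to several decimal places for moderate parameter values; for $\alpha < 1$ the integrand is less smooth near $\lambda = 0$, and a finer grid may be needed.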

If $Y$ is a random variable with distribution $\mu$, then differentiating its distribution function with respect to $x$ shows that, for $x \geq 0$, $Y$ has pdf

$\displaystyle f_{Y}(x)=\frac{\alpha\beta}{(x\beta+1)^{\alpha+1}}=\frac{\alpha\beta^{-\alpha}}{(x+\beta^{-1})^{\alpha+1}}$

Hence, $Y$ is Pareto with parameters $\alpha$ and $\gamma = \beta^{-1}$.
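The mixture also has a sampling interpretation: draw a rate from the gamma distribution, then draw an exponential variable with that rate. The sketch below simulates this two-stage procedure with Python's standard `random` module and compares the empirical distribution function with the Pareto one. The helper names, the seed, the sample size, and the values $\alpha = 3$, $\beta = 0.5$ are assumptions made for illustration. Note that `random.gammavariate(alpha, beta)` uses the same scale parametrisation as the gamma pdf above, and `random.expovariate` takes the rate $\lambda$.

```python
import random


def sample_mixture(alpha, beta, n, seed=0):
    """Draw n samples: Lambda ~ Gamma(alpha, scale=beta), then X | Lambda ~ Exp(Lambda)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        lam = rng.gammavariate(alpha, beta)  # random rate from the mixing measure
        samples.append(rng.expovariate(lam))  # exponential draw with that rate
    return samples


def pareto_cdf(x, alpha, gamma):
    """CDF of the Pareto pdf alpha * gamma**alpha / (x + gamma)**(alpha + 1) on x >= 0."""
    if x < 0:
        return 0.0
    return 1.0 - (gamma / (x + gamma)) ** alpha


def empirical_cdf(samples, x):
    """Fraction of samples that are at most x."""
    return sum(1 for s in samples if s <= x) / len(samples)


if __name__ == "__main__":
    alpha, beta = 3.0, 0.5  # illustrative parameter values
    draws = sample_mixture(alpha, beta, n=100_000, seed=12345)
    for x in (0.5, 1.0, 4.0):
        print(x, empirical_cdf(draws, x), pareto_cdf(x, alpha, 1.0 / beta))
```

With a large sample, the empirical values should be close to the Pareto distribution function with $\gamma = \beta^{-1}$, up to Monte Carlo error.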