## Cauchy Functional Equation II: Memorylessness

As promised in my last post, we will use the result that the only measurable solutions to the functional equation

$\displaystyle f(t+s)=f(t)+f(s),\indent\forall t,s\in[0,\infty)$

are of the form $f(t):=f(1)t$ to show that the only continuous distribution with the so-called “memorylessness” property is the exponential distribution. Even though the analogous result for discrete distributions does not make use of the results of my last post, it is instructive to begin by showing that the geometric distribution is the unique memoryless discrete distribution.

Recall that a random variable $X$ is said to be geometrically distributed with parameter $p\in (0,1)$, which we denote by $X\sim\text{Geom}(p)$, if its probability mass function is of the form

$\displaystyle\mathbb{P}(X=k)=(1-p)^{k}p,\indent k\in\mathbb{Z}^{\geq 0}$

If we have a sequence of independent Bernoulli trials, each with success probability $p$, the geometric distribution models the number of “failures” before the first “success” is achieved.
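To make the "failures before the first success" description concrete, here is a minimal simulation sketch. The helper names (`failures_before_success`, `empirical_pmf`, `geom_pmf`), the sample size, and the seed are my own illustrative choices; the sketch only compares simulated frequencies against the mass function $(1-p)^{k}p$ stated above.

```python
import random
from collections import Counter


def failures_before_success(p, rng):
    """Run Bernoulli(p) trials and count the failures before the first success."""
    failures = 0
    # rng.random() is uniform on [0, 1), so "rng.random() < p" is a success
    # with probability p; keep counting while we see failures.
    while rng.random() >= p:
        failures += 1
    return failures


def empirical_pmf(p, n_samples=100_000, seed=0):
    """Estimate P(X = k) from n_samples simulated runs (seed fixed for repeatability)."""
    rng = random.Random(seed)
    counts = Counter(failures_before_success(p, rng) for _ in range(n_samples))
    return {k: c / n_samples for k, c in sorted(counts.items())}


def geom_pmf(p, k):
    """Exact mass function (1 - p)^k * p from the definition in the text."""
    return (1 - p) ** k * p


if __name__ == "__main__":
    p = 0.3
    est = empirical_pmf(p)
    for k in range(5):
        # Columns: k, simulated frequency, exact probability.
        print(k, round(est.get(k, 0.0), 4), round(geom_pmf(p, k), 4))
```

The simulated frequencies should agree with the exact values up to sampling noise, which shrinks as `n_samples` grows.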

If $X\sim\text{Geom}(p)$, where $p \in (0,1)$, then for any integers $n,m\geq 0$,

$\begin{array}{lcl}\displaystyle\mathbb{P}(X > n+m \mid X \geq n) = \dfrac{\mathbb{P}(X > n+m, X \geq n)}{\mathbb{P}(X \geq n)}&=&\displaystyle\dfrac{\mathbb{P}(X > n+m)}{\mathbb{P}(X \geq n)}\\[.9 em]&=&\displaystyle\dfrac{\sum_{k=n+m+1}^{\infty}(1-p)^{k}p}{\sum_{k=n}^{\infty}(1-p)^{k}p}\\[.9 em]&=&\displaystyle\dfrac{(1-p)^{n+m+1}p\frac{1}{1-(1-p)}}{(1-p)^{n}p\frac{1}{1-(1-p)}}\\[.9 em]&=&\displaystyle(1-p)^{m+1}\\[.9 em]&=&\displaystyle\mathbb{P}(X >m)\end{array}$

This identity is referred to as the “memorylessness” property of the geometric distribution. If we have observed at least $n$ failures, then the probability that we observe more than $n+m$ failures before achieving the first success is just the probability that we observe at least $m+1$ more failures before achieving the first success.

Conversely, suppose that $X$ is a random variable taking values in $\mathbb{Z}^{\geq 0}$ which satisfies $\mathbb{P}(X>n+m\mid X\geq n)=\mathbb{P}(X>m)$ for all integers $n,m\geq 0$. Set $p:=\mathbb{P}(X =0)$. Then, taking $m=0$,

$\displaystyle\mathbb{P}(X>n \mid X \geq n)=\mathbb{P}(X> 0)=1-p$

We argue by induction; the case $k=0$ holds by the definition of $p$. Suppose we have shown that $\mathbb{P}(X=k)=p(1-p)^{k}$, for all $0\leq k < m$, where $m\geq 1$. Then

$\begin{array}{lcl}\displaystyle\mathbb{P}(X > m)=\mathbb{P}(X>1+(m-1)\mid X \geq 1)\mathbb{P}(X\geq 1)&=&\displaystyle\mathbb{P}(X>m-1)(1-p)\\[.9 em]&=&\displaystyle(1-p)\left[1-\mathbb{P}(X\leq m-1)\right]\\[.9 em]&=&\displaystyle(1-p)\left[1-\sum_{k=0}^{m-1}(1-p)^{k}p\right]\\[1.1 em]&=&\displaystyle(1-p)\left[1-p\dfrac{1-(1-p)^{m-1+1}}{1-(1-p)}\right]\\[.9 em]&=&\displaystyle(1-p)^{m+1}\end{array}$

Hence,

$\begin{array}{lcl}\displaystyle\mathbb{P}(X=m)=1-\mathbb{P}(X > m)-\mathbb{P}(X\leq m-1)&=&\displaystyle 1-(1-p)^{m+1}-\left[1-(1-p)^{m}\right]\\[.9 em]&=&\displaystyle(1-p)^{m}\left[1-(1-p)\right]\\[.9 em]&=&\displaystyle p(1-p)^{m}\end{array}$

So $X$ is necessarily geometrically distributed with parameter $p$.
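Both directions of the discrete argument can be checked with exact rational arithmetic. The following is a small sketch, not part of the proof: the function names are hypothetical, and `Fraction` is used only so that the equalities hold exactly rather than up to floating-point error. The first two functions check the memorylessness identity for $\text{Geom}(p)$; the third rebuilds the mass function from $p=\mathbb{P}(X=0)$ using the recursion $\mathbb{P}(X>m)=(1-p)\mathbb{P}(X>m-1)$ from the induction step.

```python
from fractions import Fraction


def geom_tail(p, n):
    """P(X > n) = (1 - p)^(n + 1) for X ~ Geom(p), counting failures."""
    return (1 - p) ** (n + 1)


def cond_tail(p, n, m):
    """P(X > n + m | X >= n), using P(X >= n) = (1 - p)^n."""
    return geom_tail(p, n + m) / (1 - p) ** n


def pmf_from_memorylessness(p, m_max):
    """Rebuild P(X = m) for m = 0..m_max from p = P(X = 0) alone,
    via the recursion P(X > m) = (1 - p) * P(X > m - 1)."""
    tail = 1 - p                      # P(X > 0)
    pmf = [p]                         # P(X = 0)
    for _ in range(1, m_max + 1):
        new_tail = (1 - p) * tail     # P(X > m)
        pmf.append(tail - new_tail)   # P(X = m) = P(X > m - 1) - P(X > m)
        tail = new_tail
    return pmf


if __name__ == "__main__":
    p = Fraction(1, 4)
    # Memorylessness: P(X > n + m | X >= n) == P(X > m) on a small grid.
    print(all(cond_tail(p, n, m) == geom_tail(p, m)
              for n in range(6) for m in range(6)))
    # The recursion reproduces the geometric mass function p(1 - p)^m.
    print(pmf_from_memorylessness(p, 5) == [(1 - p) ** k * p for k in range(6)])
```

A finite grid is of course no substitute for the induction above; the point is only to see the recursion produce $p(1-p)^{m}$ term by term.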

A random variable $X$ is said to be exponentially distributed with parameter $\lambda >0$, which we denote by $X\sim\text{Exp}(\lambda)$, if it has a probability density function $f$ of the form

$\displaystyle f(x):=\begin{cases}\lambda e^{-\lambda x}&{x\geq 0}\\ 0&{x<0}\end{cases}$

The memorylessness property refers to the following result for $X\sim\text{Exp}(\lambda)$:

$\displaystyle\mathbb{P}(X>t+s\mid X>t)=\dfrac{\int_{t+s}^{\infty}\lambda e^{-\lambda x}dx}{\int_{t}^{\infty}\lambda e^{-\lambda x}dx}=\dfrac{e^{-\lambda(t+s)}}{e^{-\lambda t}}=e^{-\lambda s}=\mathbb{P}(X>s)$

If $X$ denotes the time until a mechanical device fails (e.g. a flashlight), then the memorylessness property tells us that the probability that the device lasts more than 2 years, given that it has lasted more than 1 year, is just the probability that it lasts more than 1 year when new.
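The identity for the exponential distribution can be sanity-checked numerically. This is a rough sketch under my own choices of names, sample size, and seed: one function evaluates the conditional probability from the closed-form survival function $e^{-\lambda t}$, and another estimates it by simulation with `random.expovariate` from the Python standard library.

```python
import math
import random


def exp_survival(lam, t):
    """P(X > t) = exp(-lam * t) for X ~ Exp(lam) and t >= 0."""
    return math.exp(-lam * t)


def cond_survival(lam, t, s):
    """P(X > t + s | X > t) as a ratio of survival probabilities."""
    return exp_survival(lam, t + s) / exp_survival(lam, t)


def simulated_cond_survival(lam, t, s, n_samples=200_000, seed=0):
    """Monte Carlo estimate of P(X > t + s | X > t) (seed fixed for repeatability)."""
    rng = random.Random(seed)
    samples = [rng.expovariate(lam) for _ in range(n_samples)]
    survived_t = [x for x in samples if x > t]          # condition on X > t
    return sum(x > t + s for x in survived_t) / len(survived_t)


if __name__ == "__main__":
    lam, t, s = 1.0, 1.0, 1.0
    print(cond_survival(lam, t, s), exp_survival(lam, s))
    print(simulated_cond_survival(lam, t, s))
```

In the flashlight example, `t = 1` and `s = 1` correspond to "has lasted more than 1 year" and "lasts more than 1 further year" respectively.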

Now suppose that $X$ is an a.s. finite random variable with probability distribution such that singletons have probability zero, $\mathbb{P}(X>t+s\mid X > t)$ exists for all $t,s \in [0,+\infty)$ and

$\displaystyle\mathbb{P}(X>t+s\mid X>t)=\mathbb{P}(X>s)$

Define the survival function $G: [0,\infty) \rightarrow [0,1]$ by $G(t):=\mathbb{P}(X>t)$. Since the cumulative distribution function is right-continuous, it follows that $G$ is also right-continuous, hence measurable. I claim that $G(t) > 0$ for all $t\geq 0$. Suppose not, and set $t:=\inf\left\{s > 0: \mathbb{P}(X > s)=0\right\}$. Then $\mathbb{P}(X > t)=0$ by right-continuity, while $\mathbb{P}\left(X>\frac{t}{2}\right)>0$ by the definition of $t$ as an infimum, and

$\displaystyle\mathbb{P}\left(X>t\mid X>\frac{t}{2}\right)=\mathbb{P}\left(X>\frac{t}{2}+\frac{t}{2}\mid X>\frac{t}{2}\right)=\mathbb{P}\left(X>\frac{t}{2}\right) > 0$

But then

$\displaystyle 0<\mathbb{P}\left(X>t\mid X>\frac{t}{2}\right)\mathbb{P}\left(X>\frac{t}{2}\right)=\mathbb{P}(X>t)=0$,

which is a contradiction. By the definition of conditional probability, we see that $G$ satisfies the functional equation $G(t+s)=G(t)G(s)$ for all $t,s\geq 0$. Hence $\tilde{G}(t):=\log(G(t))$ satisfies the functional equation

$\displaystyle\tilde{G}(t+s)=\tilde{G}(t)+\tilde{G}(s),\indent\forall t,s\geq 0$

By our uniqueness result, there exists $\alpha \in \mathbb{R}$ such that $\tilde{G}(t)=\alpha t$. Since $G(t)$ is monotonically nonincreasing and $\lim_{t\rightarrow\infty}G(t)=0$, we see that $\alpha< 0$. Hence,

$\displaystyle\mathbb{P}\left(X > t\right)=G(t)=e^{\tilde{G}(t)}=e^{-\lambda t},$

where $\lambda :=-\alpha> 0$. This is precisely the survival function of a random variable $X\sim\text{Exp}(\lambda)$.
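As a final illustration, here is a short sketch of the functional equation $G(t+s)=G(t)G(s)$ at work. The names are hypothetical, and the Weibull survival function $e^{-t^{2}}$ is my own choice of a non-exponential comparison, not something taken from the argument above. The sketch measures how far a survival function is from satisfying the equation and recovers the rate via $\lambda=-\log G(1)$, which follows from $\log G(t)=t\log G(1)$.

```python
import math


def exp_survival(lam, t):
    """G(t) = exp(-lam * t), the survival function of Exp(lam)."""
    return math.exp(-lam * t)


def weibull_survival(t, shape=2.0):
    """Survival function exp(-t^shape) of a unit-scale Weibull variable.
    Used here only as an assumed example of a non-memoryless distribution."""
    return math.exp(-(t ** shape))


def memoryless_gap(G, t, s):
    """G(t + s) - G(t) * G(s); zero when the functional equation holds at (t, s)."""
    return G(t + s) - G(t) * G(s)


def rate_from_survival(G):
    """lambda = -log G(1), valid when log G is linear in t."""
    return -math.log(G(1.0))


if __name__ == "__main__":
    G = lambda t: exp_survival(1.5, t)
    print(memoryless_gap(G, 0.7, 1.3))                 # essentially zero
    print(memoryless_gap(weibull_survival, 1.0, 1.0))  # clearly nonzero
    print(rate_from_survival(G))                       # recovers the rate 1.5
```

For the Weibull example the gap at $t=s=1$ is $e^{-4}-e^{-2}\neq 0$, so the functional equation fails, consistent with the exponential being the only continuous memoryless distribution.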