Fun with Generating Functions: Runs in Coin Flips

Well over a year ago, I uploaded some notes on using martingales to compute the expected waiting time for a pattern to appear in a sequence of independent, identically distributed trials. For example, if a monkey randomly hits a key on a typewriter every second, where all keys are equally probable, how long until the monkey types the string “abracadabra.” In today’s post, I want to consider the special case of tossing a coin until we achieve a run of r consecutive heads (or tails, by symmetry). We’re going to attack this problem combinatorally, using generating functions to solve a recurrence relation.

The problem inspiring this post is Theorem 1.19 in Anirban Dasgupta’s text Probability and Statistics for Machine Learning. Many thanks to the author.

Suppose a coin which comes up heads with probability p is tossed until r consecutive heads have been returned. Furthermore, suppose that each toss is independent of the other tosses. Denote the number of tosses by N=N_{r}. Then

\displaystyle\mathbf{E}[N]=\sum_{k=1}^{r}p^{-k}\text{ and }\text{Var}(N)=\dfrac{1-p^{1+2r}-qp^{r}(1+2r)}{q^{2}p^{2r}}

We introduce the notation p_{k}=\mathbf{P}(N=k), for k=0,1,\ldots. It is trivial to say that we must have at least r flips to obtain r heads; therefore, p_{0}=\cdots=p_{r-1}=0. If the first r coin flips are heads, then by our hypothesis that the coin flips are independent, we have that p_{r}=p^{r}

For general k>r, we see that N=k implies that the last r tosses are heads and the (k-r)^{th} toss is tails. Furthermore, there is no run of heads of length r in the first k-1 tosses. Denote the outcomes of the k tosses by the random variables X_{1},\ldots,X_{k}. We attack this problem by iteratively conditioning on the first r tosses X_{1},\ldots,X_{r}.

Observe that if the outcome of the j^{th} toss is tails and there is no run of r heads in X_{1},\ldots,X_{j-1}, then we start over in terms of waiting for a run of r heads. So, for 1\leq j\leq k-r,

\displaystyle p_{k}=\mathbf{P}\left(N=k\mid X_{j}=T\right)\mathbf{P}\left(X_{j}=T\right)=qp_{k-j}

where q=1-p is the probability of tails. Using the law of total probabiltiy, we see that

\begin{array}{lcl}\displaystyle p_{k}&=&\displaystyle\mathbf{P}\left(N=k\mid X_{1}=H\right)\mathbf{P}\left(X_{1}=H\right)+\mathbf{P}\left(N=k\mid X_{1}=T\right)\mathbf{P}\left(X_{1}=T\right)\\[2 em]&=&\displaystyle qp_{k-1}+p\mathbf{P}\left(N=k\mid X_{1}=H\right)\end{array}

To simplify the second term, we condition on the outcome of the second flip:

\begin{array}{lcl}\displaystyle p_{k}&=&\displaystyle qp_{k-1}+p\left[\mathbf{P}\left(N=k\mid X_{1}=H,X_{2}=H\right)\mathbf{P}(X_{2}=H)+\mathbf{P}\left(N=k\mid X_{1}=H,X_{2}=T\right)\mathbf{P}(X_{2}=T)\right]\\[2 em]&=&\displaystyle qp_{k-1}+pqp_{k-2}+p\mathbf{P}\left(N=k\mid X_{1}=H,X_{2}=H\right)\end{array}

We only need to condition on the first r flips because if X_{1}=\cdots=X_{r}=H, then

\displaystyle\mathbf{P}(N=k \mid X_{1}=\cdots=X_{r}=H)=\begin{cases} 1 & {k=r}\\ 0 & {k>r}\end{cases}

We conclude that

\displaystyle p_{k}=qp_{k-1}+pqp_{k-2}+\cdots+p^{r-1}qp_{k-r}

Our problem is now how to solve this recurrence relation. Well, we can solve it using generating functions. Since \sum p_{k}\leq1, the power series G(z)=\sum p_{k}z^{k} converges on the closed unit disk. We can write

\begin{array}{lcl}\displaystyle G(z)=p_{r}z^{r}+\sum_{k=r+1}^{\infty}p_{k}z^{k}&=&\displaystyle p_{r}z^{r}+\sum_{k=r+1}^{\infty}\left[qp_{k-1}+pqp_{k-2}+\cdots+p^{r-1}qp_{k-r}\right]z^{k}\\[2 em]&=&\displaystyle p_{r}z^{r}+qzG(z)+pqz^{2}G(z)+\cdots+p^{r-1}qz^{r}G(z)\end{array}

Rearranging, we can write G(z) as

\displaystyle G(z)\left[1-(qz+pqz^{2}+\cdots+p^{r-1}qz^{r})\right]=p_{r}z^{r}\Longrightarrow G(z)=\dfrac{p^{r}z^{r}}{1-qz(1+pz+\cdots+p^{r-1}z^{r-1})}

We can simplify the expression for G(z) by summing the geometric series 1+pz+\cdots+(pz)^{r-1} to obtain

\displaystyle G(z)=\dfrac{p^{r}z^{r}}{1-qz\frac{1-(pz)^{r}}{1-pz}}=\dfrac{p^{r}z^{r}(1-pz)}{1-pz-qz+qz(pz)^{r}}=\dfrac{p^{r}z^{r}(1-pz)}{1-z+qp^{r}z^{r+1}},

where we use the fact that p+q=1.

Observe that G(z) is a meromorphic function. Its only pole (i.e., where the denominator is zero) occurs where

\displaystyle1+qp^{r}z^{r}=0\Longleftrightarrow z=\dfrac{(-1)^{r}}{q^{\frac{1}{r}}p}

Since q^{\frac{1}{r}}p<1, we see that G is holomorphic on the open disk D(0;1+\varepsilon), for \varepsilon>0 sufficiently small. Hence, complex analysis tells us that the n^{th} derivative of G on the disk D(0;1+\varepsilon) is given by differentiating the power series term-by-term n times.

\displaystyle G'(z)=\dfrac{p^{r}rz^{r-1}-(r+1)p^{r+1}z^{r}}{1-z+qp^{r}z^{r+1}}-\dfrac{p^{r}z^{r}(1-pz)\left(-1+(r+1)qp^{r}z^{r}\right)}{(1-z+qp^{r}z^{r+1})^{2}}

Evaluating at z=1, we obtain that

\begin{array}{lcl}\displaystyle G'(1)=\dfrac{p^{r}r-(r+1)p^{r+1}}{qp^{r}}-\dfrac{p^{r}(1-p)(-1+(r+1)qp^{r})}{q^{2}p^{2r}}&=&\displaystyle\dfrac{r-(r+1)p}{q}-\dfrac{(1-p)(-1+(r+1)qp^{r})}{q^{2}p^{r}}\\[1 em]&=&\displaystyle\dfrac{r-(r+1)p}{q}+\dfrac{1}{qp^{r}}-(r+1)\\[2 em]&=&\displaystyle\dfrac{p^{-r}-1}{1-p}\\[2 em]&=&\displaystyle\sum_{k=1}^{r}p^{-k}\end{array}

Hence,

\displaystyle\mathbf{E}[N]=\sum_{k=r}^{\infty}kp_{k}=G'(1)=\sum_{k=1}^{r}p^{-k}

To compute the variance of N, we note that

\displaystyle G''(1)=\sum_{k=r}^{\infty}k(k-1)p_{k}=\mathbf{E}\left[N^{2}-N\right]

\begin{array}{lcl}\displaystyle G''(z)&=&\displaystyle\dfrac{\left[p^{r}rz^{r-1}-(r+1)p^{r+1}z^{r}\right]\left(-1+(r+1)qp^{r}z^{r}\right)}{\left(1-z+qp^{r}z^{r+1}\right)^{2}}-\dfrac{p^{r}r(r-1)z^{r-2}-(r+1)rp^{r+1}z^{r-1}}{1-z+qp^{r}z^{r+1}}\\[2 em]&+&\displaystyle\dfrac{2p^{r}z^{r}(1-pz)\left(-1+(r+1)qp^{r}z^{r}\right)^{2}}{(1-z+qp^{r}z^{r+1})^{3}}-\dfrac{(r+1)p^{r+1}z^{r}-rp^{r}z^{r-1}+2r(r+1)qp^{2r}z^{2r-1}-(2r+1)(r+1)p^{2r+1}qz^{2r}}{\left(1-z+qp^{r}z^{r+1}\right)^{2}}\end{array}

Evaluating at z=1, we see that

\begin{array}{lcl}\displaystyle G''(1)&=&\displaystyle-\dfrac{\left(p^{r}r-(r+1)p^{r+1}\right)\left(-1+(r+1)qp^{r}\right)}{q^{2}p^{2r}}+\dfrac{r(r-1)p^{r}-(r+1)rp^{r+1}}{qp^{r}}\\[2 em]&+&\displaystyle\dfrac{2p^{r}(1-p)\left(-1+(r+1)qp^{r}\right)^{2}}{q^{3}p^{3r}}-\dfrac{(r+1)p^{r+1}-rp^{r}+2r(r+1)qp^{2r}-(2r+1)(r+1)p^{2r+1}q}{q^{2}p^{2r}}\\[2 em]&=&\displaystyle-\dfrac{\left(r-p(r+1)\right)\left(-1+(r+1)qp^{r}\right)}{q^{2}p^{r}}-r\dfrac{p(r+1)-(r-1)}{q}\\[2 em]&+&\displaystyle2\dfrac{(-1+(r+1)qp^{r})^{2}}{q^{2}p^{2r}}-\dfrac{p(r+1)-r+2r(r+1)qp^{r}-(2r+1)(r+1)qp^{r+1}}{q^{2}p^{r}}\\[2 em]&=&\displaystyle2\dfrac{r-(r+1)p}{q^{2}p^{r}}-\dfrac{(r-p(r+1))(r+1)}{q}+r(r+1)-\dfrac{2r}{q}+2\dfrac{(-1+(r+1)qp^{r})^{2}}{q^{2}p^{2r}}-\dfrac{2r(r+1)-(2r+1)(r+1)p}{q}\\[2 em]&=&\displaystyle2\dfrac{r-(r+1)p}{q^{2}p^{r}}+\dfrac{(r+1)((r+1)p-r)}{q}-2r(r+1)+\dfrac{(r+1)p-2r}{q}+2\dfrac{(-1+(r+1)qp^{r})^{2}}{q^{2}p^{2r}}\\[2 em]&=&\displaystyle2\dfrac{(-1+(r+1)qp^{r})^{2}}{q^{2}p^{2r}}+2\dfrac{r-(r+1)p}{q^{2}p^{r}}-2r(r+1)+2\dfrac{(r+1)p-r}{q}\\[2 em]&=&\displaystyle\dfrac{2-4(r+1)qp^{r}+2(r+1)q^{2}p^{2r}+2rp^{r}-2(r+1)p^{r+1}+2(r+1)qp^{2r+1}-2rqp^{2r}}{q^{2}p^{2r}}\end{array}

Hence,

\begin{array}{lcl}\displaystyle\text{Var}(N)&=&G''(1)+\mathbf{E}[N]-\mathbf{E}^{2}[N]\\[2 em]&=&\displaystyle G''(1)+\dfrac{qp^{r}-qp^{2r}-1+2p^{r}-p^{2r}}{q^{2}p^{2r}}\\[2 em]&=&\displaystyle\dfrac{1-(4r+3)qp^{r}+2(r+1)p^{r}-2(r+1)p^{r+1}+2(r+1)q^{2}p^{2r}+2(r+1)qp^{2r+1}-(2r+1)qp^{2r}}{q^{2}p^{2r}-p^{2r}}\\[2 em]&=&\displaystyle\dfrac{1-(2r+1)qp^{r}+2(r+1)qp^{2r}-(2r+1)qp^{2r}-p^{2r}}{q^{2}p^{2r}}\\[2 em]&=&\displaystyle\dfrac{1+(q-1)p^{2r}-(2r+1)qp^{r}}{q^{2}p^{2r}}\\[2 em]&=&\displaystyle\dfrac{1-p^{2r+1}-(2r+1)qp^{r}}{q^{2}p^{2r}}\end{array}

Advertisements
This entry was posted in math.CO, math.PR, Problem Solving and tagged , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s