## Proving Stone-Weierstrass without Weierstrass

As readers of my blog may know, I am very fond of Walter Rudin’s text Principles of Mathematical Analysis (a.k.a. “Baby Rudin”), in particular the exercises. Full disclosure: frustration, not fondness, was my first emotion working with this text. I read Baby Rudin periodically, usually to be surprised by all the things that I missed during my last round of reading.

One of the big “principles” of analysis, to borrow Rudin’s language, is uniform approximation of real- or complex-valued continuous functions defined on compact metric spaces by “classes” of “nice” functions, such as polynomials. In the second half of the 19th century, German mathematician Karl Weierstrass showed that the real polynomials are uniformly dense in the space of continuous functions on the closed unit interval.

Theorem (Weierstrass) Suppose $f:[-1,1]\rightarrow\mathbb{R}$ is a continuous function. For any $\varepsilon>0$ given, there exists a polynomial $p\in\mathbb{R}[x]$ such that

$\displaystyle\left|f(x)-p(x)\right|<\varepsilon,\indent\forall x\in[-1,1]$

There are a few different proofs of Weierstrass theorem (Theorem 7.26). One particularly interesting–and constructive–proof uses Bernstein polynomials and the weak law of large numbers (WLLN). Said proof can be found in my notes on laws of large numbers. Note that this is not the proof presented in Baby Rudin. In the 1930s, American mathematician Marshall H. Stone, the son of a U.S. Supreme Court Chief Justice, substantially generalized Weierstrass’ result both in terms of the underlying domain of the continuous functions and the class of nice functions. Formulating Stone’s result requires some more advanced machinery, which we introduce here.

We say that a collection $\mathcal{A}$ of functions on a set $E$ separates points on $E$ if, for all distinct points $x_{1},x_{2}\in E$, there exists a function $f\in\mathcal{A}$ such that $f(x_{1})\neq f(x_{2})$. If, for every point $x\in E$, there exists a function $f\in\mathcal{A}$ such that $f(x)\neq0$. We say that $\mathcal{A}$ is an algebra if it satisfies the following conditions:

1. $f+g\in\mathcal{A}$;

2. $f\cdot g\in\mathcal{A}$;

3. $cf\in\mathcal{A}$

for all $f,g\in\mathcal{A}$ and $c\in\mathbb{C}$

Theorem (Stone) Suppose $\mathcal{A}$ is an algebra of real-valued continuous functions defined on a compact metric space $K$, such that $\mathcal{A}$ separates points on $K$ and vanishes at no point of $K$. Then, for any continuous function $f: K\rightarrow\mathbb{R}$ and $\varepsilon>0$ given, there exists a function $g\in\mathcal{A}$ such that

$\displaystyle\left|f(x)-g(x)\right|<\varepsilon,\indent\forall x\in K$

The proof that Rudin presents of the above version of Stone’s theorem (Theorem 7.32) involves showing that, for any $\varepsilon>0$, there exists a polynomial $p\in\mathbb{R}[x]$ such that

$\displaystyle\left|p(x)-\left|x\right|\right|<\varepsilon,\indent\forall x\in[-a,a]$

where $a>0$. This is an immediate consequence of Weierstrass’ theorem, since $x\mapsto\left|x\right|$ is a continuous function; however, one does not actually need the full strength of Weierstrass’ theorem. Indeed, showing this claim is Exercise 23 in Chapter 7 of Baby Rudin.

Set $P_{0}=0$. We define a sequence of polynomials inductively by

$\displaystyle P_{n+1}(x):=P_{n}(x)+\dfrac{x^{2}-P_{n}^{2}(x)}{2},\indent\forall n\in\mathbb{N}$

We will show that $P_{n}(x)\rightarrow\left|x\right|$, as $n\rightarrow\infty$, uniformly on the interval $[-1,1]$. Observe that

$\begin{array}{lcl}\displaystyle\left|x\right|-P_{n+1}(x)=\left|x\right|-P_{n}(x)-\dfrac{x^{2}-P_{n}^{2}(x)}{2}&=&\displaystyle\left|x\right|-P_{n}(x)-\left[\left|x\right|-P_{n}(x)\right]\dfrac{\left|x\right|+P_{n}(x)}{2}\\[.9 em]&=&\displaystyle\left[\left|x\right|-P_{n}(x)\right]\left[1-\dfrac{\left|x\right|+P_{n}(x)}{2}\right]\end{array}$

I claim that $0\leq P_{n}(x)\leq P_{n+1}(x)\leq\left|x\right|$ on $[-1,1]$. Since

$\displaystyle\left|x\right|-P_{1}(x)=\left|x\right|-\dfrac{\left|x\right|^{2}}{2}=\left|x\right|\left[1-\dfrac{\left|x\right|}{2}\right]\geq\dfrac{\left|x\right|}{2}\geq0$,

for all $\left|x\right|\leq1$. So the inequalities $0\leq P_{0}(x)\leq P_{1}(x)\leq\left|x\right|$ hold on $[-1,1]$. Suppose that the inequalities $0\leq P_{n}(x)\leq P_{n+1}(x)\leq\left|x\right|$ hold on $[-1,1]$, for some positive integer $n$. Then

$\begin{array}{lcl}\displaystyle P_{n+2}(x)-P_{n+1}(x)=P_{n+1}(x)+\dfrac{x^{2}-P_{n+1}^{2}(x)}{2}-P_{n+1}(x)&=&\displaystyle\dfrac{x^{2}-P_{n+1}^{2}(x)}{2}\\[.9 em]&\geq&\displaystyle\dfrac{x^{2}-x^{2}}{2}\\[.9 em]&=&\displaystyle0,\end{array}$

by our induction hypothesis. Furthermore,

$\begin{array}{lcl}\displaystyle\left|x\right|-P_{n+2}(x)=\underbrace{\left[\left|x\right|-P_{n+1}(x)\right]}_{\geq 0}\left[1-\dfrac{\left|x\right|+P_{n+1}(x)}{2}\right]&\geq&\displaystyle\left[\left|x\right|-P_{n+1}(x)\right]\left[1-\left|x\right|\right]\\[.9 em]\displaystyle&\geq&0,\end{array}$

for all $\left|x\right|\leq1$. Using induction, the nonnegativity of $P_{n}(x)$, and the inequality $\left|x\right|-P_{n+1}(x)\leq[\left|x\right|-P_{n}(x)](1-\frac{\left|x\right|}{2})$, we obtain the estimate

$\displaystyle\left|x\right|-P_{n}(x)\leq\left|x\right|\left(1-\dfrac{\left|x\right|}{2}\right)^{n},\indent\forall\left|x\right|\leq1,\forall n\in\mathbb{N}$

We seek a bound independent of $\left|x\right|$ for the RHS of the inequality. The function $f(t):=t(1-\frac{t}{2})^{n}$ has critical points at

$\begin{array}{lcl}\displaystyle\left(1-\frac{t}{2}\right)^{n}-\frac{nt}{2}\left(1-\frac{t}{2}\right)^{n-1}=0&\Longrightarrow&\displaystyle1-\frac{t}{2}-\frac{nt}{2}=0\\&\Longrightarrow&t=\frac{2}{n+1}\end{array}$

Using the first-derivative test, we see that $f$ has a global maximum on $[-1,1]$ at $t=\frac{2}{n+1}$ and therefore

$\displaystyle\left|x\right|-P_{n}(x)\leq\dfrac{2}{n+1}\left(1-\dfrac{1}{n+1}\right)^{n}<\dfrac{2}{n+1},\indent\forall\left|x\right|\leq1$