
STAT 801 Lecture 6

Reading for Today's Lecture:

Goals of Today's Lecture:


Last time we defined moments, moment generating functions, cumulants and cumulant generating functions. We established the following relations between moments and cumulants:


\begin{align*}\kappa_1(X) & = E(X)
\\
\kappa_2(X) & = {\rm Var}(X)
\\
\kappa_3(X) & = E[(X-E(X))^3]
\\
\kappa_4(X) & = E[(X-E(X))^4] - 3[{\rm Var}(X)]^2
\end{align*}
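These relations can be checked by simulation. The sketch below (mine, assuming NumPy is available) estimates the first four cumulants of a Gamma sample from its central moments and compares them with the exact values $\kappa_r = \alpha\theta^r(r-1)!$ for a Gamma$(\alpha,\theta)$ rv:

```python
import numpy as np

# Estimate kappa_1..kappa_4 of a Gamma(shape=3, scale=2) rv from sample
# central moments; exactly, kappa_r = 3 * 2**r * (r-1)! for this Gamma.
rng = np.random.default_rng(0)
x = rng.gamma(shape=3.0, scale=2.0, size=2_000_000)

mu = x.mean()
k1 = mu
k2 = ((x - mu) ** 2).mean()                # Var(X)
k3 = ((x - mu) ** 3).mean()                # third central moment
k4 = ((x - mu) ** 4).mean() - 3 * k2**2    # fourth relation above

for est, ex in zip([k1, k2, k3, k4], [6.0, 12.0, 48.0, 288.0]):
    print(f"estimate {est:8.2f}   exact {ex:6.1f}")
```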

Example: I am having you derive the moment and cumulant generating functions and all the moments of a Gamma rv. Suppose that $Z_1,\ldots,Z_\nu$ are independent N(0,1) rvs. Then we have defined $S_\nu = \sum_1^\nu Z_i^2$ to have a $\chi^2_\nu$ distribution. It is easy to check that $S_1=Z_1^2$ has density

\begin{displaymath}(u/2)^{-1/2} e^{-u/2}/(2\sqrt{\pi})
\end{displaymath}

and then the mgf of S1 is

\begin{displaymath}(1-2t)^{-1/2} \, .
\end{displaymath}

It follows that

\begin{displaymath}M_{S_\nu}(t) = (1-2t)^{-\nu/2};
\end{displaymath}

you will show in homework that this is the mgf of a Gamma$(\nu/2,2)$ rv. This shows that the $\chi^2_\nu$ distribution has the Gamma$(\nu/2,2)$ density which is

\begin{displaymath}(u/2)^{(\nu-2)/2}e^{-u/2} / (2\Gamma(\nu/2)) \, .
\end{displaymath}
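As a quick numerical sanity check (my sketch, assuming NumPy), one can simulate $S_\nu$ and compare the empirical value of $E(e^{tS_\nu})$ with $(1-2t)^{-\nu/2}$:

```python
import numpy as np

# S_nu = sum of nu squared N(0,1) variables; its mgf should be
# (1 - 2t)^(-nu/2) for t < 1/2.
rng = np.random.default_rng(1)
nu, t = 5, 0.1
z = rng.standard_normal((1_000_000, nu))
s = (z**2).sum(axis=1)

mgf_mc = np.exp(t * s).mean()
mgf_exact = (1 - 2 * t) ** (-nu / 2)
print(mgf_mc, mgf_exact)
```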

Example: The Cauchy density is

\begin{displaymath}\frac{1}{\pi(1+x^2)}\, ;
\end{displaymath}

the corresponding moment generating function is

\begin{displaymath}M(t) = \int_{-\infty}^\infty \frac{e^{tx}}{\pi(1+x^2)} dx
\end{displaymath}

which is $+\infty$ except at $t=0$, where we get 1. This is exactly the mgf of every $t$ distribution, so the mgf is of no use for distinguishing such distributions. The problem is that these distributions do not have infinitely many finite moments.

This observation has led to the development of a substitute for the mgf which is defined for every distribution, namely, the characteristic function.

Characteristic Functions

Definition: The characteristic function of a real rv X is

\begin{displaymath}\phi_X(t) = E(e^{itX})\end{displaymath}

where $i=\sqrt{-1}$ is the imaginary unit.

Aside on complex arithmetic.

The complex numbers are the things you get if you add $i=\sqrt{-1}$ to the real numbers and require that all the usual rules of algebra work. In particular if i and any real numbers a and b are to be complex numbers then so must be a+bi. If we multiply a complex number a+bi with a and b real by another such number, say c+di then the usual rules of arithmetic (associative, commutative and distributive laws) require
\begin{align*}(a+bi)(c+di)= & ac + adi+bci+bdi^2
\\
= & ac +bd(-1) +(ad+bc)i
\\
=& (ac-bd) +(ad+bc)i
\end{align*}
so this is precisely how we define multiplication. Addition is simply (again by following the usual rules)

\begin{displaymath}(a+bi)+(c+di) = (a+c)+(b+d)i
\end{displaymath}

Notice that the usual rules of arithmetic then don't require any more numbers than things of the form

\begin{displaymath}x+yi
\end{displaymath}

where x and y are real. We can identify a single such number x+yi with the corresponding point (x,y) in the plane. It often helps to picture the complex numbers as forming a plane.
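Built-in complex types implement exactly the multiplication rule derived above; a small check in Python:

```python
# The rule (a+bi)(c+di) = (ac-bd) + (ad+bc)i, checked against Python's
# built-in complex arithmetic.
def cmul(a, b, c, d):
    return (a * c - b * d, a * d + b * c)

re, im = cmul(1.0, 2.0, 3.0, 4.0)
w = (1 + 2j) * (3 + 4j)
print((re, im), w)   # (-5.0, 10.0) and (-5+10j)
```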

Now look at transcendental functions. For real x we know $e^x = \sum x^k/k!$ so our insistence on the usual rules working means

\begin{displaymath}e^{x+iy} = e^x e^{iy}
\end{displaymath}

and we need to know how to compute $e^{iy}$. Remember in what follows that $i^2=-1$ so $i^3=-i$, $i^4=1$, $i^5=i$ and so on. Then
\begin{align*}e^{iy} =& \sum_0^\infty \frac{(iy)^k}{k!}
\\
= & 1 + iy + (iy)^2/2! + (iy)^3/3! + \cdots
\\
= & 1 - y^2/2! + y^4/4! - \cdots
\\
& + iy - iy^3/3! + iy^5/5! - \cdots
\\
=& \cos(y) +i\sin(y)
\end{align*}
We can thus write

\begin{displaymath}e^{x+iy} = e^x(\cos(y)+i\sin(y))
\end{displaymath}
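This formula is easy to verify numerically; a minimal check using Python's standard cmath module:

```python
import cmath, math

# Check e^{x+iy} = e^x (cos y + i sin y) at one point.
x, y = 0.7, 2.1
lhs = cmath.exp(complex(x, y))
rhs = math.exp(x) * complex(math.cos(y), math.sin(y))
print(lhs, rhs)
```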

Now every point in the plane can be written in polar co-ordinates as $(r\cos\theta, r\sin\theta)$ and comparing this with our formula for the exponential we see we can write

\begin{displaymath}x+iy = \sqrt{x^2+y^2} e^{i\theta}
\end{displaymath}

for an angle $\theta\in[0,2\pi)$.

We will need from time to time a couple of other definitions:

Definition: The modulus of the complex number x+iy is

\begin{displaymath}\vert x+iy\vert = \sqrt{x^2+y^2}
\end{displaymath}

Definition: The complex conjugate of x+iy is $\overline{x+iy} = x-iy$.

Notes on calculus with complex variables. Essentially the usual rules apply so, for example,

\begin{displaymath}\frac{d}{dt} e^{it} = ie^{it}
\end{displaymath}

We will (mostly) be doing only integrals over the real line; the theory of integrals along paths in the complex plane is a very important part of mathematics, however.

End of Aside

Since

\begin{displaymath}e^{itX} = \cos(tX) + i \sin(tX)
\end{displaymath}

we find that

\begin{displaymath}\phi_X(t) = E(\cos(tX)) + i E(\sin(tX)) \, .
\end{displaymath}

Since the trigonometric functions are bounded by 1, the expected values must be finite for all t; this is precisely the reason for using characteristic rather than moment generating functions in probability theory courses.
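For a concrete case (a sketch of mine, assuming NumPy): for $X$ standard normal, $\phi_X(t) = e^{-t^2/2}$, and both expected values can be estimated by simulation:

```python
import numpy as np

# phi_X(t) = E(cos(tX)) + i E(sin(tX)); for X ~ N(0,1), phi_X(t) = exp(-t^2/2).
rng = np.random.default_rng(2)
x = rng.standard_normal(1_000_000)
t = 1.3
phi_mc = np.cos(t * x).mean() + 1j * np.sin(t * x).mean()
phi_exact = np.exp(-t**2 / 2)
print(phi_mc, phi_exact)
```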

Theorem 1   For any two real rvs X and Y the following are equivalent:

1.
X and Y have the same distribution, that is, for any (Borel) set A we have

\begin{displaymath}P(X\in A) = P( Y \in A)
\end{displaymath}

2.
$F_X(t) = F_Y(t)$ for all t.

3.
$\phi_X(t)=E(e^{itX}) = E(e^{itY}) = \phi_Y(t)$ for all real t.

Moreover, all of these are implied if there is a positive $\epsilon$ such that for all $\vert t\vert \le \epsilon$

\begin{displaymath}M_X(t)=M_Y(t) < \infty\,.
\end{displaymath}

Inversion

The previous theorem is a non-constructive characterization. It does not show us how to get from $\phi_X$ to $F_X$ or $f_X$. For cdfs or densities with reasonable properties, however, there are effective ways to compute F or f from $\phi$. In homework I am asking you to prove the following basic inversion formula:

If X is a random variable taking only integer values then for each integer k
\begin{align*}P(X=k) & = \frac{1}{2\pi} \int_0^{2\pi} \phi_X(t) e^{-itk} dt
\\
& = \frac{1}{2\pi} \int_{-\pi}^{\pi} \phi_X(t) e^{-itk} dt \, .
\end{align*}
The proof proceeds from the formula

\begin{displaymath}\phi_X(t) = \sum_k e^{ikt} P(X=k) \, .
\end{displaymath}
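As a numerical illustration (the Poisson example is mine, not part of the homework; it assumes NumPy): for $X$ Poisson$(\lambda)$ we have $\phi_X(t) = \exp(\lambda(e^{it}-1))$, and evaluating the integral on a grid recovers the Poisson probabilities:

```python
import numpy as np
from math import exp, factorial

# Invert phi_X(t) = exp(lam*(e^{it} - 1)) for X ~ Poisson(lam):
# P(X=k) = (1/2pi) * integral_{-pi}^{pi} phi_X(t) e^{-itk} dt.
lam, k = 4.0, 3
t = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
phi = np.exp(lam * (np.exp(1j * t) - 1))
# grid sum: (1/2pi) * sum * dt with dt = 2pi/4096 is just the mean
p = (phi * np.exp(-1j * t * k)).mean().real

exact = exp(-lam) * lam**k / factorial(k)
print(p, exact)
```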

Now suppose that X has a continuous bounded density f. Define

\begin{displaymath}X_n = [nX]/n
\end{displaymath}

where [a] denotes the integer part of a (the greatest integer less than or equal to a). We have
\begin{align*}P(k/n \le X < (k+1)/n) & = P([nX]=k)
\\
& = \frac{1}{2\pi} \int_{-\pi}^{\pi} \phi_{[nX]}(t) e^{-itk} \, dt \, .
\end{align*}
Make the substitution t=u/n, and get
\begin{displaymath}n P(k/n \le X < (k+1)/n) =\frac{1}{2\pi}
\int_{-n\pi}^{n\pi} \phi_{[nX]}(u/n)e^{-iuk/n} \, du \, .
\end{displaymath}
Now, as $n\to\infty$ we have

\begin{displaymath}\phi_{[nX]}(u/n) = E(e^{iu[nX]/n}) \to E(e^{iuX})
\end{displaymath}

(by the dominated convergence theorem; the dominating random variable is just the constant 1). The range of integration converges to the whole real line, and if $k/n \to x$ then the left hand side converges to the density f(x) while the right hand side converges to

\begin{displaymath}\frac{1}{2\pi} \int_{-\infty}^\infty \phi_X(u) e^{-iux} du
\end{displaymath}

which gives the inversion formula

\begin{displaymath}f_X(x) = \frac{1}{2\pi} \int_{-\infty}^\infty \phi_X(u) e^{-iux} du
\end{displaymath}
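The inversion formula can be tested numerically on the standard normal (my sketch, assuming NumPy). The integral is truncated to a finite range, which is harmless here because the integrand decays rapidly:

```python
import numpy as np

# Recover the N(0,1) density at x from phi(u) = exp(-u^2/2) via
# f(x) = (1/2pi) * integral phi(u) e^{-iux} du (grid sum; tails negligible).
x = 1.0
u = np.arange(-40.0, 40.0, 0.01)
vals = np.exp(-u**2 / 2) * np.exp(-1j * u * x)
f = vals.sum().real * 0.01 / (2 * np.pi)

exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
print(f, exact)
```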

Many other such formulas are available to compute things like $F(b) - F(a)$ and so on.

All such formulas are sometimes referred to as Fourier inversion formulas; the characteristic function itself is sometimes called the Fourier transform of the distribution or cdf or density of X.

Inversion of the Moment Generating Function

The moment generating function and the characteristic function are related formally by

\begin{displaymath}M_X(it) = \phi_X(t)
\end{displaymath}

When $M_X$ exists this relationship is not merely formal; the methods of complex variables mean there is a ``nice'' (analytic) function which is $E(e^{zX})$ for any complex z=x+iy for which $M_X(x)$ is finite. All this means that there is an inversion formula for $M_X$. This formula requires a complex contour integral. In general if $z_1$ and $z_2$ are two points in the complex plane and C is a path between these two points we can define the path integral

\begin{displaymath}\int_C f(z) dz
\end{displaymath}

by the methods of line integration. When it comes to doing algebra with such integrals the usual theorems of calculus still work. The Fourier inversion formula was

\begin{displaymath}2\pi f(x) = \int_{-\infty}^\infty \phi(t) e^{-itx} dt
\end{displaymath}

so replacing $\phi$ by M we get

\begin{displaymath}2 \pi f(x) = \int_{-\infty}^\infty M(it) e^{-itx} dt
\end{displaymath}

If we just substitute z=it then we find

\begin{displaymath}2\pi i f(x) = \int_C M(z) e^{-zx} dz
\end{displaymath}

where the path C is the imaginary axis. This formula becomes useful through the methods of complex integration, which permit us to replace the path C by any other path which starts and ends at the same place. It is possible, in some cases, to choose this path to make it easy to do the integral approximately; this is what saddlepoint approximations are. This inversion formula is called the inverse Laplace transform; the mgf is also called the Laplace transform of the distribution or of the cdf or of the density.

Applications of Inversion

1): Numerical calculations

Example: Many statistics have a distribution which is approximately that of

\begin{displaymath}T= \sum \lambda_j Z_j^2
\end{displaymath}

where the $Z_j$ are iid N(0,1). In this case
\begin{align*}E(e^{itT}) & = \prod E(e^{it\lambda_j Z_j^2})
\\
& = \prod (1-2it\lambda_j)^{-1/2} \, .
\end{align*}
Imhof (Biometrika, 1961) gives a simplification of the Fourier inversion formula for

\begin{displaymath}F_T(x) - F_T(0)
\end{displaymath}

which can be evaluated numerically.
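Imhof's simplification is in the paper; as a crude substitute, the sketch below (mine, with made-up weights $\lambda_j$, assuming NumPy) inverts $\phi_T$ directly using the Gil-Pelaez formula $F_T(x) = \frac{1}{2} - \frac{1}{\pi}\int_0^\infty {\rm Im}(e^{-itx}\phi_T(t))/t \, dt$ and checks the answer by simulation:

```python
import numpy as np

# T = sum lam_j Z_j^2 with illustrative weights (not from Imhof's paper).
lam = np.array([1.0, 0.5, 0.2])
x = 2.0

# Gil-Pelaez inversion: F(x) = 1/2 - (1/pi) int_0^inf Im(e^{-itx} phi(t))/t dt,
# with phi(t) = prod_j (1 - 2 i t lam_j)^(-1/2); grid sum, truncated at t=200.
dt = 5e-4
t = np.arange(dt, 200.0, dt)
phi = np.prod((1 - 2j * t[:, None] * lam) ** (-0.5), axis=1)
integrand = np.imag(np.exp(-1j * t * x) * phi) / t
F = 0.5 - integrand.sum() * dt / np.pi

# Monte Carlo check
rng = np.random.default_rng(3)
z = rng.standard_normal((400_000, 3))
F_mc = ((lam * z**2).sum(axis=1) <= x).mean()
print(F, F_mc)
```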

2): The central limit theorem (in some versions) can be deduced from the Fourier inversion formula: if $X_1,\ldots,X_n$ are iid with mean 0 and variance 1 and $T=n^{1/2}\bar{X}$ then with $\phi$ denoting the characteristic function of a single X we have
\begin{align*}E(e^{itT}) & = E(e^{in^{-1/2} t\sum X_j})
\\
& = \left[\phi(n^{-1/2}t)\right]^n
\\
& = \left[\phi(0) + \frac{t\phi^\prime(0)}{\sqrt{n}}
+ \frac{t^2\phi^{\prime\prime}(0)}{2n}+ o(n^{-1})\right]^n
\end{align*}
But now $\phi(0) = 1$ and

\begin{displaymath}\phi^\prime(t) = \frac{d}{dt} E(e^{itX_1}) = iE(X_1e^{itX_1})
\end{displaymath}

So $\phi^\prime(0) = E(X_1) =0$. Similarly

\begin{displaymath}\phi^{\prime\prime}(t) = i^2 E(X_1^2e^{itX_1})
\end{displaymath}

so that

\begin{displaymath}\phi^{\prime\prime}(0) = -E(X_1^2) =-1
\end{displaymath}

It now follows that
\begin{align*}E(e^{itT}) & \approx [1-t^2/(2n) + o(1/n)]^n
\\
& \to e^{-t^2/2}
\end{align*}
With care we can then apply the Fourier inversion formula and get
\begin{align*}f_T(x) & = \frac{1}{2\pi} \int_{-\infty}^\infty e^{-itx}
\left[\phi(t/\sqrt{n})\right]^n dt
\\
& \to \frac{1}{2\pi} \int_{-\infty}^\infty e^{-itx} e^{-t^2/2} dt
\\
&=\frac{1}{\sqrt{2\pi}} \phi_Z(-x)
\end{align*}
where $ \phi_Z $ is the characteristic function of a standard normal variable Z. Doing the integral we find

\begin{displaymath}\phi_Z(x) = \phi_Z(-x) = e^{-x^2/2}
\end{displaymath}

so that

\begin{displaymath}f_T(x) \to \frac{1}{\sqrt{2\pi}} e^{-x^2/2}
\end{displaymath}

which is the density of a standard normal random variable.

This proof of the central limit theorem is not terribly general, since it requires T to have a bounded continuous density. The central limit theorem itself is a statement about cdfs, not densities, and is

\begin{displaymath}P(T \le t) \to P(Z \le t) \, .
\end{displaymath}
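The convergence of $[\phi(t/\sqrt{n})]^n$ to $e^{-t^2/2}$ can be watched directly. A sketch (my example, assuming NumPy) with $X_j$ uniform on $(-\sqrt{3},\sqrt{3})$, so that the mean is 0, the variance is 1 and $\phi(t) = \sin(\sqrt{3}\,t)/(\sqrt{3}\,t)$:

```python
import numpy as np

# X_j uniform on (-a, a) with a = sqrt(3): mean 0, variance 1, and
# phi(t) = sin(a t)/(a t).  Watch [phi(t/sqrt(n))]^n approach exp(-t^2/2).
a = np.sqrt(3.0)
t = 1.5
limit = np.exp(-t**2 / 2)
for n in [1, 10, 100, 1000]:
    s = t / np.sqrt(n)
    phin = (np.sin(a * s) / (a * s)) ** n
    print(n, phin)
print("limit", limit)
```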





Richard Lockhart
2000-01-19