
STAT 804

Lecture 11

Likelihood Theory

First we review likelihood theory for conditional and full maximum likelihood estimation.

Suppose the data are $X=(Y,Z)$ and write the density of $X$ as

\begin{displaymath}f(x\vert\theta) = f(y\vert z,\theta)f(z\vert\theta)
\end{displaymath}
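
Taking logarithms, the full log likelihood splits into a conditional piece plus a marginal piece, so if $U_X(\theta)$, $U_{Y\vert Z}(\theta)$ and $U_Z(\theta)$ denote the derivatives with respect to $\theta$ of $\log f(x\vert\theta)$, $\log f(y\vert z,\theta)$ and $\log f(z\vert\theta)$ respectively, then

\begin{displaymath}U_X(\theta) = U_{Y\vert Z}(\theta) + U_Z(\theta)
\end{displaymath}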

Differentiate the identity

\begin{displaymath}1 = \int f(y\vert z,\theta) dy
\end{displaymath}

with respect to $\theta_j$ (the jth component of $\theta$) and pull the derivative under the integral sign to get
\begin{align*}0 & = \int \frac{\partial f(y\vert z,\theta)}{\partial\theta_j} \, dy
\\
& = \int \frac{\partial \log f(y\vert z,\theta)}{\partial\theta_j}
f(y\vert z,\theta) \, dy
\\
&= \text{E}_\theta(U_{Y\vert Z;j}(\theta)\vert Z)
\end{align*}
where $U_{Y\vert Z;j}(\theta)$ is the jth component of $ U_{Y\vert Z}(\theta)$, the derivative of the log conditional likelihood; $U_{Y\vert Z}$ is called a conditional score. Since

\begin{displaymath}\text{E}_\theta(U_{Y\vert Z;j}(\theta)\vert Z) = 0
\end{displaymath}

we may take expected values to see that

\begin{displaymath}\text{E}_\theta(U_{Y\vert Z;j}(\theta)) = 0
\end{displaymath}

It is also true that the other two scores $U_X(\theta)$ and $U_Z(\theta)$ have mean 0 (when the expectation is computed at the true value of $\theta$). Differentiate the identity a further time with respect to $\theta_k$ to get

\begin{displaymath}0 = \int \frac{\partial^2 \log
f(y\vert z,\theta)}{\partial\theta_j\partial\theta_k}
f(y\vert z,\theta) \, dy
+ \int \frac{\partial \log f(y\vert z,\theta)}{\partial\theta_j}
\frac{\partial \log f(y\vert z,\theta)}{\partial\theta_k}
f(y\vert z,\theta) \, dy
\end{displaymath}

Writing $\ell(\theta) = \log f(Y\vert Z,\theta)$ for the conditional log likelihood, we define the conditional Fisher information matrix $ I_{Y\vert Z}(\theta)$ to have jkth entry

\begin{displaymath}\text{E}
\left[- \frac{\partial^2 \ell}{\partial\theta_j\partial\theta_k}\vert Z\right]
\end{displaymath}

and get

\begin{displaymath}I_{Y\vert Z}(\theta\vert Z) = \text{Var}_\theta(U_{Y\vert Z}(\theta)\vert Z)
\end{displaymath}
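
To see why, note that the first integral in the second derivative identity is $\text{E}_\theta(\partial^2\ell/\partial\theta_j\partial\theta_k\vert Z)$ and the second is $\text{E}_\theta(U_{Y\vert Z;j}(\theta)U_{Y\vert Z;k}(\theta)\vert Z)$; since the conditional score has conditional mean 0, the identity rearranges to

\begin{displaymath}\text{E}_\theta\left[-\frac{\partial^2 \ell}{\partial\theta_j\partial\theta_k}\Big\vert Z\right]
= \text{Cov}_\theta(U_{Y\vert Z;j}(\theta),U_{Y\vert Z;k}(\theta)\vert Z)
\end{displaymath}

which is the jkth entry of the conditional variance matrix on the right hand side.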

The corresponding identities based on $f_X$ and $f_Z$ are

\begin{displaymath}I_X(\theta) = \text{Var}_\theta(U_X(\theta))
\end{displaymath}

and

\begin{displaymath}I_Z(\theta) = \text{Var}_\theta(U_Z(\theta))
\end{displaymath}
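
Since $U_X(\theta) = U_{Y\vert Z}(\theta) + U_Z(\theta)$ and $\text{E}_\theta(U_{Y\vert Z}(\theta)\vert Z) = 0$, the two pieces of the full score are uncorrelated, and taking variances gives

\begin{displaymath}I_X(\theta) = \text{E}_\theta\left[I_{Y\vert Z}(\theta\vert Z)\right] + I_Z(\theta)
\end{displaymath}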

Now let's look at the model $X_t = \rho X_{t-1}+\epsilon_t$. Putting $Y=(X_1,\ldots,X_{T-1})$ and $Z=X_0$ we find

\begin{displaymath}U_{Y\vert Z}(\rho,\sigma) =
\left[
\begin{array}{l}
\frac{\sum_1^{T-1}X_{t-1}(X_t-\rho X_{t-1})}{\sigma^2}
\\
\frac{\sum_1^{T-1}(X_t-\rho X_{t-1})^2}{\sigma^3} -\frac{T-1}{\sigma}
\end{array}
\right]
\end{displaymath}
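
This is the gradient of the conditional log likelihood; assuming, as the form of the score indicates, that the errors $\epsilon_t$ are independent $N(0,\sigma^2)$, that log likelihood is (up to an additive constant)

\begin{displaymath}\ell(\rho,\sigma) = -(T-1)\log\sigma
- \frac{\sum_1^{T-1}(X_t-\rho X_{t-1})^2}{2\sigma^2}
\end{displaymath}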

Differentiating again gives the matrix of second derivatives

\begin{displaymath}\left[
\begin{array}{cc}
-\frac{\sum_1^{T-1}X_{t-1}^2}{\sigma^2}
&
-\frac{2\sum_1^{T-1}X_{t-1}(X_t-\rho X_{t-1})}{\sigma^3}
\\
-\frac{2\sum_1^{T-1}X_{t-1}(X_t-\rho X_{t-1})}{\sigma^3}
&
-\frac{3\sum_1^{T-1}(X_t-\rho X_{t-1})^2}{\sigma^4} +\frac{T-1}{\sigma^2}
\end{array}\right]
\end{displaymath}

Taking conditional expectations of minus this matrix given $X_0$ gives

\begin{displaymath}I_{Y\vert Z}(\rho,\sigma) = \left[
\begin{array}{cc}
\frac{\sum_1^{T-1}\text{E}(X_{t-1}^2\vert X_0)}{\sigma^2}
&
0 \\ 0 &
\frac{2(T-1)}{\sigma^2}
\end{array}\right]
\end{displaymath}
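
The off-diagonal entries vanish and the lower right entry simplifies because $X_t-\rho X_{t-1} = \epsilon_t$ is independent of $X_{t-1}$ and of $X_0$ for each $t\ge 1$, so that

\begin{displaymath}\text{E}\left[X_{t-1}(X_t-\rho X_{t-1})\vert X_0\right] = 0
\qquad\text{and}\qquad
\text{E}\left[(X_t-\rho X_{t-1})^2\vert X_0\right] = \sigma^2
\end{displaymath}

which makes the conditional expectation of the lower right entry of minus the second derivative matrix equal to $3(T-1)\sigma^2/\sigma^4 - (T-1)/\sigma^2 = 2(T-1)/\sigma^2$.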

To compute $W_k \equiv \text{E}[X_k^2\vert X_0]$ write $X_k = \rho X_{k-1}+\epsilon_k$ and get

\begin{displaymath}W_k =\rho^2 W_{k-1} + \sigma^2
\end{displaymath}

with $W_0=X_0^2$. You can check carefully that, provided $\vert\rho\vert < 1$, $W_k$ converges to some $W_\infty$ as $k\to\infty$. This $W_\infty$ satisfies $W_\infty = \rho^2W_\infty + \sigma^2$, which gives

\begin{displaymath}W_\infty = \frac{\sigma^2}{1-\rho^2}
\end{displaymath}
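
Iterating the recursion makes this explicit:

\begin{displaymath}W_k = \rho^{2k} X_0^2 + \sigma^2\left(1+\rho^2+\cdots+\rho^{2(k-1)}\right)
= \rho^{2k} X_0^2 + \frac{\sigma^2(1-\rho^{2k})}{1-\rho^2}
\end{displaymath}

which converges to $\sigma^2/(1-\rho^2)$ for any starting value $X_0$ when $\vert\rho\vert < 1$.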

It follows that

\begin{displaymath}\frac{1}{T} I_{Y\vert Z}(\rho,\sigma) \to
\left[
\begin{array}{cc}
\frac{1}{1-\rho^2} & 0 \\ 0 & \frac{2}{\sigma^2}
\end{array}\right]
\end{displaymath}
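
For the upper left entry this uses the fact that Cesàro averages of a convergent sequence converge to the same limit:

\begin{displaymath}\frac{1}{T}\sum_1^{T-1}\frac{\text{E}(X_{t-1}^2\vert X_0)}{\sigma^2}
= \frac{1}{T}\sum_1^{T-1}\frac{W_{t-1}}{\sigma^2}
\to \frac{W_\infty}{\sigma^2} = \frac{1}{1-\rho^2}
\end{displaymath}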

Notice that although the conditional Fisher information might have been expected to depend on $X_0$, it does not, at least for long series.





Richard Lockhart
1999-11-01