
STAT 804: Lecture 1

I begin the course by presenting plots of some time series, discussing the series using some of the jargon we will study, and then introducing some basic technical ideas.

Plots of some series


Comments on the data sets:

I made these plots with S-Plus using the following code:

postscript("tsplots.ps",horizontal=F)
par(mfrow=c(3,2))
tsplot(sunspots,main="Mean Monthly Sunspot Numbers")
tsplot(lynx, main="Annual Sales of Lynx\n to Hudson's Bay Co.")
tsplot(flow, ylab="Cubic Meters per Second",
           main="Mean Monthly Flow\nFraser River at Hope")
tsplot(unemployment, main="Unemployment: Canada",ylab="Thousands")
tsplot(co2, main="CO2 concentration: Mauna Loa",
           ylab="Parts per Million")
tsplot(changes, main="Changes in length of day", ylab="Seconds?")
dev.off()

Basic jargon

We will study data of the sort plotted here using the idea of a stochastic process. Technically, a stochastic process is a family $\{ X_i ; i \in I\}$ of random variables indexed by a set $I$. In practice the jargon is used only when the $X_i$ are not independent.

If $I$ is a subset of the real line, then we often call $\{ X_i ; i \in I\}$ a time series. Of course the usual situation is that $i$ actually indexes a time point at which some measurement was made.

Two important special cases are: $I$ an interval in $R$, the real line, in which case we say $X$ is a series in continuous time; and $I \subset \{ \ldots, -3,-2,-1,0,1,2,3,\ldots\}$, in which case $X$ is in discrete time.

Here is a list of some models used for time series:

This course is about the discrete time version of these linear time series models. We will assume throughout that we have data $X_0,X_1,\ldots,X_{T-1}$ where the $X_t$ are real random variables.

A model is a family $\{P_\theta;\theta\in\Theta\}$ of possible joint distributions for $\{X_0,\ldots,X_{T-1}\}$.

Goal: guess the true value of $\theta$. (Notice that it is an assumption that the distribution of the data is, in fact, one of the possibilities.)

The question is this: is it possible to guess the true value of $\theta$? Will collecting more data (increasing $T$) make more accurate estimation of $\theta$ possible? In general the answer is no. For instance, in the Galton-Watson process, even when you watch infinitely many generations you do not get enough data to pin down the parameter values.
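To make the Galton-Watson example concrete, here is a minimal simulation sketch (my own illustration, not part of the notes). It assumes Poisson offspring counts; the function name and defaults are invented for this example.

```python
import numpy as np

def galton_watson(generations, offspring_mean, z0=1, seed=1):
    """Simulate a Galton-Watson branching process: each individual in a
    generation has an independent Poisson(offspring_mean) number of children.
    Returns the population sizes in generations 0, 1, ..., generations."""
    rng = np.random.default_rng(seed)
    sizes = [z0]
    for _ in range(generations):
        z = sizes[-1]
        # Once the process hits 0 it is absorbed there.
        sizes.append(int(rng.poisson(offspring_mean, size=z).sum()) if z > 0 else 0)
    return sizes

# A supercritical path (mean offspring > 1) typically grows geometrically.
print(galton_watson(10, offspring_mean=2.0))
```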

Example: Suppose that $(X_0,\ldots,X_{T-1})$ has a multivariate normal distribution with mean vector $(\mu_0,\ldots,\mu_{T-1})$ and $T\times T$ variance-covariance matrix $\Sigma$. The big problem is that with $T$ data points you have $T + T(T+1)/2$ parameters to estimate ($T$ means plus the distinct entries of the symmetric matrix $\Sigma$); this is not possible. To make progress you must put restrictions on the parameters $\mu$ and $\Sigma$. For instance, you might assume one of the following:

1. Constant mean: $\mu_t \equiv \mu$.

2. Linear trend: $\mu_t = \alpha+\beta t$.

3. Linear trend and sinusoidal variation:

\begin{displaymath}\mu_t = \alpha+\beta t + \gamma_1 \sin\left(\frac{2\pi t}{12}\right)
+ \gamma_2 \cos\left(\frac{2\pi t}{12}\right)
\end{displaymath}
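Model 3 is linear in $\alpha$, $\beta$, $\gamma_1$, $\gamma_2$, so it can be fit by ordinary least squares. A sketch using simulated monthly data (all numerical values here are my own illustrative choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120
t = np.arange(T)

# Simulate a series following model 3: linear trend plus period-12
# seasonality plus noise (true parameter values chosen for illustration).
alpha, beta, g1, g2 = 10.0, 0.05, 2.0, -1.0
x = (alpha + beta * t
     + g1 * np.sin(2 * np.pi * t / 12)
     + g2 * np.cos(2 * np.pi * t / 12)
     + rng.normal(scale=0.5, size=T))

# Design matrix: intercept, trend, and the two sinusoidal regressors.
X = np.column_stack([np.ones(T), t,
                     np.sin(2 * np.pi * t / 12),
                     np.cos(2 * np.pi * t / 12)])
coef, *_ = np.linalg.lstsq(X, x, rcond=None)
print(coef)  # estimates of (alpha, beta, gamma_1, gamma_2)
```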

We can estimate the parameters here by regression, but we still have a problem: we cannot get standard errors. For instance, we might estimate $\mu$ in 1) above using $\bar X$. In that case

\begin{eqnarray*}{\rm Var}(\bar X) & = & T^{-2}{\rm Var}\left(\sum X_t\right)
\\
& = & T^{-2}{\bf 1}^t \Sigma {\bf 1}
\\
& = & T^{-2} \sum_{s,t} \Sigma_{st}
\end{eqnarray*}


where $\bf 1$ is a column vector of $T$ 1s. So we must model $\Sigma$ as well as $\mu$.
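The identity ${\rm Var}(\bar X) = T^{-2}{\bf 1}^t \Sigma {\bf 1}$ is easy to check numerically. A sketch assuming a particular stationary $\Sigma$ with entries $\rho^{|s-t|}$ (an AR(1)-type shape, chosen purely for illustration):

```python
import numpy as np

T, rho = 50, 0.6
idx = np.arange(T)
# Assumed stationary covariance matrix: Sigma[s, t] = rho ** |s - t|.
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])

ones = np.ones(T)
var_xbar = ones @ Sigma @ ones / T**2            # T^{-2} 1' Sigma 1
assert np.isclose(var_xbar, Sigma.sum() / T**2)  # = T^{-2} sum_{s,t} Sigma_{st}

# Under independence Var(xbar) would be 1/T; positive autocorrelation
# inflates it, which is why naive standard errors are misleading.
print(var_xbar, 1 / T)
```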

The assumption we will make in this course is stationarity:

\begin{displaymath}{\rm Cov}(X_t,X_s) = {\rm Cov} (X_{t+1},X_{s+1}) = {\rm Cov}
(X_{t+2},X_{s+2})
= \cdots
\end{displaymath}

If so, then for all $t$ and $h$ we find

\begin{displaymath}{\rm Cov}(X_t,X_{t+h}) = {\rm Cov}(X_0,X_h) \equiv C_X(h)
\end{displaymath}

(which we will call the autocovariance function of $X$). Then we see that $\Sigma$ has $C_X(0)$ down the diagonal, $C_X(1)$ down the first sub- and superdiagonals, $C_X(2)$ down the next sub- and superdiagonals, and so on. Such a matrix is called a Toeplitz matrix.
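As an illustration (not from the notes), the Toeplitz structure can be built directly from an autocovariance function; the helper name and the example autocovariance are my own choices:

```python
import numpy as np

def toeplitz_cov(C, T):
    """Build the T x T covariance matrix with entries Sigma[s, t] = C(|s - t|),
    i.e. the Toeplitz matrix generated by the autocovariance function C."""
    idx = np.arange(T)
    lags = np.abs(idx[:, None] - idx[None, :])
    return np.asarray([C(h) for h in range(T)])[lags]

# Example autocovariance C(h) = 0.5 ** h (an AR(1)-type shape, chosen
# purely for illustration).
Sigma = toeplitz_cov(lambda h: 0.5 ** h, 5)
print(Sigma)
# Each diagonal of Sigma is constant: the entry depends only on |s - t|.
```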





Richard Lockhart
1999-09-19