20  autocorrelation

See the temperatures for Jerusalem in a 4-day interval:

20.1 question

If I know the temperature right now, what does that tell me about the temperature 10 minutes from now? How about 100 minutes? 1000 minutes?

To answer this, we need to talk about autocorrelation. Let’s start by introducing the necessary concepts.

20.2 mean and standard deviation

Let’s call our time series from above \(X\), and its length \(N\). Then:

\[ \begin{aligned} \text{mean}& &\mu &= \frac{\displaystyle\sum_{i=1}^N X_i}{N} \\ \text{standard deviation}& &\sigma &= \sqrt{\frac{\displaystyle\sum_{i=1}^N (X_i-\mu)^2}{N}} \end{aligned} \]
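As a minimal sketch, the two formulas above can be written out in plain Python. The function names `mean` and `std` and the short `temps` list are made up for illustration; they are not the Jerusalem data. Note that the standard deviation divides by \(N\), as in the formula, not by \(N-1\).

```python
import math


def mean(x):
    """Arithmetic mean: the sum of the values divided by their count N."""
    return sum(x) / len(x)


def std(x):
    """Population standard deviation (divides by N, not N - 1)."""
    mu = mean(x)
    return math.sqrt(sum((xi - mu) ** 2 for xi in x) / len(x))


# Hypothetical temperature readings in degrees Celsius, for illustration only.
temps = [18.0, 20.0, 22.0, 20.0]
mu = mean(temps)
sigma = std(temps)
print(f"mean = {mu:.3f}, standard deviation = {sigma:.3f}")
```

The standard library offers the same quantities as `statistics.fmean` and `statistics.pstdev`, which can serve as a cross-check.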

The mean and standard deviation can be visualized thus:

One last basic concept we need is the expected value: \[ E[X] = \sum_{i=1}^N X_i p_i \]

For our time series, every point \(X_i\) is equally likely, so its probability is simply \(p_i = 1/N\), and the expectation becomes

\[ E[X] = \frac{\displaystyle\sum_{i=1}^N X_i}{N} \]
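A quick numerical check of this statement, as a sketch under the assumption of equal weights: the helper `expected_value` and the sample values are hypothetical, chosen only to show that the weighted sum with \(p_i = 1/N\) reproduces the arithmetic mean.

```python
def expected_value(values, probs):
    """Weighted sum of the values, each multiplied by its probability."""
    return sum(v * p for v, p in zip(values, probs))


# Hypothetical readings, for illustration only.
temps = [18.0, 20.0, 22.0, 20.0]
n = len(temps)

# Every point gets the same probability 1/N.
uniform_probs = [1 / n] * n

e_uniform = expected_value(temps, uniform_probs)
arithmetic_mean = sum(temps) / n
print(e_uniform, arithmetic_mean)
```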

20.3 autocorrelation

The autocorrelation of a time series \(X\) is the answer to the following question:

if we shift \(X\) by \(\tau\) units, how similar will this be to the original signal?

In other words:

how correlated are \(X(t)\) and \(X(t+\tau)\)?

Using the Pearson correlation coefficient

Pearson correlation coefficient between \(X\) and \(Y\): \[ \rho_{X,Y} = \frac{E\left[ (X - \mu_X)(Y - \mu_Y) \right]}{\sigma_X\sigma_Y} \]

we get

\[ \rho_{XX}(\tau) = \frac{E\left[ (X_t - \mu)(X_{t+\tau} - \mu) \right]}{\sigma^2} \]
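The formula can be sketched directly in plain Python. Several choices here are assumptions rather than part of the definition above: the function name `autocorrelation` is made up, the shift \(\tau\) is measured in samples, and the expectation is estimated by averaging over the \(N-\tau\) pairs that overlap after the shift (some conventions divide by \(N\) instead). The test signal is a synthetic sine wave, not the temperature data.

```python
import math


def autocorrelation(x, tau):
    """Estimate rho_XX(tau) for a lag of tau samples.

    Uses the mean and variance of the whole series, and averages the
    lagged products over the N - tau overlapping pairs (an assumption;
    other conventions divide by N).
    """
    n = len(x)
    if not 0 <= tau < n:
        raise ValueError("tau must satisfy 0 <= tau < len(x)")
    mu = sum(x) / n
    var = sum((xi - mu) ** 2 for xi in x) / n
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    pairs = n - tau
    cov = sum((x[t] - mu) * (x[t + tau] - mu) for t in range(pairs)) / pairs
    return cov / var


# Synthetic signal: a sine wave with a period of 24 samples, 4 periods long.
period = 24
signal = [math.sin(2 * math.pi * t / period) for t in range(4 * period)]

rho_half = autocorrelation(signal, period // 2)  # shift by half a period
rho_full = autocorrelation(signal, period)       # shift by a full period
print(f"half period: {rho_half:.3f}, full period: {rho_full:.3f}")
```

For this periodic signal, shifting by half a period gives a value close to -1 (the shifted copy is the mirror image of the original), while shifting by a full period gives a value close to +1.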

A video is worth a billion words, so let’s see the autocorrelation in action:

A few comments:

  • The autocorrelation for \(\tau=0\) (zero shift) is always 1.
    [Can you prove this? All the necessary equations are above!]