# 25 autocorrelation

In this section, we will use a few basic concepts from statistics. A solid grasp of these concepts is essential for making sense of stationarity.

## 25.1 mean and standard deviation

Let’s call our time series X, and its length N. Then:

\begin{aligned} \text{mean}& &\mu &= \frac{\displaystyle\sum_{i=1}^N X_i}{N} \\ \text{standard deviation}& &\sigma &= \sqrt{\frac{\displaystyle\sum_{i=1}^N (X_i-\mu)^2}{N}} \end{aligned}
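As a quick check, the two formulas above can be evaluated directly with NumPy. This is only a minimal sketch: the series `X` below is a made-up toy example, not data from this chapter.

```python
import numpy as np

# Hypothetical toy series, used only for illustration.
X = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
N = len(X)

# Mean: sum of all points divided by N.
mu = X.sum() / N

# Standard deviation: root of the mean squared deviation (dividing by N).
sigma = np.sqrt(((X - mu) ** 2).sum() / N)

print(mu, sigma)
```

NumPy's built-ins `X.mean()` and `X.std()` should give the same numbers, because `std` divides by N by default (`ddof=0`).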

The mean and standard deviation can be visualized thus:

## 25.2 expected value

The expected value (or expectation) of a variable X is given by E[X] = \sum_{i=1}^N X_i p_i.

Here p_i is the weight, or probability, with which X_i occurs. For a time series, every point X_i in the dataset is equally likely, so p_i is simply 1/N. We can therefore write the following measures in terms of expected values:

- mean, also called 1st moment: \mu = E[X].
- variance, also called 2nd central moment: \sigma^2 = E[(X-E[X])^2] = E[(X-\mu)^2]. Its square root, \sigma, is the standard deviation.
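To see that the expected-value notation gives back the familiar mean and variance, here is a small sketch with equal weights p_i = 1/N. The helper `expectation` and the toy series are hypothetical names introduced just for this illustration.

```python
import numpy as np

def expectation(values, weights):
    """Weighted sum E[X] = sum_i X_i * p_i (illustrative helper)."""
    return np.sum(values * weights)

# Hypothetical toy series; every point gets the same weight 1/N.
X = np.array([1.0, 2.0, 3.0, 4.0])
N = len(X)
p = np.full(N, 1.0 / N)

mu = expectation(X, p)               # 1st moment: the mean
var = expectation((X - mu) ** 2, p)  # 2nd central moment: the variance
sigma = np.sqrt(var)                 # standard deviation

print(mu, var, sigma)
```

With equal weights, `mu` and `var` should agree with `X.mean()` and `X.var()`.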

## 25.3 covariance

The covariance between two time series X and Y is given by

\begin{split} \text{cov}(X,Y) &= E[(X-E[X])(Y-E[Y])]\\ &= E[(X-\mu_X)(Y-\mu_Y)] \end{split}

Compare this to the definition of the variance, and it is clear that the covariance \text{cov}(X,X) of a time series with itself is its variance.
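The definition above translates almost word for word into code. The sketch below assumes two series of equal length and uses made-up values; the function name `cov` is our own choice, not a library call.

```python
import numpy as np

def cov(x, y):
    """Covariance E[(X - mu_X)(Y - mu_Y)], with equal weights 1/N."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

# Hypothetical toy series of equal length.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 1.0, 4.0, 3.0])

print(cov(X, Y))            # covariance between X and Y
print(cov(X, X), X.var())   # covariance of X with itself vs. its variance
```

Note that `np.cov` divides by N-1 by default, so it only matches this definition when called with `bias=True`.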

## 25.4 correlation

We are almost there. I promise.

The fact that \text{cov}(X,X) = \sigma_X^2 suggests defining a new, normalized measure, the correlation:

\text{corr}(X,Y) = \frac{E[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y}.

This is convenient, because now we can say that the correlation of a time series with itself is \text{corr}(X,X)=1.

This is also called the Pearson correlation coefficient, and its value always lies between -1 and 1.

Source: Wikimedia
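The correlation formula above can be sketched in a few lines. The function name `corr` and the toy series are assumptions made for this example; `np.corrcoef` is NumPy's own implementation and should agree with it.

```python
import numpy as np

def corr(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return cov_xy / (x.std() * y.std())

# Hypothetical toy series of equal length.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 1.0, 4.0, 3.0])

print(corr(X, Y))    # somewhere between -1 and 1
print(corr(X, X))    # a series is perfectly correlated with itself
print(corr(X, -X))   # and perfectly anti-correlated with its negative
```

Because the same normalization appears in the numerator and the denominator, the result does not depend on whether we divide by N or by N-1.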

## 25.5 autocorrelation

The autocorrelation of a time series X is the answer to the following question:

if we shift X by \tau units, how similar will this be to the original signal?

In other words:

how correlated are X(t) and X(t+\tau)?

The autocorrelation is expressed as \rho_{XX}(\tau) = \frac{E\left[ (X_t - \mu)(X_{t+\tau} - \mu) \right]}{\sigma^2}

In some other disciplines, the term autocorrelation refers to the autocovariance, i.e., it is not normalized by dividing by \sigma^2. In time series analysis, the autocorrelation is assumed to be normalized, and therefore lies between -1 and 1.

The autocorrelation function \rho_{XX}(\tau) provides a useful measure of the degree of dependence among the values of a time series at different times.

A video is worth a billion words, so let’s see the autocorrelation in action:

A few comments:

- The autocorrelation for \tau=0 (zero shift) is always 1.

[Can you prove this? All the necessary equations are above!]