import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
import scipy.stats as stats
from sklearn.ensemble import RandomForestRegressor
import concurrent.futures
from datetime import datetime, timedelta
from sklearn.cluster import KMeans
import math
import scipy
from scipy.signal import find_peaks
# %matplotlib widget
49 filtering 1
Download this csv file before you start. It contains the dataset used in this notebook.
In this practice we will filter frequencies in our data.
Import data
T | |
---|---|
datetime | |
2000-01-01 00:00:00 | 16.791667 |
2000-01-01 02:00:00 | 16.975000 |
2000-01-01 04:00:00 | 16.825000 |
2000-01-01 06:00:00 | 17.050000 |
2000-01-01 08:00:00 | 19.900000 |
... | ... |
2020-12-31 14:00:00 | 17.341667 |
2020-12-31 16:00:00 | 14.900000 |
2020-12-31 18:00:00 | 13.308333 |
2020-12-31 20:00:00 | 12.925000 |
2020-12-31 22:00:00 | 12.983333 |
92052 rows × 1 columns
plot
49.0.1 Apply FFT
Keep only positive k values
np.seterr(divide='ignore')
fig, ax = plt.subplots()
ax.plot(k, fft_abs)
ax.set(xlabel="hours/cycle",
ylabel=r"$|F|$")
ax.plot(k[peaks], fft_abs[peaks], "x")
# ax.set_xscale('log')
# ax.set_xlim(-50,450)
for peak in peaks:
ax.text(k[peak], fft_abs[peak], f'{k[peak]:.6f}', ha='center', va='bottom')
None
# print(1/k[peaks])
49.1 filtering
By transforming our signal into an array of frequencies, each with its own weight, we’ve prepared it for manipulation. This array represents the composite frequencies that construct our original signal. Through manipulation—specifically, filtering—we can adjust the weights of these frequencies, often zeroing some out. Filtering allows us to modify the original signal in a controlled manner.
def fft_filter(array, condition):
# this function recieves 2 arguments:
# array - np array of the timeseries that needs filtering
# condition - np array of a mask of boolean values
# true values will be kept and false values will be set to 0
# FFT the signal
sig_fft = scipy.fft.fft(array)
# copy the FFT results
sig_fft_filtered = sig_fft.copy()
sig_fft_filtered[~condition] = 0
# get the filtered signal in time domain
filtered = scipy.fft.ifft(sig_fft_filtered)
return filtered
49.2 low-pass filter
cut_off = 0.005
fig, ax = plt.subplots()
ax.plot(k, fft_abs)
ax.set(xlabel="frequency (1/hour)",
ylabel=r"$|F|$",
title=r"lowpass")
ax.axvspan(0, cut_off, color='lightblue', alpha=0.5, label='passed frequencies')
ax.legend()
fig, ax = plt.subplots()
ax.plot(x)
condition = np.abs(freq) < cut_off
ax.plot(fft_filter(x, condition))
/opt/anaconda3/lib/python3.9/site-packages/matplotlib/cbook/__init__.py:1298: ComplexWarning: Casting complex values to real discards the imaginary part
return np.asarray(x, float)
By applying a lowpass filter we removed the high frequency daily signal while keeping the low frequency signal of the yearly oscillation.
49.3 high-pass filter
cut_off = 0.005
fig, ax = plt.subplots()
ax.plot(k, fft_abs)
ax.set(xlabel="frequency (1/hour)",
ylabel=r"$|F|$",
title=r"highpass")
# ax.axvspan(0, cut_off, color='lightblue', alpha=0.5)
ax.axvspan(cut_off, np.max(k), color='lightblue', alpha=0.5, label='passed frequencies')
ax.legend()
fig, ax = plt.subplots()
ax.plot(x)
condition = np.abs(freq) > cut_off
ax.plot(fft_filter(x, condition) + np.mean(x))
By applying a highpass filter we removed the low frequency signal of the yearly oscillation while keeping the high frequency daily signal
49.4 band-pass filter
low_thresh = 0.005
high_thresh = 0.1
fig, ax = plt.subplots()
ax.plot(k, fft_abs)
ax.set(xlabel="frequency (1/hour)",
ylabel=r"$|F|$",
title=r"bandpass")
ax.axvspan(low_thresh, high_thresh, color='lightblue', alpha=0.5, label='passed frequencies')
ax.legend()
fig, ax = plt.subplots()
ax.plot(x)
condition = (np.abs(freq) > low_thresh) & (np.abs(freq) < high_thresh)
ax.plot(fft_filter(x, condition) + np.mean(x))
By applying a bandpass filter we removed the low frequency signal of the yearly oscillation, and smoothed the daily signal by removing high frequencies. The results are a smooth daily signal without the yearly seasonal component.
49.5 band-stop filter
The fourth common type of filter, alongside low-pass, high-pass, and band-pass filters, is the band-stop filter (also known as a notch filter). A band-stop filter attenuates frequencies within a certain range, allowing frequencies outside of that range to pass through. It’s essentially the opposite of a band-pass filter, which allows frequencies within a certain range to pass while attenuating frequencies outside of that range.
low_thresh = 0.005
high_thresh = 0.1
fig, ax = plt.subplots()
ax.plot(k, fft_abs)
ax.set(xlabel="frequency (1/hour)",
ylabel=r"$|F|$",
title=r"bandstop")
ax.axvspan(0, low_thresh, color='lightblue', alpha=0.5)
ax.axvspan(high_thresh, np.max(k), color='lightblue', alpha=0.5, label='passed frequencies')
ax.legend()
fig, ax = plt.subplots()
ax.plot(x)
condition = ~(np.abs(freq) > low_thresh) & (np.abs(freq) < high_thresh)
ax.plot(fft_filter(x, condition))
By applying a bandstop filter we keep the low frequency signal of the yearly oscillation, and the high frequency noise, but we remove the daily signal by removing the band that corresponds with daily oscillation.