import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from calendar import month_abbr
import seaborn as sns
set(style="ticks", font_scale=1.5)
sns.import urllib.request
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
6 Exercises
Import relevant packages
6.1 intra-annual variability
Go to NOAA’s National Centers for Environmental Information (NCEI)
Climate Data Online: Dataset Discovery
Find station codes in this map. On the left, click on the little wrench (🔧) next to “Global Summary of the Month”, then click on “identify” on the panel that just opened, and click on a station (purple circle). You will see the station’s name, it’s ID, and the period of record. For example, for Ben-Gurion’s Airport in Israel:
BEN GURION, IS
STATION ID: ISM00040180
Period of Record: 1951-01-01 to 2020-03-01
You can download daily or monthly data for each station. Use the function below to download this data to your computer.
def download_data(station_name, station_code):
= 'https://www.ncei.noaa.gov/data/global-historical-climatology-network-daily/access/'
url_daily = 'https://www.ncei.noaa.gov/data/gsom/access/'
url_monthly # download daily data - uncomment the next 2 lines to make this work
# urllib.request.urlretrieve(url_daily + station_code + '.csv',
# station_name + '_daily.csv')
# download monthly data
+ station_code + '.csv',
urllib.request.urlretrieve(url_monthly + '_monthly.csv') station_name
Now, choose any station with a period of record longer than 30 years, and download its data:
'BEN_GURION', 'ISM00040180') download_data(
Load the data into a datafram, and before you continue with the analysis, plot the rainfall data, to see how it looks like.
'BEN_GURION', 'ISM00040180')
download_data(= pd.read_csv('BEN_GURION_monthly.csv', sep=",")
df # make 'DATE' the dataframe index
'DATE'] = pd.to_datetime(df['DATE'])
df[= df.set_index('DATE')
df 'PRCP']) plt.plot(df[
It doesn’t look great for Ben-Gurion airport, lots of missing data! You might need to choose another station… Download data for Beer Sheva, ID IS000051690
.
'BEER_SHEVA', 'IS000051690')
download_data(= pd.read_csv('BEER_SHEVA_monthly.csv', sep=",")
df # make 'DATE' the dataframe index
'DATE'] = pd.to_datetime(df['DATE'])
df[= df.set_index('DATE')
df 'PRCP']) plt.plot(df[
That’s much better! We need to aggregate all data from each month, so we can calculate monthly averages. How to do that?
# choose only the precipitation column
= df['PRCP']
df_month # calculate monthly mean
= np.array([]) # empty array
monthly_mean = np.arange(1,13)
month_numbers = [month_abbr[i] for i in month_numbers]
month_names
for m in month_numbers: # cycle over months (1, 2, 3, etc)
= (df_month.index.month == m) # indices in df_month belonging to month m
this_month_all_indices = df_month[this_month_all_indices].mean() # this is the monthly mean
this_month_mean = np.append(monthly_mean, this_month_mean) # append monthly_mean
Now it is time to create a new dataframe with the monthly means.
= pd.DataFrame({'monthly rainfall (mm)':monthly_mean,
df_beersheva 'month names':month_names,
'month number':month_numbers
}) df_beersheva
monthly rainfall (mm) | month names | month number | |
---|---|---|---|
0 | 48.743158 | Jan | 1 |
1 | 37.347368 | Feb | 2 |
2 | 26.551579 | Mar | 3 |
3 | 9.038947 | Apr | 4 |
4 | 2.735789 | May | 5 |
5 | 0.013830 | Jun | 6 |
6 | 0.000000 | Jul | 7 |
7 | 0.002128 | Aug | 8 |
8 | 0.271277 | Sep | 9 |
9 | 6.669474 | Oct | 10 |
10 | 21.850526 | Nov | 11 |
11 | 41.786316 | Dec | 12 |
Plot the data and see if it makes sense. Try to get a figure like this one.
= plt.subplots(figsize=(10,7))
fig, ax 'month number'], df_beersheva['monthly rainfall (mm)'])
ax.bar(df_beersheva[set(xlabel="months",
ax.="monthly average (mm)",
ylabel="Beer Sheva",
title=df_beersheva['month number'],
xticks=df_beersheva['month names']);
xticklabels"hydrology_figures/beersheva_monthly_average.png") plt.savefig(
Let’s calculate now the Walsh and Lawler Seasonality Index (leddris (2010), Walsh and Lawler (1981)) Write a function that receives a dataframe like the one we have just created, and returns the seasonality index.
\(R=\) mean annual precipitation
\(m_i\) precipitation mean for month \(i\)
\[ SI = \displaystyle \frac{1}{R} \sum_{n=1}^{n=12} \left| m_i - \frac{R}{12} \right| \]
SI | Precipitation Regime |
---|---|
<0.19 | Precipitation spread throughout the year |
0.20-0.39 | Precipitation spread throughout the year, but with a definite wetter season |
0.40-0.59 | Rather seasonal with a short dry season |
0.60-0.79 | Seasonal |
0.80-0.99 | Marked seasonal with a long dry season |
1.00-1.19 | Most precipitation in < 3 months |
def walsh_index(df):
= df["monthly rainfall (mm)"]
mi = df["monthly rainfall (mm)"].sum()
R = np.sum(np.abs(mi - R/12)) / R
SI return SI
= walsh_index(df_beersheva)
beersheva_SI print(f"Beer Sheva, SI = {beersheva_SI:.2f}")
Beer Sheva, SI = 0.97