9  practice

Let’s try to answer the following questions:

First we have to divide temperature data by month, and then take the average for each month.

a possible solution
df_month = df['temperature'].resample('M').mean()

This is a bit trickier.

  1. We need to find the maximum/minimum temperature for each day.
  2. Only then do we split the daily data by month and take the average.
a possible solution
# step 1: maximum temperature of each day
daily_max_temp = df['temperature'].resample('D').max()
# step 2: monthly average of the daily maxima
monthly_mean_max_temp = daily_max_temp.resample('MS').mean()
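The same two-step recipe covers the minimum as well. Here is a minimal sketch on synthetic data; the hourly series `temp` is made up for illustration and is not the station data used above.

```python
import numpy as np
import pandas as pd

# Synthetic hourly temperatures for two months (hypothetical data):
# a daily sine wave that peaks at 15:00 and bottoms out at 03:00.
idx = pd.date_range('2019-01-01', '2019-02-28 23:00', freq='h')
hours = idx.hour.to_numpy()
temp = pd.Series(15 + 5 * np.sin(2 * np.pi * (hours - 9) / 24), index=idx)

# Step 1: daily maximum and minimum, one column each.
daily_extremes = temp.resample('D').agg(['max', 'min'])

# Step 2: monthly mean of the daily extremes.
monthly_extremes = daily_extremes.resample('MS').mean()
print(monthly_extremes)
```

Using `agg(['max', 'min'])` keeps both statistics in one DataFrame, so a single monthly resample handles them together.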
  1. We need to filter our data to contain only night times.
  2. We need to divide temperature data by seasons (3 months), and then take the mean for each season.
a possible solution
# filter only night data
df_night = df.loc[((df.index.hour < 6) | (df.index.hour >= 18))]
season_average_night_temp = df_night['temperature'].resample('Q').mean()
another possible solution
# filter using between_time
df_night = df.between_time('18:00', '06:00', inclusive='left')
season_average_night_temp = df_night['temperature'].resample('Q').mean()
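If you want to convince yourself that the two night filters select the same rows, a quick check on synthetic data could look like the sketch below. The frame `df_demo` is hypothetical, and the `inclusive` argument of `between_time` assumes a reasonably recent pandas (1.4 or later).

```python
import numpy as np
import pandas as pd

# Synthetic 10-minute series covering three days (hypothetical data).
idx = pd.date_range('2019-01-01', '2019-01-03 23:50', freq='10min')
df_demo = pd.DataFrame({'temperature': np.arange(len(idx), dtype=float)}, index=idx)

# Filter 1: boolean mask on the hour of the day.
night_mask = df_demo.loc[(df_demo.index.hour < 6) | (df_demo.index.hour >= 18)]

# Filter 2: between_time; inclusive='left' keeps 18:00 and drops 06:00.
night_between = df_demo.between_time('18:00', '06:00', inclusive='left')

# Both filters should keep exactly the same timestamps.
same_rows = night_mask.index.equals(night_between.index)
print(same_rows)
```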

First we have to divide rain data by day, and then take the sum for each day.

a possible solution
daily_precipitation = df['rain'].resample('D').sum()

We have to divide rain data by month, and then sum the totals of each month.

a possible solution
monthly_precipitation = df['rain'].resample('M').sum()
  1. We need to sum rain by day.
  2. We need to count how many days there are in each month where rain > 0.
a possible solution
daily_precipitation = df['rain'].resample('D').sum()
only_rainy_days = daily_precipitation.loc[daily_precipitation > 0]
rain_days_per_month = only_rainy_days.resample('M').count()
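One thing to keep in mind: when we filter first and count afterwards, months before the first rainy day or after the last one do not appear in the result. A possible alternative, sketched here on made-up daily totals, is to sum the boolean series `daily_precipitation > 0`, so that dry months show up as 0.

```python
import pandas as pd

# Synthetic daily rain totals for Jan-Apr 2019 (hypothetical values, in mm).
days = pd.date_range('2019-01-01', '2019-04-30', freq='D')
daily_precipitation = pd.Series(0.0, index=days)
rainy = pd.to_datetime(['2019-01-03', '2019-01-04', '2019-02-10'])
daily_precipitation.loc[rainy] = [5.0, 1.2, 0.4]

# True on rainy days; summing booleans counts them, and dry months stay in as 0.
rain_days_per_month = (daily_precipitation > 0).resample('MS').sum()
print(rain_days_per_month)
```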
  1. We need to divide our data into two: rainy_season_1 and rainy_season_2.
  2. We need to find the time of the last rain in rainy_season_1.
  3. We need to find the time of the first rain in rainy_season_2.
  4. We need to compute the time difference between the two dates.
a possible solution
split_date = '2019-08-01'
rainy_season_1 = df[:split_date]  # everything up to and including the split date
rainy_season_2 = df[split_date:]  # everything from the split date onward
malkosh = rainy_season_1['rain'].loc[rainy_season_1['rain'] > 0].last_valid_index()
yoreh = rainy_season_2['rain'].loc[rainy_season_2['rain'] > 0].first_valid_index()
dry_period = yoreh - malkosh
# extracting days, hours, and minutes
days = dry_period.days
hours = dry_period.components.hours
minutes = dry_period.components.minutes
print(f'The dry period of 2019 was {days} days, {hours} hours and {minutes} minutes.')
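To see the four steps at work on numbers we can check by hand, here is a small sketch with a synthetic rain series. The timestamps and amounts are invented for illustration; only the method mirrors the solution above.

```python
import pandas as pd

# Synthetic 10-minute rain series for 2019 (hypothetical values, in mm).
idx = pd.date_range('2019-01-01', '2019-12-31 23:50', freq='10min')
rain = pd.Series(0.0, index=idx)
rain_times = pd.to_datetime(['2019-01-10 08:00', '2019-05-15 14:30',
                             '2019-10-20 03:10', '2019-12-01 00:00'])
rain.loc[rain_times] = [1.0, 0.5, 2.0, 1.0]

# Split at the start of August; each half holds one rainy season.
season_1 = rain.loc[:'2019-07-31']
season_2 = rain.loc['2019-08-01':]

malkosh = season_1[season_1 > 0].index[-1]  # last rain of the first season
yoreh = season_2[season_2 > 0].index[0]     # first rain of the second season
dry_period = yoreh - malkosh

# Timedelta.components splits the difference into days, hours, minutes, ...
parts = dry_period.components
print(f'{parts.days} days, {parts.hours} hours and {parts.minutes} minutes')
```

Here the last rain of the first season falls on May 15 at 14:30 and the first rain of the second season on October 20 at 03:10, which gives 157 days, 12 hours and 40 minutes.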
  1. We need to filter our data to contain only morning times.
  2. We need to sum rain by day.
  3. We need to find the day with the maximum value.
a possible solution
# filter to only day data
morning_df = df.loc[((df.index.hour >= 6) & (df.index.hour < 18))]
morning_rain = morning_df['rain'].resample('D').sum()
rainiest_morning = morning_rain.idxmax()
# plot
morning_rain.plot()
plt.axvline(rainiest_morning, c='r', alpha=0.5, linestyle='--')
bonus solution
# filter to only night data
df_night = df.loc[((df.index.hour < 6) | (df.index.hour >= 18))]
# resampling night for each day is tricky because the date changes at midnight. We can do this trick:
# we shift the time back by 6 hours so all the data for the same night will have the same date.
# (tshift was removed in pandas 2.0; shift with freq does the same job)
df_shifted = df_night.shift(-6, freq='h')
night_rain = df_shifted['rain'].resample('D').sum()
rainiest_night = night_rain.idxmax()
# plot
night_rain.plot()
plt.axvline(rainiest_night, c='r', alpha=0.5, linestyle='--')
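To see why the 6-hour shift matters, here is a tiny sketch with three invented rain readings, two of them from the same night on either side of midnight. The values are hypothetical; the point is only how the daily sums change.

```python
import pandas as pd

# Two readings from the night of Jan 5, plus one from the evening of Jan 6
# (hypothetical values, in mm).
times = pd.to_datetime(['2019-01-05 23:00', '2019-01-06 02:00', '2019-01-06 19:00'])
night_rain_raw = pd.Series([1.0, 2.0, 0.5], index=times)

# Without the shift, the first night is split across two calendar days.
unshifted = night_rain_raw.resample('D').sum()

# Shifting back 6 hours maps 18:00-06:00 onto 12:00-24:00 of the evening's date.
shifted = night_rain_raw.shift(-6, freq='h').resample('D').sum()

print(unshifted)
print(shifted)
```

After the shift, both readings from the first night are summed under January 5, so `idxmax` points at the evening on which the rainiest night began.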

Note: this whole webpage is actually a Jupyter Notebook rendered as HTML. If you want to know how to make interactive graphs, go to the top of the page and click on “Code”.