Assignment: Pandas Groupby with Hurricane Data

Import NumPy, pandas, and Matplotlib, and set the display options.
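A minimal setup sketch might look like the following. The specific option values are assumptions chosen for illustration; the assignment does not prescribe them.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Assumed display settings (not specified by the assignment): adjust to taste.
pd.set_option("display.max_rows", 10)      # truncate long DataFrame printouts
pd.set_option("display.max_columns", 20)   # show up to 20 columns before eliding
plt.rcParams["figure.figsize"] = (12, 6)   # default figure size in inches
```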

Use the following code to load a CSV file of the NOAA IBTrACS hurricane dataset:

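import pandas as pd

# Full IBTrACS v04r00 CSV hosted by NOAA NCEI (a large download).
url = 'https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.ALL.list.v04r00.csv'
df = pd.read_csv(url, parse_dates=['ISO_TIME'], usecols=range(12),  # first 12 columns only
                 skiprows=[1], na_values=[' ', 'NOT_NAMED'],  # the second line of the file holds units, not data
                 # keep_default_na=False stops the basin code 'NA' (North Atlantic) from being read as NaN
                 keep_default_na=False, dtype={'NAME': str})
df.tail()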
                  SID  SEASON  NUMBER  BASIN  SUBBASIN   NAME             ISO_TIME  NATURE      LAT      LON  WMO_WIND  WMO_PRES
716160  2024147N19089    2024      27     NI        BB  REMAL  2024-05-27 06:00:00      NR  23.0325  89.3509       NaN       NaN
716161  2024147N19089    2024      27     NI        BB  REMAL  2024-05-27 09:00:00      NR  23.3337  89.6178       NaN       NaN
716162  2024147N19089    2024      27     NI        BB  REMAL  2024-05-27 12:00:00      NR  23.6263  89.8799       NaN       NaN
716163  2024147N19089    2024      27     NI        BB  REMAL  2024-05-27 15:00:00      NR  23.9143  90.1400       NaN       NaN
716164  2024147N19089    2024      27     NI        BB  REMAL  2024-05-27 18:00:00      NR  24.2000  90.4000       NaN       NaN

Basin Key: (NI - North Indian, SI - South Indian, WP - Western Pacific, SP - Southern Pacific, EP - Eastern Pacific, NA - North Atlantic)

How many rows does this dataset have?

How many North Atlantic hurricanes are in this dataset?
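As a hint for the two questions above, here is a minimal sketch on a tiny made-up table (the `toy` frame and its values are hypothetical, not the real dataset). It assumes that "number of hurricanes" means the number of distinct storm IDs (`SID`), since each storm occupies many rows.

```python
import pandas as pd

# Toy stand-in for the IBTrACS table; values are invented for illustration.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "A2", "B1", "B1", "B1"],
    "BASIN": ["NA", "NA", "NA", "EP", "EP", "EP"],
})

n_rows = len(toy)  # number of rows; equivalent to toy.shape[0]

# Select North Atlantic rows, then count distinct storm IDs among them.
na_storms = toy.loc[toy["BASIN"] == "NA", "SID"].nunique()

print(n_rows, na_storms)
```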

1) Get the unique values of the BASIN, SUBBASIN, and NATURE columns
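One possible approach is `Series.unique()`, sketched here on a hypothetical toy frame with invented values:

```python
import pandas as pd

# Toy data; the category codes below are illustrative only.
toy = pd.DataFrame({
    "BASIN": ["NA", "NA", "EP", "WP"],
    "SUBBASIN": ["GM", "CS", "MM", "MM"],
    "NATURE": ["TS", "TS", "ET", "NR"],
})

# unique() returns the distinct values in order of first appearance.
uniques = {col: toy[col].unique() for col in ["BASIN", "SUBBASIN", "NATURE"]}
for col, values in uniques.items():
    print(col, values)
```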

2) Rename the WMO_WIND and WMO_PRES columns to WIND and PRES
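A minimal sketch using `DataFrame.rename` with a column mapping, shown on a one-row hypothetical frame:

```python
import pandas as pd

# Toy frame with the original column names; values are invented.
toy = pd.DataFrame({"NAME": ["X"], "WMO_WIND": [50.0], "WMO_PRES": [990.0]})

# rename returns a new DataFrame, so assign the result back.
toy = toy.rename(columns={"WMO_WIND": "WIND", "WMO_PRES": "PRES"})
print(toy.columns.tolist())
```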

3) Get the 10 rows in the dataset with the largest WIND values

You will notice that some names are repeated, because each storm appears in many rows (one per observation time).
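The repetition can be seen in a small sketch with `DataFrame.nlargest`. The toy data is hypothetical, and the example takes the top 3 rather than 10 to keep it short.

```python
import pandas as pd

# Toy track data: storm A1 has three observations, B1 has two (one missing wind).
toy = pd.DataFrame({
    "SID": ["A1", "A1", "A1", "B1", "B1"],
    "NAME": ["ALPHA", "ALPHA", "ALPHA", "BETA", "BETA"],
    "WIND": [100.0, 120.0, 115.0, 90.0, None],
})

# Rows with the largest WIND values; all three belong to the same storm.
top3 = toy.nlargest(3, "WIND")
print(top3)
```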

4) Group the data on SID and get the 10 largest hurricanes by WIND
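One way to read this task is "take each storm's peak wind, then rank the storms". A sketch under that assumption, on hypothetical toy data and with the top 2 instead of 10:

```python
import pandas as pd

# Toy track data; values are invented for illustration.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "B1", "B1", "C1"],
    "NAME": ["ALPHA", "ALPHA", "BETA", "BETA", "GAMMA"],
    "WIND": [100.0, 120.0, 90.0, 95.0, 130.0],
})

# Peak wind per storm: one value per SID.
peak = toy.groupby("SID")["WIND"].max()

# The strongest storms, each listed once.
top2 = peak.nlargest(2)
print(top2)
```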

5) Make a bar chart of the wind speed of the 20 strongest-wind hurricanes

Use the name on the x-axis.
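To put names on the x-axis, the per-storm aggregation needs to carry the name along with the peak wind. A hedged sketch on hypothetical toy data (top 2 instead of 20); the `Agg` backend line is an assumption so the sketch runs as a plain script, and should be omitted in a notebook:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless script; omit this line in a notebook
import pandas as pd

# Toy track data; values are invented for illustration.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "B1", "B1", "C1"],
    "NAME": ["ALPHA", "ALPHA", "BETA", "BETA", "GAMMA"],
    "WIND": [100.0, 120.0, 90.0, 95.0, 130.0],
})

# Named aggregation: keep each storm's name and its peak wind.
peak = toy.groupby("SID").agg(NAME=("NAME", "first"), WIND=("WIND", "max"))
top = peak.nlargest(2, "WIND")

ax = top.plot.bar(x="NAME", y="WIND", legend=False)
ax.set_ylabel("Peak WIND")
```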

6) Plot the count of all datapoints by Basin

Use a bar chart.
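Counting rows per group can be sketched with `groupby(...).size()`; `value_counts()` on the column would be an alternative. The toy data is hypothetical, and the `Agg` backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless script; omit this line in a notebook
import pandas as pd

# Toy data: three North Atlantic rows, two Eastern Pacific rows.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "A2", "B1", "B1"],
    "BASIN": ["NA", "NA", "NA", "EP", "EP"],
})

# size() counts every row in each group, including rows with missing values.
counts = toy.groupby("BASIN").size()
ax = counts.plot.bar()
ax.set_ylabel("Number of datapoints")
```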

7) Plot the count of unique hurricanes by Basin

Use a bar chart.
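Here the quantity of interest is distinct storms rather than rows, so one option is `nunique()` on the storm ID within each basin. A sketch on hypothetical toy data (the `Agg` backend line is an assumption for running outside a notebook):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless script; omit this line in a notebook
import pandas as pd

# Toy data: two distinct storms in NA, one in EP.
toy = pd.DataFrame({
    "SID": ["A1", "A1", "A2", "B1", "B1"],
    "BASIN": ["NA", "NA", "NA", "EP", "EP"],
})

# Distinct SIDs per basin, regardless of how many rows each storm has.
storms = toy.groupby("BASIN")["SID"].nunique()
ax = storms.plot.bar()
ax.set_ylabel("Number of unique storms")
```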

8) Make a hexbin of the location of datapoints in Latitude and Longitude

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.hexbin.html
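A minimal sketch of `DataFrame.plot.hexbin` on randomly generated positions. The points are synthetic, and the `gridsize` and `cmap` values are arbitrary choices, not requirements; the `Agg` backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless script; omit this line in a notebook
import numpy as np
import pandas as pd

# Synthetic positions, uniformly scattered; not real storm locations.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "LON": rng.uniform(-180, 180, 500),
    "LAT": rng.uniform(-60, 60, 500),
})

# Each hexagon is colored by the number of points that fall inside it.
ax = toy.plot.hexbin(x="LON", y="LAT", gridsize=20, cmap="viridis")
```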

9) Find Hurricane Katrina (from 2005) and plot its track as a scatter plot

First find the SID of this hurricane.

Next get this hurricane’s group and plot its position as a scatter plot. Use wind speed to color the points.
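The two steps can be sketched as follows on a hypothetical toy table, in which the same name is deliberately reused in two seasons to show why filtering on the year matters. All positions and winds are invented, and the `Agg` backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless script; omit this line in a notebook
import pandas as pd

# Toy data: two different storms share a name but differ in SID and SEASON.
toy = pd.DataFrame({
    "SID": ["S1", "S1", "S1", "S2", "S2"],
    "SEASON": [2005, 2005, 2005, 1999, 1999],
    "NAME": ["KATRINA"] * 5,
    "LAT": [23.0, 25.0, 28.0, 11.0, 12.0],
    "LON": [-75.0, -80.0, -89.0, -82.0, -83.0],
    "WIND": [30.0, 70.0, 150.0, 30.0, 35.0],
})

# Step 1: find the storm ID by name and season.
match = toy[(toy["NAME"] == "KATRINA") & (toy["SEASON"] == 2005)]
sid = match["SID"].unique()[0]

# Step 2: pull that storm's group and plot its track, colored by wind speed.
storm = toy.groupby("SID").get_group(sid)
ax = storm.plot.scatter(x="LON", y="LAT", c="WIND", cmap="viridis")
```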

10) Make time the index on your dataframe
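A minimal sketch with `DataFrame.set_index`, assuming the time column is `ISO_TIME` as in the loaded data (the toy values are invented):

```python
import pandas as pd

# Toy frame with a datetime column, mirroring the parsed ISO_TIME column.
toy = pd.DataFrame({
    "ISO_TIME": pd.to_datetime(["2005-08-23 18:00", "2005-08-24 00:00"]),
    "WIND": [30.0, 35.0],
})

# set_index returns a new DataFrame, so assign the result back.
toy = toy.set_index("ISO_TIME")
print(type(toy.index).__name__)
```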

11) Plot the count of all datapoints per year as a timeseries

Use resample: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html
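A hedged sketch of yearly resampling on a toy time-indexed frame. The data is invented, and the `"YS"` (year-start) frequency alias is one choice among several; the `Agg` backend line is an assumption for running outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless script; omit this line in a notebook
import pandas as pd

# Toy observations: two in 2004, three in 2005.
times = pd.to_datetime([
    "2004-09-01", "2004-09-02",
    "2005-08-23", "2005-08-24", "2005-08-25",
])
toy = pd.DataFrame({"WIND": [50.0, 55.0, 30.0, 35.0, 40.0]}, index=times)

# Bin rows by calendar year; size() counts all rows, including missing values.
per_year = toy["WIND"].resample("YS").size()
ax = per_year.plot()
ax.set_ylabel("Datapoints per year")
```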

Which years stand out as having anomalous hurricane activity?