# Climate
### total XP possible: 8,750


Over Fall break I flew back home to New Mexico to hike in the mountains. On the flight I read the book *The Weather Makers: How Man Is Changing the Climate and What It Means for Life on Earth* by Tim Flannery. Curious about New Mexico, I went to the NOAA Climate website and saw this scary plot:




The zigzagging gray line represents the actual yearly average temperatures, and the blue line smooths out the data points. The temperature has increased pretty dramatically since the mid-1970s. As you may recall from our Weather Python Notebook, atmospheric CO2 has been rising steadily. Prior to the mid-1970s we also emitted a lot of particulate matter into the atmosphere. Those particulates reflected sunlight, which mitigated the warming effect of atmospheric CO2. In the 70s many countries regulated particulate emissions because they caused things like acid rain. Without those particulates, the atmospheric CO2 raised the temperatures.

I would like us to duplicate, as closely as possible, this NOAA plot.

I downloaded the data for all 50 states and converted it into a form that can be read by pandas. That file is at


[https://raw.githubusercontent.com/zacharski/data101/master/climate.csv](https://raw.githubusercontent.com/zacharski/data101/master/climate.csv)
 
The format looks like:

state or region | year-month | year | TMAX | TMIN | TAVG
:---: | :---: | :---: | :---: | :---: | :---: 
Alabama|1895-01|1895|52.700000|33.400000|43.100000
Alabama|1895-02|1895|48.100000|26.800000|37.400000
Alabama|1895-03|1895|66.500000|42.400000|54.500000
Alabama|1895-04|1895|75.700000|51.200000|63.400000
Alabama|1895-05|1895|80.600000|58.400000|69.500000
Alabama|1895-06|1895|88.400000|66.500000|77.500000


So for each state (and region) we have the monthly maximum, minimum, and average temperatures for every month from 1895 to the present.
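
To get a feel for this format, here is a minimal sketch that loads three of the rows above from an in-memory string instead of the real file. The comma-separated layout and the header names (`state`, `year-month`, and so on) are my assumptions for illustration, so check the real file's header before relying on them.

```python
import io
import pandas as pd

# Three rows from the table above, written as CSV.
# ASSUMPTION: the real file may use different header names or separators.
sample_csv = """state,year-month,year,TMAX,TMIN,TAVG
Alabama,1895-01,1895,52.7,33.4,43.1
Alabama,1895-02,1895,48.1,26.8,37.4
Alabama,1895-03,1895,66.5,42.4,54.5
"""

sample = pd.read_csv(io.StringIO(sample_csv))

# Turn strings such as '1895-01' into real timestamps (first day of the month).
sample['year-month'] = pd.to_datetime(sample['year-month'], format='%Y-%m')

# Use the timestamps as the index so we can select and resample by date later.
sample = sample.set_index('year-month')

print(sample)
```

With a date index in place, pandas can do date-based selection and resampling, which is what the rest of the notebook relies on.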

# Part A: 3500xp

The first thing we need to do is read the file and set the index to the date.

## 1. Read the file

In [2]:
# read the file and set the index

## 2. Getting info for a particular state

To make our code flexible, let's store the state name in a variable we call `state`

In [3]:
state = 'New Mexico'

and now let's get the info for that state. You can call the variable anything you want, but suppose we call it `stateData`. Be sure to use the variable `state` and not the string `'New Mexico'`.

In [4]:
# get data for our state

## 3. Get the annual averages

Next, let's create a new DataFrame with the average yearly minimum, average, and maximum temperatures

In [5]:
# your work here

## 4. Plot the average yearly maximum temperatures
It should look pretty zig zaggy


In [6]:
# again your work

## 5. Rolling 7 year window
We are going to try to smooth those zig zags out by computing the mean of a rolling 7 year window. Here is what I mean by a rolling window. When I take my weight every morning it goes up and down. Here is the chart for a week:

Date | Weight
:---: | :---:
10/2/2017 | 176.80
10/3/2017 | 176.20
10/4/2017 | 176.00
10/5/2017 | 174.80
10/6/2017 | 173.40
10/7/2017 | 173.80
10/8/2017 | 174.00

As you can see it goes up and down. To smooth things out, I will take the average of my weight over 3 days (a particular day plus the two previous ones). So for 10/4/2017 I average 176.80, 176.20, and 176.00, which gives 176.33.

Date | Weight | Smoothed
:---: | :---: | :---:
10/2/2017 | 176.80 | NaN
10/3/2017 | 176.20 | NaN
10/4/2017 | 176.00 | 176.33
10/5/2017 | 174.80 | 175.67
10/6/2017 | 173.40 | 174.73
10/7/2017 | 173.80 | 174.00
10/8/2017 | 174.00 | 173.73

In the original data the weight seemed to be going up for the last three days, but in the smoothed weight, the weight was consistently going down. As you can also see, if I have a three day window, there will be NaN (not a number) for the first 2 entries.
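
Here is a small sketch of that same 3-day window in pandas, using the week of weights from the table. The variable names `weights` and `smoothed` are just ones I picked for this illustration.

```python
import pandas as pd

# The week of weights from the table above, indexed by date.
weights = pd.Series(
    [176.8, 176.2, 176.0, 174.8, 173.4, 173.8, 174.0],
    index=pd.date_range('2017-10-02', periods=7, freq='D'),
    name='Weight',
)

# Each smoothed value is the mean of that day and the two days before it.
smoothed = weights.rolling(window=3).mean()

# The first two entries are NaN because their windows are incomplete.
print(smoothed.round(2))

# dropna() throws away the entries with incomplete windows.
print(smoothed.dropna().round(2))
```

The same idea carries over to yearly temperatures: only the window size and the data change.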

So, again, create smoothed data by using a 7 year window. With a 7 year window the first 6 entries will be NaN, so that dataframe should not include them.


In [7]:
# Create a dataframe with smoothed data - remove the first 6 entries


## 6. Graph the results

I would like you to produce three plots, one right after the other, that look something like



Do not hard-code 'New Mexico' in the title; use the `state` variable instead.

In [8]:

# put code here

# Part B: Putting it All Together - 1000xp

This should be a copy and paste of the code you wrote above. We are going to put the code in a function called `displayClimate`:

In [14]:
def displayClimate(state):

    # get info for that particular state

    # get the annual averages

    # Plot the average yearly maximum temperatures

    # Rolling 7 year window

    # Graph the results
    print('remove this print statement about %s when you add your code' % state)
 
 

Now let's test the function we wrote:

In [15]:
displayClimate('Virginia')
displayClimate('Arizona')
displayClimate('Maine')

remove this print statement about Virginia when you add your code
remove this print statement about Arizona when you add your code
remove this print statement about Maine when you add your code


# Part C: Hacker Edition extra xp

### 1. Seasonal 750-1500 xp

If you go to the [NOAA State Trend Charts](https://www.ncdc.noaa.gov/temp-and-precip/state-temps/) you will notice the small charts below the main one. So you can click on 'Summer' and see the summer statistics. Can you create a new `displayClimate` function that starts:

    def displayClimate(state, season):
 
 
so if you call

    displayClimate('New Mexico', 'Annual')
 
it will display the annual charts and if you type:

    displayClimate('New Mexico', 'Summer')
 
it will display those for the summer. 

If you support those two options (annual and summer) you will get 750xp; if you support all 5 (annual plus the four seasons) you will get 1,500xp.
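
One possible way to pick out a season is to filter the monthly rows by month number before averaging by year. The sketch below tries this on made-up data. The names `SEASON_MONTHS` and `seasonal_yearly_mean` are hypothetical, and the month groupings (December to February as winter, and so on) and the choice to count December with the following year's winter are my assumptions, so compare against the NOAA charts before trusting them.

```python
import numpy as np
import pandas as pd

# ASSUMPTION: three-month seasons; 'Annual' keeps every month.
SEASON_MONTHS = {
    'Annual': list(range(1, 13)),
    'Winter': [12, 1, 2],
    'Spring': [3, 4, 5],
    'Summer': [6, 7, 8],
    'Fall': [9, 10, 11],
}

def seasonal_yearly_mean(monthly, season):
    """Average the rows of `monthly` that fall in `season`, one row per year.

    `monthly` must have a DatetimeIndex with one row per month.
    """
    months = SEASON_MONTHS[season]
    in_season = monthly[monthly.index.month.isin(months)]
    years = in_season.index.year.to_numpy()
    if season == 'Winter':
        # ASSUMPTION: December belongs to the winter of the following year.
        years = years + (in_season.index.month == 12).astype(int)
    return in_season.groupby(years).mean()

# Made-up data: two years where each month's "temperature" is its month number.
idx = pd.date_range('2000-01-01', periods=24, freq='MS')
demo = pd.DataFrame({'TMAX': np.asarray(idx.month, dtype=float)}, index=idx)

print(seasonal_yearly_mean(demo, 'Summer'))
print(seasonal_yearly_mean(demo, 'Winter'))
```

A `displayClimate(state, season)` function could call a helper like this in place of the plain yearly average and leave the rest of the plotting code unchanged.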

### 2. LOESS - 1500xp
If you read the text above the NOAA chart you will see that they use a statistic called Loess to get the blue line. Seaborn's `regplot` can draw a Lowess curve, which is a variant of Loess (seaborn relies on the statsmodels package for this, so it needs to be installed). Here is an example of its use:


    import matplotlib.pyplot as plt
    import seaborn as sns
    # nmY is my resampled yearly New Mexico data (but not rolling window);
    # it needs 'Year' and 'TMAX' columns for the regplot call below
    # set the y part of the chart to be between 64 and 72
    plt.ylim(64, 72)
    # label the chart
    plt.title('New Mexico Annual High Temperature')
    # generate the Lowess regression plot
    sns.regplot(data=nmY, x='Year', y='TMAX', lowess=True)
    # show the plot
    plt.show()
 
That creates a plot that looks like:


 
Can you create a new function that will display the Lowess charts for minimum, average, and maximum for a state instead of the plot of the rolling window? So, for example,

    displayLowess('Maine')
 
will display the three Lowess plots for Maine.

### 3. Departure from 20th Century Average - 1000xp
If you scroll down toward the bottom of the NOAA webpage, you will see a map of the US displaying how the decade average departs from the 20th century average. The paragraph above the map describes how they compute these numbers. I would like you to **write a function** that takes a state name and displays how much the decades depart from the 20th Century Average. Here is the algorithm I would like you to use:

1. use the non-windowed data (data that is the average yearly temps, max, min, avg).
2. the data should run from 1901 through the end of 2016
3. compute the average (max, min, avg) temps for each decade 
4. subtract the 20th century average from the decade averages
5. print the results
6. extra credit: plot the results 250xp

My results look like this for New Mexico:


Date | TMIN | TAVG | TMAX
:---: | :---: | :---: | :---:
1901-12-31 | 0.542170 | 0.336710 | 0.158764
1911-12-31 | 0.210503 | -0.379957 | -0.966236
1921-12-31 | -0.511164 | -0.956624 | -1.391236
1931-12-31 | -0.171997 | -0.465790 | -0.752069
1941-12-31 | 0.025503 | -0.097457 | -0.225402
1951-12-31 | -0.746997 | -0.210790 | 0.330431
1961-12-31 | -0.362830 | -0.020790 | 0.312098
1971-12-31 | -0.646164 | -0.456624 | -0.267069
1981-12-31 | -0.543664 | -0.364124 | -0.185402
1991-12-31 | -0.017830 | -0.267457 | -0.513736
2001-12-31 | 0.787170 | 0.945043 | 1.095431
2011-12-31 | 1.043003 | 1.290876 | 1.529598
2021-12-31 | 1.760503 | 1.900043 | 2.035431
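
To make steps 1 through 5 of the algorithm concrete, here is a rough sketch run on made-up yearly data with a steady warming trend. The function name `decade_departures` is hypothetical. I am also assuming that the decades run 1901-1910, 1911-1920, and so on, and that the 20th century average covers 1901 through 2000. It groups by a computed decade label rather than resampling by date, so the row labels and bin edges will not match the table above.

```python
import numpy as np
import pandas as pd

def decade_departures(yearly):
    """Decade means minus the 1901-2000 mean.

    `yearly` must be indexed by integer year, one row per year.
    """
    # Step 2: keep 1901 through 2016 (label slicing is inclusive).
    yearly = yearly.loc[1901:2016]
    # ASSUMPTION: the 20th century average covers 1901 through 2000.
    baseline = yearly.loc[1901:2000].mean()
    # Step 3: label each year with the first year of its decade (1901, 1911, ...).
    years = yearly.index.to_numpy()
    decade_start = (years - 1) // 10 * 10 + 1
    decade_means = yearly.groupby(decade_start).mean()
    # Step 4: subtract the baseline column by column.
    return decade_means - baseline

# Step 1 stand-in: made-up yearly averages that warm by 0.02 degrees per year.
years = np.arange(1901, 2017)
trend = 0.02 * (years - 1901)
demo = pd.DataFrame(
    {'TMIN': 40 + trend, 'TAVG': 55 + trend, 'TMAX': 70 + trend},
    index=years,
)

# Step 5: print the results.
departures = decade_departures(demo)
print(departures)
```

Because the made-up data warms steadily, the early decades come out below the baseline and the later ones above it.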
