Hacker Challenges

\n", "\n", "The following requires some mental calisthenics.\n", "\n", "### Part 1.\n", "We would like to see a yearly plot of the number of days at 90 or over for Fredericksburg, so the x axis would be the years 2010 to 2017. Which year had the most days at 90 or over?" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Part 2. Even more challenging.\n", "We would like to see a plot similar to the one in Part 1, but showing data for both Fredericksburg and Las Cruces. That is, for each year we see the number of days at 90 or over for Las Cruces and the number of days at 90 or over for Fredericksburg.\n", "\n", "\n", "Here is a bit of a hint. (This was my approach - you might have a different, better one.) I had two Pandas Series. One, cc, was the number of days at 90 or higher for Las Cruces. It looked like this:\n", "\n", "    cc.head()\n", "    \n", "    DATE\n", "    2000-12-31    142\n", "    2001-12-31    121\n", "    2002-12-31    117\n", "    2003-12-31    117\n", "    2004-12-31    106\n", "\n", "I had a similar one, ff, for Fredericksburg. 
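One possible way to build a Series like cc is sketched here. The temperatures are made-up values and the name tmax is an assumption, not something taken from the weather files. Note that this version is indexed by plain year numbers rather than by year-end dates.

```python
import pandas as pd

# Hypothetical daily maximum temperatures; real values would come from the weather data.
dates = pd.to_datetime(['2010-06-01', '2010-07-01', '2010-12-01',
                        '2011-06-01', '2011-07-01', '2011-08-01', '2011-08-02'])
tmax = pd.Series([95, 90, 40, 89, 91, 101, 99], index=dates)

# True where the day reached 90 or more; summing booleans counts the True values.
hot = tmax >= 90

# Group the daily flags by calendar year and count the hot days in each year.
hot_days_per_year = hot.groupby(hot.index.year).sum()
print(hot_days_per_year)
```

Grouping on the year of the index is only one option; resampling by year would give date-stamped labels like the ones shown for cc.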
Then I combined them into one DataFrame by:\n", "\n", "    combined = pd.DataFrame({'Fredericksburg': ff, 'Las Cruces': cc})\n", "    \n", "    combined.head()\n", "    \n", "    \tFredericksburg\tLas Cruces\n", "    DATE\t\t\n", "    2000-12-31\t24\t142\n", "    2001-12-31\t25\t121\n", "    2002-12-31\t57\t117\n", "    2003-12-31\t26\t117\n", "    2004-12-31\t24\t106\n", "    \n", "    \n", "After that, the plotting was easy.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#TBD" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The average max weekly temperatures of Fredericksburg in 2016\n", "\n", "What we mean: suppose one week has these daily maximum temperatures.\n", "\n", " | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday\n", " ---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: \n", " Max Temp | 82 | 84 | 82 | 75 | 77 | 87 | 89\n", "\n", "The average max weekly temperature for that week is the mean of the seven daily maximums:\n", "\n", "$$avgMaxWeekly = \\frac{82 + 84 + 82 + 75 + 77 + 87 + 89}{7} = \\frac{576}{7} \\approx 82.2857$$\n", "\n", "We would like to see a plot of this value for every week of the year:\n", " " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "#TBD\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
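As a quick check of the weekly-average arithmetic above, here is a minimal sketch that uses the seven values from the table. The Series name tmax and the chosen dates are assumptions for illustration only; they are not taken from the weather files.

```python
import pandas as pd

# One Monday-to-Sunday week of daily maximum temperatures, using the values
# from the table above (2016-01-04 was a Monday). The dates are hypothetical.
dates = pd.date_range('2016-01-04', periods=7, freq='D')
tmax = pd.Series([82, 84, 82, 75, 77, 87, 89], index=dates, name='TMAX')

# 'W' groups the days into weeks that end on Sunday; mean() averages each week.
weekly = tmax.resample('W').mean()
print(weekly.round(4))
```

With a full year of daily maximums in place of the seven sample values, the same resample call would produce one point per week, which is the shape of data the requested plot needs.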

Hacker Challenge

\n", "\n", "Can you do the same (the average max weekly temperature plot) for both Fredericksburg and Las Cruces in one plot?" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#TBD\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The total annual precipitation amounts for both Fredericksburg and Las Cruces\n", "A plot showing the amounts for each year from 2010 through 2017 (that is, 2010, 2011, 2012, and so on)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A non-plot question.\n", "What are the average yearly precipitation amounts for Fredericksburg and Las Cruces?" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# TBD" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Climate Change: Atmospheric Carbon Dioxide\n", "Before the industrial revolution, atmospheric carbon dioxide was about 280 ppm (parts per million). When measurements of its concentration began at Mauna Loa, Hawaii, in 1958, the concentration was about 315 ppm.\n", "\n", "The data from this location is in the CSV file:\n", "\n", "[co2_mm_mlo.csv](https://raw.githubusercontent.com/zacharski/data101/master/co2_mm_mlo.csv)\n", " \n", "\n", "The following information from the original dataset is important:\n", "\n", "> Data from March 1958 through April 1974 have been obtained by C. David Keeling\n", "> of the Scripps Institution of Oceanography (SIO) and were obtained from the\n", "> Scripps website (scrippsco2.ucsd.edu).\n", ">\n", "> The \"average\" column contains the monthly mean CO2 mole fraction determined\n", "> from daily averages. The mole fraction of CO2, expressed as parts per million\n", "> (ppm) is the number of molecules of CO2 in every one million molecules of dried\n", "> air (water vapor removed). 
If there are missing days concentrated either early\n", "> or late in the month, the monthly mean is corrected to the middle of the month\n", "> using the average seasonal cycle. Missing months are denoted by -99.99.\n", "> The \"interpolated\" column includes average values from the preceding column\n", "> and interpolated values where data are missing. Interpolated values are\n", "> computed in two steps. First, we compute for each month the average seasonal\n", "> cycle in a 7-year window around each monthly value. In this way the seasonal\n", "> cycle is allowed to change slowly over time. We then determine the \"trend\"\n", "> value for each month by removing the seasonal cycle; this result is shown in\n", "> the \"trend\" column. Trend values are linearly interpolated for missing months.\n", "> The interpolated monthly mean is then the sum of the average seasonal cycle\n", "> value and the trend value for the missing month.\n", ">\n", "> NOTE: In general, the data presented for the last year are subject to change, \n", "> depending on recalibration of the reference gas mixtures used, and other quality\n", "> control procedures. Occasionally, earlier years may also be changed for the same\n", "> reasons. 
Usually these changes are minor.\n", ">\n", "> CO2 expressed as a mole fraction in dry air, micromol/mol, abbreviated as ppm\n", ">\n", "> (-99.99 missing data; -1 no data for >daily means in month)\n", "\n", "**Please give a monthly plot of the atmospheric carbon dioxide (extra XP for making a pretty plot™).**\n", "\n", "### Hint:\n", "\n", "The data has separate year and month columns:\n", "\n", "\n", "year | month | decimal_date | average | interpolated | trend | days\n", ":---: | :---: | :---: | :---: | :---: | :---: | :---: \n", "1958 | 3 | 1958.208 | 315.71 | 315.71 | 314.62 | -1\n", "1958 | 4 | 1958.292 | 317.45 | 317.45 | 315.29 | -1\n", "1958 | 5 | 1958.375 | 317.50 | 317.50 | 314.71 | -1\n", "\n", "Let's say you wanted to combine the year and month to create a Pandas Series with entries like '1958-03' and so on.\n", "\n", "If our original Pandas DataFrame is called carbon, we can create a Series called date_string by executing:\n", "\n", "\n", "    date_string = carbon['year'].astype(str) + '-' + carbon['month'].apply(lambda x: \"%02i\" % x)\n", "    \n", "For more of a hint, see the DataCamp page *Cleaning and tidying datetime data*.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Now we would like to see a plot of the average atmospheric carbon dioxide for every 5-year period.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of those plots looked saw-toothed, leading us to wonder whether some months of the year have lower atmospheric carbon dioxide than others. For example, maybe it is low during the winter months. Can you come up with a plot that will help us answer this question?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Before We Start\n", "\n", "Suppose we have the small DataFrame" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Name</th><th>final</th><th>midterm</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Ann</td><td>89</td><td>87</td></tr>\n",
"    <tr><th>1</th><td>Ben</td><td>81</td><td>75</td></tr>\n",
"    <tr><th>2</th><td>Clara</td><td>99</td><td>97</td></tr>\n",
"    <tr><th>3</th><td>Dora</td><td>95</td><td>81</td></tr>\n",
"    <tr><th>4</th><td>Enric</td><td>60</td><td>65</td></tr>\n",
"    <tr><th>5</th><td>Fred</td><td>93</td><td>91</td></tr>\n",
"    <tr><th>6</th><td>Ginny</td><td>87</td><td>85</td></tr>\n",
"    <tr><th>7</th><td>Hannah</td><td>99</td><td>96</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " Name final midterm\n", "0 Ann 89 87\n", "1 Ben 81 75\n", "2 Clara 99 97\n", "3 Dora 95 81\n", "4 Enric 60 65\n", "5 Fred 93 91\n", "6 Ginny 87 85\n", "7 Hannah 99 96" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "names = ['Ann', 'Ben', 'Clara', \"Dora\", 'Enric', 'Fred', 'Ginny', 'Hannah']\n", "midtermGrades = [87, 75, 97, 81, 65, 91, 85, 96]\n", "finalGrades = [89, 81, 99, 95, 60, 93, 87, 99]\n", "grades = pd.DataFrame({'Name': names, 'midterm': midtermGrades, 'final': finalGrades})\n", "grades\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can sort the data by the values in the final column by:\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Name</th><th>final</th><th>midterm</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>Clara</td><td>99</td><td>97</td></tr>\n",
"    <tr><th>7</th><td>Hannah</td><td>99</td><td>96</td></tr>\n",
"    <tr><th>3</th><td>Dora</td><td>95</td><td>81</td></tr>\n",
"    <tr><th>5</th><td>Fred</td><td>93</td><td>91</td></tr>\n",
"    <tr><th>0</th><td>Ann</td><td>89</td><td>87</td></tr>\n",
"    <tr><th>6</th><td>Ginny</td><td>87</td><td>85</td></tr>\n",
"    <tr><th>1</th><td>Ben</td><td>81</td><td>75</td></tr>\n",
"    <tr><th>4</th><td>Enric</td><td>60</td><td>65</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " Name final midterm\n", "2 Clara 99 97\n", "7 Hannah 99 96\n", "3 Dora 95 81\n", "5 Fred 93 91\n", "0 Ann 89 87\n", "6 Ginny 87 85\n", "1 Ben 81 75\n", "4 Enric 60 65" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gradesSorted = grades.sort_values('final', ascending=False)\n", "gradesSorted" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, if we want, we can make a new dataframe of the top 3 students:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>Name</th><th>final</th><th>midterm</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>Clara</td><td>99</td><td>97</td></tr>\n",
"    <tr><th>7</th><td>Hannah</td><td>99</td><td>96</td></tr>\n",
"    <tr><th>3</th><td>Dora</td><td>95</td><td>81</td></tr>\n",
"    <tr><th>5</th><td>Fred</td><td>93</td><td>91</td></tr>\n",
"  </tbody>\n",
"</table>