{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Numpy\n", "\n", "Unidata AMS 2021 Student Conference\n", "\n", "---\n", "
\n", "\n", "\n", "### Focuses\n", "* Become familiar with [Numpy](https://numpy.org/learn/)\n", "* Learn basic usage of numpy arrays instead of lists\n", "* Use numpy to organize and manipulate data\n", "* Plot data from a numpy array with [Matplotlib](https://matplotlib.org/)\n", "\n", "\n", "\n", "\n", "### Objectives\n", "1. [Numpy gymnastics](#1.-Numpy-gymnastics)\n", "1. [Working with multidimensional data](#2.-Working-with-multidimensional-data)\n", "1. [Plot random station data](#3.-Plot-random-station-data)\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Numpy gymnastics\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below is a list of integers that is 20 elements long. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "integers = [11, -47, 39, 21, -5, -27, -33, 18, -9, 12, 14, -44, 10, 25, 18, -16, -19, 22, 44, 23]\n", "print(len(integers))\n", "print(type(integers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's say we need to add 20 to every element in this list. How can we do this? \n", "Typically, this would be done with a [for loop](https://www.w3schools.com/python/python_for_loops.asp)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "integers_plus_20 = []  # This will be our new list of integers after we add 20 to each element\n", "for i in integers:  # Loop through each element in our list\n", "    new_integer = i + 20  # Add 20 to the current element\n", "    integers_plus_20.append(new_integer)  # Append the result to our new list\n", "print(integers_plus_20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This took four lines of code plus a print statement to check the result. For a more complicated set of operations, this code can quickly grow, leaving lots of room for error. A faster and easier way to operate on every element at once is to use numpy. \n", "\n", "To use numpy, we can convert our list of integers to a numpy array using the np.array() function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "numpy_array_of_integers = np.array(integers)\n", "print(numpy_array_of_integers)\n", "print(type(numpy_array_of_integers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we are working with a numpy array, let's try to perform the same operation: adding 20 to each element in our numpy array." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "numpy_array_plus_20 = numpy_array_of_integers + 20\n", "print(numpy_array_plus_20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With numpy we need only one line of code plus a print statement to get the same result. The addition is applied to every element at once, without an explicit loop, which is shorter to write and typically much faster than looping over a native list. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. 
Working with multidimensional data\n", "Under this objective, we will see how numpy arrays can help us handle multidimensional data. Below are random values that we will pretend are temperatures from 5 different stations. We can generate random data using numpy." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Below we have 10 random temperature measurements at each of 5 stations\n", "# (np.random.randint includes low but excludes high)\n", "station_1 = np.random.randint(low=60, high=70, size=10)\n", "station_2 = np.random.randint(low=50, high=60, size=10)\n", "station_3 = np.random.randint(low=40, high=50, size=10)\n", "station_4 = np.random.randint(low=30, high=40, size=10)\n", "station_5 = np.random.randint(low=20, high=30, size=10)\n", "\n", "# If we combine all of these stations in a list, we then have a list with 5 elements, each with 10 measurements.\n", "station_data_list = [station_1, station_2, station_3, station_4, station_5]\n", "\n", "# Finally, we convert our list to a numpy array\n", "station_data_np_array = np.array(station_data_list)\n", "\n", "# Print the shape of the station data numpy array\n", "print(np.shape(station_data_np_array))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Perfect! The shape is (5, 10): 5 for the five stations and 10 for the ten measurements at each station. Numpy also makes matrix operations very easy in Python. If we want to swap the rows and columns, we can use the np.transpose() function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_data_np_transpose = np.transpose(station_data_np_array)\n", "# The shape is now (10, 5): 10 measurements by 5 stations\n", "print(np.shape(station_data_np_transpose))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Plot random station data\n", "\n", "We can plot the station data with [matplotlib](https://matplotlib.org/). When plt.plot() is given a two-dimensional array, it draws one line per column, so passing the transposed array of shape (10, 5) gives one line for each of the 5 stations." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.plot(station_data_np_transpose)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "station_labels = ['Station 1', 'Station 2', 'Station 3', 'Station 4', 'Station 5']\n", "# plt.plot returns one line object per column, i.e. one per station\n", "station_lines = plt.plot(station_data_np_transpose)\n", "# Pair each line with its label; loc=1 places the legend in the upper right\n", "plt.legend(station_lines, station_labels, loc=1)\n", "plt.title('Random Station Data')\n", "plt.ylabel('Value')\n", "plt.xlabel('Time')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Congrats!\n", "Nice job! These are the basics of using numpy. If you want more practice, check out [this numpy tutorial](https://www.w3schools.com/python/numpy_intro.asp).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## See also\n", "* [numpy documentation](https://numpy.org/learn/)\n", "* [pandas documentation](https://pandas.pydata.org/)\n", "* [matplotlib documentation](https://matplotlib.org/)" ] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:pyaos-ams-2021]", "language": "python", "name": "conda-env-pyaos-ams-2021-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }