{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Scientific Python\n", "\n", "Python has a large number of tools available for doing data science, sometimes referred to as 'scientific Python'. \n", "\n", "The scientific Python ecosystem revolves around a set of core modules, including:\n", "\n", "- `scipy`\n", "- `numpy`\n", "- `pandas`\n", "- `matplotlib`\n", "- `scikit-learn`\n", "\n", "Here we will explore the basics of these modules and what they do, starting with scipy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Scipy is an 'ecosystem', including a collection of open-source packages for scientific computing in Python.\n", "
\n", "\n", "
\n", "The scipy organization website is \n", "here,\n", "including a \n", "description\n", "of the 'ecosystem', materials for \n", "getting started, \n", "and extensive \n", "tutorials.\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# You can import the full scipy package, typically shortened to 'sp'\n", "import scipy as sp\n", "\n", "# However, it is perhaps more common to import particular submodules\n", "# For example, let's import the stats submodule\n", "import scipy.stats as sts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Scipy has a broad range of functionality.\n", "\n", "For a simple / random example, let's use it's stats module to model flipping a coin with [Bernouilli Distribution](https://en.wikipedia.org/wiki/Bernoulli_distribution), which is a distribution that can model a random variable that can be either 0 (call it Tails) or 1 (call it Heads). " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's model a fair coin - with 0.5 probability of being Heads\n", "sts.bernoulli.rvs(0.5)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The first ten coin flips are: [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]\n", "The percent of heads from this sample is: 59.0 %\n" ] } ], "source": [ "# Let's flip a bunch of coins!\n", "coin_flips = [sts.bernoulli.rvs(0.5) for ind in range(100)]\n", "print('The first ten coin flips are: ', coin_flips[:10])\n", "print('The percent of heads from this sample is: ', sum(coin_flips) / len(coin_flips) * 100, '%')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Numpy contains an array object (for multi-dimensional data, typically of uniform type), and operations for linear algebra and analysis on arrays. \n", "
\n", "\n", "
\n", "The numpy website is\n", "here,\n", "including their official \n", "quickstart tutorial.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Note: \n", "An array is a 'a systematic arrangement of similar objects, usually in rows and columns' (definition from [Wikipedia](https://en.wikipedia.org/wiki/Array))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Numpy is standardly imported as 'np'\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Numpy's specialty is linear algebra and arrays of (uniform) data \n", "\n", "# Define some arrays\n", "# Arrays can have different types, but all the data within an array needs to be the same type\n", "arr_1 = np.array([1, 2, 3])\n", "arr_2 = np.array([4, 5, 6])\n", "bool_arr = np.array([True, False, True])\n", "str_arr = np.array(['a', 'b', 'c'])" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n" ] } ], "source": [ "# Note that if you try to make a mixed-data-type array, numpy won't fail, \n", "# but it will (silently)\n", "arr = np.array([1, 'b', True])\n", "\n", "# Check the type of array items\n", "print(type(arr[0]))\n", "print(type(arr[2]))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True\n", "False\n" ] } ], "source": [ "# These array will therefore not act like you might expect\n", "# The last item looks like a Boolen\n", "print(arr[2])\n", "\n", "# However, since it's actually a string, it won't evaluate like a Boolean\n", "print(arr[2] == True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "For more practice with numpy, check out the collection \n", "numpy exercises.\n", "
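
Because every item in an array shares one type, numpy can apply arithmetic to whole arrays at once, and the `dtype` attribute reports the type that was chosen. The following is a small sketch that redefines the arrays from the cells above so it runs on its own; the names `total`, `scaled`, `dot`, and `mixed` are illustrative.

```python
import numpy as np

arr_1 = np.array([1, 2, 3])
arr_2 = np.array([4, 5, 6])

# Arithmetic on arrays is applied element by element
total = arr_1 + arr_2        # array([5, 7, 9])
scaled = arr_1 * 2           # array([2, 4, 6])

# Linear algebra operations are also available, such as the dot product
dot = np.dot(arr_1, arr_2)   # 1*4 + 2*5 + 3*6 = 32

# The dtype attribute shows the single type shared by all items
print(arr_1.dtype)

# A mixed-type array is coerced to strings, which dtype makes visible
mixed = np.array([1, 'b', True])
print(mixed.dtype.kind)      # 'U' marks a unicode string array
print(mixed[2] == 'True')    # the last item is the string 'True'
```

Checking `dtype` right after creating an array is a quick way to catch this kind of silent conversion.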
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Pandas is a package for organizing data in data structures, and performing data analysis upon them.\n", "
\n", "\n", "
\n", "The official pandas website is \n", "here,\n", "including materials such as \n", "10 minutes to pandas\n", "and a tutorial on \n", "essential basic functionality.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas main data object is the DataFrame, which is a powerful data object to store mixed data types together with labels. \n", "\n", "Pandas dataframes also offer a large range of available methods for processing and analyzing data.\n", "\n", "If you are familiar with R, pandas dataframes object and approaches are quite similar to R." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Pandas is standardly imported as pd\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# Let's start with an array of data, but we also have a label for each data item\n", "dat_1 = np.array(['London', 'Washington', 'London', 'Budapest'])\n", "labels = ['Ada', 'Alonzo', 'Alan', 'John']" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Pandas offers the 'Series' data object to store 1d data with axis labels\n", "pd.Series?" 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Ada London\n", "Alonzo Washington\n", "Alan London\n", "John Budapest\n", "dtype: object" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's make a Series with our data, and check it out\n", "ser_1 = pd.Series(dat_1, labels)\n", "ser_1.head()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Ada 36\n", "Alonzo 92\n", "Alan 41\n", "John 53\n", "dtype: int64" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# If we have some different data (with the same labels) we can make another Series\n", "dat_2 = [36, 92, 41, 53]\n", "ser_2 = pd.Series(dat_2, labels)\n", "\n", "ser_2.head()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# However, having a collection of series can quickly get quite messy\n", "# Pandas therefore offers the dataframe - a powerful data object to store mixed type data with labels\n", "pd.DataFrame?" 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "# There are several ways to initialize a dataframe\n", "# Here, we provide a dictionary made up of our series\n", "df = pd.DataFrame(data={'Col-A': ser_1, 'Col-B': ser_2}, index=labels)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "London 2\n", "Washington 1\n", "Budapest 1\n", "Name: Col-A, dtype: int64" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# For categorical data, we can check how many of each value there are\n", "df['Col-A'].value_counts()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pandas.core.series.Series" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Note that dataframes are actually collections of Series\n", "# When we index the df, as above, we actually pull out a Series\n", "# So, the '.value_counts()' is actually a Series method\n", "type(df['Col-A'])" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Col-B 55.5\n", "dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Pandas also gives us tons of ways to directly explore and analyze data in dataframes\n", "# For example, the mean for all numeric data columns\n", "# (numeric_only=True skips the text column, which recent pandas versions would otherwise reject)\n", "df.mean(numeric_only=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "For more practice with pandas, you can try some collections of exercises, including\n", "this one\n", "and\n", " this one.\n", "
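
The labels in a DataFrame can be used to pull out rows as well as columns. The sketch below rebuilds the same data and uses `.loc` and a Boolean mask, which are standard pandas features not shown in the cells above. Here the DataFrame is built from a plain dictionary of lists rather than from Series, and the names `row`, `londoners`, and `mean_b` are illustrative.

```python
import pandas as pd

labels = ['Ada', 'Alonzo', 'Alan', 'John']
df = pd.DataFrame({'Col-A': ['London', 'Washington', 'London', 'Budapest'],
                   'Col-B': [36, 92, 41, 53]},
                  index=labels)

# Select a single row by its label; the result is a Series
row = df.loc['Ada']
print(row['Col-A'], row['Col-B'])

# Select rows with a Boolean mask built from a column
londoners = df[df['Col-A'] == 'London']
print(list(londoners.index))

# Compute a summary statistic on one numeric column
mean_b = df['Col-B'].mean()
print(mean_b)
```

Selecting by label keeps the code readable, since it does not depend on the order of the rows.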
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Matplotlib is a library for plotting, in particular for 2D plots.\n", "
\n", "\n", "
\n", "The official matplotlib \n", "website\n", "includes the official\n", "tutorial\n", "as well as a \n", "gallery\n", "of examples that you can start from and modify.\n", "
" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# This magic command is used to plot all figures inline in the notebook\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "# Matplotlib is standardly imported as plt\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3dd5hUhdn+8e9D732RuvTeFBcQMfZEQZSg5BWjxpYQjca8yS8KdsWGmqKJUYIaY4slLE0FW+xdQNlCZ2lLrwtsYdvz+2PGvOu6sLMwu2d29v5cFxez5xx2bsfDzdkzc55j7o6IiFR/tYIOICIi0aFCFxGJEyp0EZE4oUIXEYkTKnQRkThRJ6gnbtOmjXft2jWopxcRqZYWLVq0090TyloXWKF37dqVhQsXBvX0IiLVkpmtP9Q6nXIREYkTKnQRkTihQhcRiRMqdBGROKFCFxGJExEVupn91szSzSzNzF40swal1tc3s5fNbLWZfWFmXSsjrIiIHFq5hW5mHYHrgSR3HwjUBiaW2uwqYI+79wT+DDwQ7aAiInJ4kZ5yqQM0NLM6QCNgc6n144Bnwo9nAmeYmUUnoohIfCgoKuax91ezZOPeSvn+5Ra6u28C/gBsALYAWe7+VqnNOgIbw9sXAllA69Lfy8wmmdlCM1u4Y8eOo80uIlJtpG3K4sd/+4QH31jBgrStlfIckZxyaUnoCLwb0AFobGaXlN6sjD/6vTtnuPsMd09y96SEhDKvXBURiSt5BUU89OZyxv3tE7btO8jjFw9lyui+lfJckVz6fyaw1t13AJjZLOBE4PkS22QCnYHM8GmZ5sDuKGcVEalWFq7bzY3JKWTsyOYnx3fi1nP607xR3Up7vkgKfQNwgpk1AnKBM4DSQ1jmAZcBnwETgHdd97YTkRrqwMFCHnpjOc9+vp4OzRvy7JXDObl35Z+VKLfQ3f0LM5sJLAYKga+BGWY2FVjo7vOAp4DnzGw1oSPz0p+CERGpET5YuYObZ6WyOSuXy0Z25Yaz+tC4ftXMQbSgDqSTkpJc0xZFJF7szcnn7teWkbw4kx4JjXnggsEkdW0V9ecxs0XunlTWusDG54qIxIsFqVu4bW46e3Lyue60nlx3ek8a1K1d5TlU6CIiR2j7vjxun5vOG+lbGdixGc9cOYwBHZoHlkeFLiJSQe7Ovxdlcs9rS8krLGby2X35xQ+6Uad2sOOxVOgiIhWwcXcON89O5aNVOxnetRXTLhhE94QmQccCVOgiIhEpKnae/WwdD725AgPuHjeAi0d0oVat2JlyokIXESnH6u37mZycyqL1ezildwL3nT+Iji0aBh3re1ToIiKHUFBUzN8/WMNf/rOaRvVr8+cLh/DjYzsSq7MHVegiImVIzczihplL
WL51P+cMbs9d5w2gTZP6Qcc6LBW6iEgJeQVFPPzOKp74KIPWjevx90uP56wB7YKOFREVuohI2BcZu5gyK5W1O7O5MKkzN5/Tj+YNK2+YVrSp0EWkxtufV8CDb6zguc/X07lVQ174+QhG9WwTdKwKU6GLSI323ort3DIrlS378rhyVDd+f1ZvGtWrntVYPVOLiByl3dn53P3aUmZ/vYlebZuQfM2JDE1sGXSso6JCF5Eaxd15PXULd8xNJyu3gOvP6MW1p/Wgfp2qH6YVbSp0Eakxtu3L49Y5aby9dBuDOzXn+Z+PoF/7ZkHHihoVuojEPXfnlYUbuef1ZeQXFnPzmL5cOSr4YVrRVm6hm1kf4OUSi7oDt7v7wyW2ORWYC6wNL5rl7lOjmFNE5Ihs2JXDlFkpfLpmFyO6teKBCwbTtU3joGNVikhuQbcCOBbAzGoDm4DZZWz6kbuPjW48EZEjU1TsPP3JWv741kpq1zLuHT+Qi4YlxtQwrWir6CmXM4A17r6+MsKIiETDym37uXFmCt9s3Mvpfdty7/iBtG8ee8O0oq2ihT4RePEQ60aa2RJgM/B7d08vvYGZTQImASQmJlbwqUVEDi+/sJjH31/Do++tommDujwy8VjOG9IhZodpRVvEN4k2s3qEynqAu28rta4ZUOzuB8xsDPCIu/c63PfTTaJFJJqWbNzL5OQUlm/dz3lDOnDHuf1pHePDtI5EtG4SPRpYXLrMAdx9X4nH883sMTNr4+47Kx5XRCRyuflF/PmdlTz5UQZtmzbgyZ8lcWb/Y4KOFYiKFPpFHOJ0i5m1A7a5u5vZcKAWsCsK+UREDumzNbuYMiuF9btyuGh4IjeN6UuzBtVnmFa0RVToZtYI+CHwyxLLrgZw9+nABOAaMysEcoGJHum5HBGRCtqXV8D985fz4pcb6NK6Ef/6xQhO7FH9hmlFW0SF7u45QOtSy6aXePwo8Gh0o4mIfN9/lm3jltlpbN+fxy9+0I3f/bAPDetV/8v2o0FXiopItbDrwEHuenUp85Zsps8xTZl+6fEc27lF0LFiigpdRGKauzNvyWbuenUp+/MK+O2Zvbnm1B7UqxNfl+1HgwpdRGLWlqxcbp2dxn+Wb2dI5xY8eMFg+rRrGnSsmKVCF5GYU1zsvPTVRu6fv4yC4mJuPacfV4zqRu04vmw/GlToIhJT1u3MZsqsFD7P2M3I7q2ZdsEgurSOz2Fa0aZCF5GYUFhUzD/Cw7Tq1a7FtPMHceGwzjXmsv1oUKGLSOCWb93H5JkpLMnM4sx+x3DPjwfSrnmDoGNVOyp0EQnMwcIi/vbeGh57bzXNG9blrxcdx9jB7XVUfoRU6CISiK837GFycgortx1g/HEduW1sf1o1rhd0rGpNhS4iVSonv5A/vrWSf3yylnbNGvCPy5M4vW/NHKYVbSp0Eakyn6zeyZRZKWzcncslJyQy+ey+NK3Bw7SiTYUuIpUuK7eA++cv46WvNtKtTWNemnQCJ3RvXf4flApRoYtIpXorfSu3zklj54GD/PKU7vz2zN40qKthWpVBhS4ilWLngYPcOS+d11K20LddU568LInBnTRMqzKp0EUkqtydOd9s4q5Xl5JzsIj/98PeXH1qD+rW1jCtyqZCF5Go2bQ3l1tmp/L+ih0clxgaptXrGA3TqirlFrqZ9QFeLrGoO3C7uz9cYhsDHgHGADnA5e6+OMpZRSRGFRc7L3y5gWnzl1HscMe5/fnZyK4aplXFyi10d18BHAtgZrWBTcDsUpuNBnqFf40AHg//LiJxLmPHAaYkp/Llut2c1LMN958/iM6tGgUdq0aq6CmXM4A17r6+1PJxwLPh+4h+bmYtzKy9u2+JSkoRiTmFRcU8+fFa/vz2SurXqcWDEwbzk+M76bL9AFW00CcCL5axvCOwscTXmeFl3yl0M5sETAJITEys4FOLSKxYunkfNyYvIW3TPs4acAx3jxtI22YaphW0iAvdzOoB5wE3lbW6jGX+
vQXuM4AZAElJSd9bLyKxLa+giEffXc30D9bQolFdHrt4KKMHttNReYyoyBH6aGCxu28rY10m0LnE152AzUcTTERiy6L1u7lxZgprdmRz/tCO3HZOf1pqmFZMqUihX0TZp1sA5gHXmdlLhN4MzdL5c5H4kH2wkIfeXMEzn62jQ/OG/POKYZzap23QsaQMERW6mTUCfgj8ssSyqwHcfTown9BHFlcT+tjiFVFPKiJV7qNVO7hpViqZe3K5bGQXbji7L03q6/KVWBXR/xl3zwFal1o2vcRjB66NbjQRCUpWTgH3vL6Ufy/KpHtCY/599UiGdW0VdCwph/6pFZHveCNtC7fNTWd3dj6/OrUH15/RS8O0qgkVuogAsH1/HnfMTWdB2lb6t2/G05cPY2DH5kHHkgpQoYvUcO5O8uJN3P3aUnILirjhrD5MOrm7hmlVQyp0kRosc08ON89O48OVO0jq0pJpFwymZ9smQceSI6RCF6mBioud5z5fzwNvLAfgrvMGcOkJXailYVrVmgpdpIZZvf0AU5JTWLh+Dyf3TuC+8QPp1FLDtOKBCl2khigoKmbGhxk88s4qGtarzR9+MoQLhnbUZftxRIUuUgOkbcrixpkpLN2yjzGD2nHneQNo21TDtOKNCl0kjuUVFPHIf1Yx48MMWjaqx/RLhnL2wPZBx5JKokIXiVNfrdvN5JkpZOzM5ifHd+LWc/rTvFHdoGNJJVKhi8SZAwcLefCN5Tz72Xo6tWzIc1cN5we9EoKOJVVAhS4SR95fsZ1bZqexOSuXy0/syg1n9aGxhmnVGPo/LRIH9mTnc/frS5m1eBM9Ehoz8+qRHN9Fw7RqGhW6SDXm7ixI28rtc9PYm1PAdaf15LrTe2qYVg2lQhepprbvy+O2uWm8mb6NgR2b8cyVwxnQQcO0arJIb3DRAngSGEjoXqFXuvtnJdafCswF1oYXzXL3qdGNKiIQOir/98JM7nl9KQcLi5kyui8/P6kbdTRMq8aL9Aj9EeANd58Qvll0WdcJf+TuY6MXTURK27g7h5tmpfLx6p0M79qKaRcMonuChmlJSLmFbmbNgJOBywHcPR/Ir9xYIlJSUbHzzKfreOjNFdQyuPvHA7l4eKKGacl3RHKE3h3YATxtZkOARcBv3D271HYjzWwJsBn4vbunl/5GZjYJmASQmJh4VMFFaopV2/YzOTmFxRv2cmqfBO4dP4iOLRoGHUtiUCQn3eoAQ4HH3f04IBuYUmqbxUAXdx8C/BWYU9Y3cvcZ7p7k7kkJCbrQQeRwCoqK+et/VnHOXz4mY2c2f75wCE9fPkxlLocUyRF6JpDp7l+Ev55JqUJ3930lHs83s8fMrI2774xeVJGaIzUzixtmLmH51v2cM7g9d503gDZN6gcdS2JcuYXu7lvNbKOZ9XH3FcAZwNKS25hZO2Cbu7uZDSd05L+rUhKLxLG8giL+/M5KnvgwgzZN6vP3S4/nrAHtgo4l1USkn3L5NfBC+BMuGcAVZnY1gLtPByYA15hZIZALTHR3r4zAIvHq84xdTElOYd2uHCYO68xNY/rRvKGGaUnkIip0d/8GSCq1eHqJ9Y8Cj0Yxl0iNsT+vgGkLlvPCFxvo3KohL/x8BKN6tgk6llRDulJUJEDvLd/OzbNT2bovj6tO6sb/+1FvGtXTX0s5MtpzRAKwOzufqa+mM+ebzfRq24Tka05kaGLLoGNJNadCF6lC7s5rKVu4c146WbkFXH9GL649rQf162iYlhw9FbpIFdmalcetc9J4Z9k2Bndqzgu/GEHfds2CjiVxRIUuUsncnZe+2sh9ry8jv6iYW8b044pRXTVMS6JOhS5SidbvymZKciqfZexiRLdWPHDBYLq2aRx0LIlTKnSRSlBU7Dz9yVr+8NYK6tSqxX3jBzFxWGcN05JKpUIXibIVW/dzY3IKSzbu5fS+bbl3/EDaN9f8Fal8KnSRKMkvLOax91fzt/dW07RBXR6ZeCznDemAmY7KpWqo
0EWi4JuNe5k8M4UV2/Yz7tgO3D62P601TEuqmApd5Cjk5hfxp7dX8NTHa2nbtAFP/iyJM/sfE3QsqaFU6CJH6NM1O5mSnMqG3Tn8dEQiU0b3pVkDDdOS4KjQRSpoX14B989fzotfbqBL60b86xcjOLGHhmlJ8FToIhXwztJt3DInlR37DzLp5O789szeNKyny/YlNqjQRSKw68BB7nx1Ka8u2Uzfdk2ZcWkSQzq3CDqWyHeo0EUOw92Zt2Qzd85L58DBQn57Zm+uObUH9erosn2JPREVupm1AJ4EBgIOXOnun5VYb8AjwBggB7jc3RdHP65I1dm8N5db56Tx7vLtHNu5BQ9OGEzvY5oGHUvkkCI9Qn8EeMPdJ4RvQ9eo1PrRQK/wrxHA4+HfRaqd4mLnxa82cP/85RQWF3PrOf24YlQ3auuyfYlx5Ra6mTUDTgYuB3D3fCC/1GbjgGfD9xH93MxamFl7d98S5bwilWrtzmymJKfwxdrdnNijNdPOH0xi69LHLyKxKZIj9O7ADuBpMxsCLAJ+4+7ZJbbpCGws8XVmeNl3Ct3MJgGTABITE48itkh0FRYV89THa/nT2yupV7sW084fxIXDOuuyfalWInlnpw4wFHjc3Y8DsoEppbYpa6/37y1wn+HuSe6elJCQUOGwIpVh2ZZ9nP/4p9y/YDk/6JXA2787hYnDE1XmUu1EcoSeCWS6+xfhr2fy/ULPBDqX+LoTsPno44lUnoOFRfzt3dU89v4amjesy6M/PY5zBrVXkUu1VW6hu/tWM9toZn3cfQVwBrC01GbzgOvM7CVCb4Zm6fy5xLLFG/YweWYKq7YfYPxxHbl9bH9aNq4XdCyRoxLpp1x+DbwQ/oRLBnCFmV0N4O7TgfmEPrK4mtDHFq+ohKwiRy0nv5A/vLmSpz9dS7tmDXj68mGc1rdt0LFEoiKiQnf3b4CkUounl1jvwLVRzCUSdR+v2slNs1PYuDuXS05IZPLZfWmqYVoSR3SlqMS9rNwC7n19Ka8szKRbm8a8POkERnRvHXQskahToUtcezN9K7fNSWNXdj5Xn9KD/z2zFw3qapiWxCcVusSlHfsPcue8dF5P3UK/9s146rJhDOrUPOhYIpVKhS5xxd2Z/fUmpr62lJyDRfz+R7355Sk9qFtbw7Qk/qnQJW5s2pvLLbNTeX/FDoYmhoZp9WyrYVpSc6jQpdorLnZe+GI90xYsp9jhjnP787ORXTVMS2ocFbpUa2t2HGBKcgpfrdvDD3q14b7xg+jcSsO0pGZSoUu1VFhUzIyPMnj4nVU0qFOLhyYMZsLxnXTZvtRoKnSpdtI3ZzE5OYW0Tfs4a8Ax3D1uIG2bNQg6lkjgVOhSbeQVFPHXd1cx/YMMWjaqx+MXD2X0oPZBxxKJGSp0qRYWrtvN5OQU1uzI5oKhnbhtbD9aNNIwLZGSVOgS07IPFvLQmyt45rN1dGjekGeuHM4pvTVLX6QsKnSJWR+u3MFNs1LZnJXLz07owg1n96VJfe2yIoeivx0Sc/bm5HPP68uYuSiT7gmNeeWXIxnWtVXQsURingpdYsqC1C3cNjedPTn5/OrUHlx/hoZpiURKhS4xYfv+PO6Ym86CtK30b9+Mf14xjIEdNUxLpCIiKnQzWwfsB4qAQndPKrX+VGAusDa8aJa7T41eTIlX7s7MRZnc8/oycguKuOGsPkw6ubuGaYkcgYocoZ/m7jsPs/4jdx97tIGk5ti4O4ebZ6fy0aqdJHVpybQLBtOzbZOgY4lUWzrlIlWuuNh59rN1PPjmCgyYOm4Al4zoQi0N0xI5KpEWugNvmZkDf3f3GWVsM9LMlgCbgd+7e3rpDcxsEjAJIDEx8QgjS3W2evt+Jiensmj9Hk7uncB94wfSqaWGaYlEQ6SFPsrdN5tZW+BtM1vu7h+WWL8Y6OLuB8xsDDAH6FX6m4T/IZgBkJSU5EeZXaqRgqJiZnyYwSPvrKJhvdr88SdDOH9oRw3TEomiiArd
3TeHf99uZrOB4cCHJdbvK/F4vpk9ZmZtyjnnLjVE2qYsbpyZwtIt+xgzqB13nTeQhKb1g44lEnfKLXQzawzUcvf94cc/AqaW2qYdsM3d3cyGA7WAXZURWKqPvIIiHvnPKmZ8mEGrxvWYfslQzh6oYVoilSWSI/RjgNnhH43rAP9y9zfM7GoAd58OTACuMbNCIBeY6O46pVKDfbl2N1OSU8jYmc3/JHXiljH9ad6obtCxROJauYXu7hnAkDKWTy/x+FHg0ehGk+rowMFCHliwnOc+X0+nlg15/qoRnNSrTdCxRGoEfWxRoua9Fdu5ZVYqW/blccWorvz+R31orGFaIlVGf9vkqO3Jzufu15Yy6+tN9GzbhJlXn8jxXVoGHUukxlGhyxFzd+anbuWOeWnszSng16f35LrTe1K/joZpiQRBhS5HZNu+PG6bk8ZbS7cxqGNznr1yBP07NAs6lkiNpkKXCnF3Xlm4kXteX0Z+YTE3je7LVSd1o46GaYkEToUuEduwK4ebZqfwyepdDO/WimnnD6J7goZpicQKFbqUq6jY+een6/jDmyuoXcu458cD+enwRA3TEokxKnQ5rFXb9nNjcgpfb9jLqX0SuG/8IDq0aBh0LBEpgwpdypRfWMz0D9bw6LuraVy/Ng9feCzjju2gYVoiMUyFLt+TkrmXG2emsHzrfs4d0oE7zu1PmyYapiUS61To8l+5+UU8/M5Knvgog4Sm9XniZ0n8sP8xQccSkQip0AWAzzN2MSU5hXW7crhoeGemjO5H84YapiVSnajQa7j9eQVMW7CcF77YQGKrRvzr5yM4saeGaYlURyr0Guzd5du4ZXYa2/bl8fOTuvG7H/WmUT3tEiLVlf721kC7s/OZ+mo6c77ZTK+2TXjsmhM5LlHDtESqu4gK3czWAfuBIqDQ3ZNKrTfgEWAMkANc7u6LoxtVjpa782rKFu6cl87+vAJ+c0YvfnVaDw3TEokTFTlCP+0w9wgdTeim0L2AEcDj4d8lRmzNyuPWOam8s2w7Qzo154EJI+jbTsO0ROJJtE65jAOeDd927nMza2Fm7d19S5S+vxwhd+elrzZy3+vLKCgu5pYx/bjypG7U1mX7InEn0kJ34C0zc+Dv7j6j1PqOwMYSX2eGl32n0M1sEjAJIDEx8YgCS+TW78pmSnIqn2Xs4oTurZh2/mC6tmkcdCwRqSSRFvood99sZm2Bt81subt/WGJ9WYd737tJdPgfghkASUlJuol0JSkqdp7+ZC1/eGsFdWvV4r7xg5g4rLOGaYnEuYgK3d03h3/fbmazgeFAyULPBDqX+LoTsDlaISVyK7aGhmkt2biXM/q25Z7xA2nfXMO0RGqCcgvdzBoDtdx9f/jxj4CppTabB1xnZi8RejM0S+fPq1Z+YTF/e281j72/mqYN6vKXi47j3MHtNUxLpAaJ5Aj9GGB2uBjqAP9y9zfM7GoAd58OzCf0kcXVhD62eEXlxJWyfLNxLzfOXMLKbQcYd2wH7jh3AK0a1ws6lohUsXIL3d0zgCFlLJ9e4rED10Y3mpQnN7+IP761gn98spa2TRvw1GVJnNFPw7REaipdKVpNfbpmJ1OSU9mwO4efjkhkyui+NGugYVoiNZkKvZrZl1fA/fOX8eKXG+nSuhEv/uIERvZoHXQsEYkBKvRq5O2l27h1Tio79h/klyd353/P7E3DerpsX0RCVOjVwM4DB7lzXjqvpWyhb7umPPGzJAZ3ahF0LBGJMSr0GObuzP1mM3e9ms6Bg4X87oe9ufqUHtSrUyvoaCISg1ToMWrz3lxunZPGu8u3c2znFjw4YTC9j2kadCwRiWEq9BhTXOz868sNTFuwnKJi57ax/bn8xK4apiUi5VKhx5C1O7OZkpzCF2t3M6pna+4fP5jE1o2CjiUi1YQKPQYUFhXz1Mdr+dPbK6lXpxYPXDCI/0nqrMv2RaRCVOgBW7p5H5OTU0jdlMUP+x/DPT8eyDHNGgQdS0SqIRV6
QA4WFvHou6t5/P01tGhUl7/9dChjBrXTUbmIHDEVegAWrd/D5OQUVm8/wPnHdeS2sf1pqWFaInKUVOhVKCe/kIfeXME/P11H+2YNePqKYZzWp23QsUQkTqjQq8jHq3YyZVYKmXtyufSELtx4dh+aapiWiESRCr2SZeUUcO/8pbyyMJNubRrz8qQTGNFdw7REJPpU6JXojbSt3DY3jd3Z+Vxzag9+c0YvGtTVMC0RqRwRF7qZ1QYWApvcfWypdZcDDwGbwosedfcnoxWyutmxPzRM6/XULfRr34x/XDaMQZ2aBx1LROJcRY7QfwMsA5odYv3L7n7d0UeqvtydWYs3MfW1peTmF3HDWX2YdHJ36tbWMC0RqXwRFbqZdQLOAe4FflepiaqpTXtzuXlWKh+s3MHQxNAwrZ5tNUxLRKpOpEfoDwM3AodrqAvM7GRgJfBbd99YegMzmwRMAkhMTKxg1NhUXOw8/8V6HliwHAfuPLc/l47UMC0RqXrlngsws7HAdndfdJjNXgW6uvtg4B3gmbI2cvcZ7p7k7kkJCQlHFDiWrNlxgAtnfMbtc9MZ2qUlb/7vyVw+qpvKXEQCEckR+ijgPDMbAzQAmpnZ8+5+ybcbuPuuEts/ATwQ3ZixpaComCc+yuDhd1bRoE4tHpowmAnHd9Jl+yISqHIL3d1vAm4CMLNTgd+XLPPw8vbuviX85XmE3jyNS2mbspicnEL65n2cPaAdU388gLZNNUxLRIJ3xJ9DN7OpwEJ3nwdcb2bnAYXAbuDy6MSLHXkFRfz13VVM/yCDlo3q8fjFQxk9qH3QsURE/svcPZAnTkpK8oULFwby3BW1cN1ubkxOIWNHNhcM7cRtY/vRopGGaYlI1TOzRe6eVNY6XSl6GNkHQ8O0nvlsHR2aN+SZK4dzSu/q/2auiMQnFfohfLByBzfPSmVzVi6XjezKDWf1oXF9vVwiErvUUKXszcnn7teWkbw4k+4Jjfn3L0eS1LVV0LFERMqlQi9hQeoWbpubzp6cfK49rQe/Pl3DtESk+lChA9v35XH73HTeSN/KgA7NeObKYQzooGFaIlK91OhCd3dmLsrk7teWkldYzI1n9+EXP9AwLRGpnmpsoW/cncPNs1P5aNVOhnVtybQLBtMjoUnQsUREjliNK/SiYufZz9bx0JsrMODucQO4eEQXamn+iohUczWq0Fdv38/k5FQWrd/DKb0TuHf8QDq1bBR0LBGRqKgRhV5QVMzfP1jDX/6zmkb1a/On/xnC+OM6apiWiMSVuC/0tE1Z3DAzhWVb9nHOoPbced4AEprWDzqWiEjUxW2h5xUU8fA7q3jiowxaNa7H9EuO5+yB7YKOJSJSaeKy0L9cu5spySlk7MzmwqTO3DymH80b1Q06lohIpYqrQt+fV8CDb6zguc/X06llQ56/agQn9WoTdCwRkSoRN4X+3ort3DIrlS378rhyVDd+f1ZvGtWLm/88EZFyVfvG25Odz92vLWXW15vo2bYJM68+keO7tAw6lohIlYu40M2sNrAQ2OTuY0utqw88CxwP7AIudPd1Ucz5Pe7O66lbuGNuOlm5BVx/ek+uPb0n9etomJaI1EwVOUL/DaF7hTYrY91VwB5372lmEwndJPrCKOQr07Z9edw2J423lm5jUMfmPP/zEfRrX1YsEZGaI6JCN7NOwDnAvcDvythkHHBn+PFM4FEzMxPGIC0AAAZbSURBVK+E+9u9t3w717/0NfmFxdw0ui9XndSNOhqmJSIS8RH6w8CNQNNDrO8IbARw90IzywJaAztLbmRmk4BJAImJiUeSl25tGjM0sSV3njeAbm0aH9H3EBGJR+Ue2prZWGC7uy863GZlLPve0bm7z3D3JHdPSkg4sntzdm3TmGeuHK4yFxEpJZJzFaOA88xsHfAScLqZPV9qm0ygM4CZ1QGaA7ujmFNERMpRbqG7+03u3snduwITgXfd/ZJSm80DLgs/nhDeJurnz0VE5NCO+HPoZjYV
WOju84CngOfMbDWhI/OJUconIiIRqlChu/v7wPvhx7eXWJ4H/CSawUREpGL0eT8RkTihQhcRiRMqdBGROKFCFxGJExbUpwvNbAew/gj/eBtKXYUaI2I1F8RuNuWqGOWqmHjM1cXdy7wyM7BCPxpmttDdk4LOUVqs5oLYzaZcFaNcFVPTcumUi4hInFChi4jEiepa6DOCDnAIsZoLYjebclWMclVMjcpVLc+hi4jI91XXI3QRESlFhS4iEidiqtDN7B9mtt3M0g6x3szsL2a22sxSzGxoiXWXmdmq8K/LyvrzlZjr4nCeFDP71MyGlFi3zsxSzewbM1sYzVwRZjvVzLLCz/+Nmd1eYt3ZZrYi/HpOqcJMN5TIk2ZmRWbWKryu0l4vM+tsZu+Z2TIzSzez35SxTZXvYxHmqvJ9LMJcQexfkeQKah9rYGZfmtmScLa7ytimvpm9HH5dvjCzriXW3RRevsLMzqpwAHePmV/AycBQIO0Q68cACwjdIekE4Ivw8lZARvj3luHHLasw14nfPh8w+ttc4a/XAW0CfM1OBV4rY3ltYA3QHagHLAH6V0WmUtueS2h+fqW/XkB7YGj4cVNgZen/5iD2sQhzVfk+FmGuIPavcnMFuI8Z0CT8uC7wBXBCqW1+BUwPP54IvBx+3D/8OtUHuoVfv9oVef6YOkJ39w85/J2OxgHPesjnQAszaw+cBbzt7rvdfQ/wNnB2VeVy90/DzwvwOdApWs9dnghes0MZDqx29wx3zyd0N6pxAWS6CHgxGs9bHnff4u6Lw4/3A8sI3Q+3pCrfxyLJFcQ+FuHrdSiVuX9VNFdV7mPu7gfCX9YN/yr9yZNxwDPhxzOBM8zMwstfcveD7r4WWE3odYxYTBV6BP57M+qwzPCyQy0PwlWEjvC+5cBbZrbIQjfJDsLI8I+AC8xsQHhZ4K+ZmTUiVIrJJRZXyesV/jH3OEJHUCUFuo8dJldJVb6PlZMrsP2rvNcriH3MzGqb2TfAdkIHAYfcx9y9EMgCWhOF1+yI71gUkEPdjDqim1RXNjM7jdBftpNKLB7l7pvNrC3wtpktDx/BVpXFhGY/HDCzMcAcoBex8ZqdC3zi7iWP5iv99TKzJoT+gv+vu+8rvbqMP1Il+1g5ub7dpsr3sXJyBbZ/RfJ6EcA+5u5FwLFm1gKYbWYD3b3k+0mVto9VtyP0/96MOqwTsPkwy6uMmQ0GngTGufuub5e7++bw79uB2VTwR6ij5e77vv0R0N3nA3XNrA0x8JoROn/4nR+FK/v1MrO6hErgBXefVcYmgexjEeQKZB8rL1dQ+1ckr1dYle9jJZ5nL6E7vJU+Nfff18bM6gDNCZ2iPPrXrDLeGDiaX0BXDv0G3zl89w2rL8PLWwFrCb1Z1TL8uFUV5kokdL7rxFLLGwNNSzz+FDi7il+zdvzfBWTDgQ3h168OoTf2uvF/b1oNqIpM4fXf7sSNq+r1Cv93Pws8fJhtqnwfizBXle9jEeaq8v0rklwB7mMJQIvw44bAR8DYUttcy3ffFH0l/HgA331TNIMKvikaU6dczOxFQu+atzGzTOAOQm8q4O7TgfmEPoWwGsgBrgiv221mdwNfhb/VVP/uj1iVnet2QufAHgu9t0GhhyapHUPoRy4I7eD/cvc3opUrwmwTgGvMrBDIBSZ6aO8pNLPrgDcJfSLhH+6eXkWZAMYDb7l7dok/Wtmv1yjgUiA1fI4T4GZCZRnkPhZJriD2sUhyVfn+FWEuCGYfaw88Y2a1CZ0BecXdXzOzqcBCd58HPAU8Z2arCf2DMzGcO93MXgGWAoXAtR46fRMxXfovIhInqts5dBEROQQVuohInFChi4jECRW6iEicUKGLiMQJFbqISJxQoYuIxIn/D0X7ZZ2WZz5TAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Plot a basic line graph\n", "plt.plot([1, 2, 3], [4, 6, 8])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "There are also many external materials for using matplotlib, including\n", "this one.\n", "
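
Plots are usually easier to read with axis labels and a title. As a sketch, the same line graph can be drawn through matplotlib's object-oriented interface (`plt.subplots`), which the cells above do not use; the label and title text here are placeholders.

```python
import matplotlib.pyplot as plt

# Create a figure and a single set of axes
fig, ax = plt.subplots()

# Plot the same data as the basic line graph, with a marker at each point
ax.plot([1, 2, 3], [4, 6, 8], marker='o', label='example line')

# Label the plot (placeholder text)
ax.set_xlabel('x values')
ax.set_ylabel('y values')
ax.set_title('A labelled line graph')
ax.legend()
```

Working with the `ax` object directly makes it easier to adjust one plot when a figure later contains several.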
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "
\n", "\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "Scikit-Learn is a package for data mining, data analysis, and machine learning. \n", "
\n", "\n", "
\n", "Here is the official scikit-learn\n", "website\n", "including their official\n", "tutorial.\n", "
" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "# Import sklearn\n", "import sklearn as skl" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Check out module description\n", "skl?" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "We will get into machine learning and working with sklearn later on in the tutorials." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## External Resources\n", "\n", "There are many, many resources to learn how to use those packages. \n", "\n", "The links above include the official documentation and tutorials, which are the best place to start.\n", "\n", "You can also search google for other resources and exercises." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "A particularly good (and free) resource covering all these tools is the\n", "Data Science Handbook \n", "by\n", "Jake VanderPlas.\n", "
" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }