{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "**Tools - pandas**\n", "\n", "*The `pandas` library provides high-performance, easy-to-use data structures and data analysis tools. The main data structure is the `DataFrame`, which you can think of as an in-memory 2D table (like a spreadsheet, with column names and row labels). Many features available in Excel are available programmatically, such as creating pivot tables, computing columns based on other columns, plotting graphs, etc. You can also group rows by column value, or join tables much like in SQL. Pandas is also great at handling time series.*\n", "\n", "Prerequisites:\n", "* NumPy – if you are not familiar with NumPy, we recommend that you go through the [NumPy tutorial](tools_numpy.ipynb) now." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", " \n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's import `pandas`. People usually import it as `pd`:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# `Series` objects\n", "The `pandas` library contains these useful data structures:\n", "* `Series` objects, which we will discuss now. A `Series` object is a 1D array, similar to a column in a spreadsheet (with a column name and row labels).\n", "* `DataFrame` objects. This is a 2D table, similar to a spreadsheet (with column names and row labels).\n", "* `Panel` objects. You can see a `Panel` as a dictionary of `DataFrame`s. These were deprecated and then removed in pandas 0.25, so we will not discuss them here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a `Series`\n", "Let's start by creating our first `Series` object!" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 2\n", "1 -1\n", "2 3\n", "3 5\n", "dtype: int64" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = pd.Series([2,-1,3,5])\n", "s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Similar to a 1D `ndarray`\n", "`Series` objects behave much like one-dimensional NumPy `ndarray`s, and you can often pass them as parameters to NumPy functions:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 7.389056\n", "1 0.367879\n", "2 20.085537\n", "3 148.413159\n", "dtype: float64" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "np.exp(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Arithmetic operations on `Series` are also possible, and they apply *elementwise*, just like for 
`ndarray`s:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1002\n", "1 1999\n", "2 3003\n", "3 4005\n", "dtype: int64" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s + [1000,2000,3000,4000]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to NumPy, if you add a single number to a `Series`, that number is added to all items in the `Series`. This is called *broadcasting*:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1002\n", "1 999\n", "2 1003\n", "3 1005\n", "dtype: int64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s + 1000" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same is true for all binary operations such as `*` or `/`, and even comparison operations:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 False\n", "1 True\n", "2 False\n", "3 False\n", "dtype: bool" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s < 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Index labels\n", "Each item in a `Series` object has an identifier called the *index label*. 
By default, it is simply the position of the item in the `Series` (starting at `0`), but you can also set the index labels manually:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice 68\n", "bob 83\n", "charles 112\n", "darwin 68\n", "dtype: int64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2 = pd.Series([68, 83, 112, 68], index=[\"alice\", \"bob\", \"charles\", \"darwin\"])\n", "s2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can then use the `Series` just like a `dict`:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "83" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2[\"bob\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can still access the items by integer location, like in a regular array (recent pandas versions deprecate this positional fallback of `[]` on a label-indexed `Series`, so prefer `iloc`, introduced just below):" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "83" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make it clear whether you are accessing by label or by integer location, it is recommended to always use the `loc` attribute when accessing by label, and the `iloc` attribute when accessing by integer location:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "83" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2.loc[\"bob\"]" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "83" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2.iloc[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slicing a `Series` also slices the index 
labels:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bob 83\n", "charles 112\n", "dtype: int64" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s2.iloc[1:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can lead to unexpected results when using the default numeric labels, so be careful:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1000\n", "1 1001\n", "2 1002\n", "3 1003\n", "dtype: int64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "surprise = pd.Series([1000, 1001, 1002, 1003])\n", "surprise" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2 1002\n", "3 1003\n", "dtype: int64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "surprise_slice = surprise[2:]\n", "surprise_slice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oh look! The first element has index label `2`. The element with index label `0` is absent from the slice:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Key error: 0\n" ] } ], "source": [ "try:\n", " surprise_slice[0]\n", "except KeyError as e:\n", " print(\"Key error:\", e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But remember that you can access elements by integer location using the `iloc` attribute. 
This illustrates another reason why it's always better to use `loc` and `iloc` to access `Series` objects:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1002" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "surprise_slice.iloc[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Init from `dict`\n", "You can create a `Series` object from a `dict`. The keys will be used as index labels:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice 68\n", "bob 83\n", "colin 86\n", "darwin 68\n", "dtype: int64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "weights = {\"alice\": 68, \"bob\": 83, \"colin\": 86, \"darwin\": 68}\n", "s3 = pd.Series(weights)\n", "s3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can control which elements you want to include in the `Series` and in what order by explicitly specifying the desired `index`:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "colin 86\n", "alice 68\n", "dtype: int64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s4 = pd.Series(weights, index = [\"colin\", \"alice\"])\n", "s4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Automatic alignment\n", "When an operation involves multiple `Series` objects, `pandas` automatically aligns items by matching index labels." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['alice', 'bob', 'charles', 'darwin'], dtype='object')\n", "Index(['alice', 'bob', 'colin', 'darwin'], dtype='object')\n" ] }, { "data": { "text/plain": [ "alice 136.0\n", "bob 166.0\n", "charles NaN\n", "colin NaN\n", "darwin 136.0\n", "dtype: float64" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(s2.keys())\n", "print(s3.keys())\n", "\n", "s2 + s3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The resulting `Series` contains the union of index labels from `s2` and `s3`. Since `\"colin\"` is missing from `s2` and `\"charles\"` is missing from `s3`, these items have a `NaN` result value (`NaN` stands for Not-a-Number, which pandas uses to mark *missing* values).\n", "\n", "Automatic alignment is very handy when working with data that may come from various sources with varying structure and missing items. But if you forget to set the right index labels, you can have surprising results:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "s2 = [ 68 83 112 68]\n", "s5 = [1000 1000 1000 1000]\n" ] }, { "data": { "text/plain": [ "alice NaN\n", "bob NaN\n", "charles NaN\n", "darwin NaN\n", "0 NaN\n", "1 NaN\n", "2 NaN\n", "3 NaN\n", "dtype: float64" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s5 = pd.Series([1000,1000,1000,1000])\n", "print(\"s2 =\", s2.values)\n", "print(\"s5 =\", s5.values)\n", "\n", "s2 + s5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas could not align the `Series`, since their labels do not match at all, hence the full `NaN` result." 
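, "\n", "\n", "If the two `Series` are really meant to be combined item by item, one way out is to add by position, or to relabel one of them before adding. The following is only a minimal sketch under that assumption: it rebuilds `s2` and `s5` with the same values as above so that it runs on its own, and the names `by_position`, `s5_labeled` and `by_label` are illustrative, not part of pandas.

```python
import pandas as pd

# Same data as s2 and s5 in the cells above, rebuilt so the sketch is self-contained.
s2 = pd.Series([68, 83, 112, 68], index=['alice', 'bob', 'charles', 'darwin'])
s5 = pd.Series([1000, 1000, 1000, 1000])

# Option 1: add the raw values of s5. A NumPy array carries no labels,
# so the addition is purely positional and the labels of s2 are kept.
by_position = s2 + s5.values

# Option 2: give s5 the labels of s2, so that automatic alignment
# finds a matching label for every item.
s5_labeled = pd.Series(s5.values, index=s2.index)
by_label = s2 + s5_labeled

print(by_position)
print(by_label)
```

Both results keep the labels of `s2`. Adding the raw values skips alignment entirely, so it is only safe when both `Series` are known to have the same length and the same order."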
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Init with a scalar\n", "You can also initialize a `Series` object using a scalar and a list of index labels: all items will be set to the scalar." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "life 42\n", "universe 42\n", "everything 42\n", "dtype: int64" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "meaning = pd.Series(42, [\"life\", \"universe\", \"everything\"])\n", "meaning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `Series` name\n", "A `Series` can have a `name`:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bob 83\n", "alice 68\n", "Name: weights, dtype: int64" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s6 = pd.Series([83, 68], index=[\"bob\", \"alice\"], name=\"weights\")\n", "s6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting a `Series`\n", "Pandas makes it easy to plot `Series` data using matplotlib (for more details on matplotlib, check out the [matplotlib tutorial](tools_matplotlib.ipynb)). 
Just import matplotlib and call the `plot()` method:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xd8VfX9x/HX594sEiAhEMIIEPae\nCQFcVdyKYB3g1lZErdZV22qHq63Vah392VZxFcQBrjrqHnWyEkBkgyRhh0AYgZD9/f2RW3+UH8pN\nSHLueD8fjzxyx8k57/sQ3zk553u+x5xziIhIZPF5HUBERBqfyl1EJAKp3EVEIpDKXUQkAqncRUQi\nkMpdRCQCqdxFRCKQyl1EJAKp3EVEIlCMVxtu166dy8zM9GrzIiJhKS8vb5tzLu1Qy3lW7pmZmeTm\n5nq1eRGRsGRmhcEsp8MyIiIRSOUuIhKBVO4iIhFI5S4iEoFU7iIiEUjlLiISgVTuIiIRyLNx7tL0\ndpdXkVewg5VFpfRNb0VWZhtaJ8R6HUtEmoHKPYJs21PB/PwS5uaXML+ghOWbd1O73y1yzaB/h9bk\ndE8lp3sqIzNTSWsV711gEWkyKvcwtmFHGfPyS+q+CkpYW7wXgIRYHyO6tuG643uTk5lKv46tWbFl\nN/PzdzCvYDsz56/nH18WANCjXdK3RZ/TPZWMNi0wMw8/lYg0BnPOHXqpJpCdne00/UDwnHN8U7yn\nbq88UOibdpUD0DohhpGZqYwM7JEP6pRMXMx3n06pqqllycZdzAvs4c/LL2F3eTUAHZMTvt2zz8lM\npVf7lip7kRBiZnnOuexDLqdyD03VNbUs31zKvIIS5uVvZ37BDkr2VgKQ1ir+2/LN6Z5K3/RW+HwN\nL+DaWseqraXM+88hnfwStpZWAJCaFEd2tzbkdE9lVPe29O/Yihi/zsOLeEXlHmbKq2pYvGEX8wvq\nCnZB4Q72VNTtTXdNTfyvMu/WNrFJ96adcxRuLwv8Yqnbuy/cXgZAUpyfrMxUcjLbkNO9LUMykkmI\n9TdZFhH5byr3ELenopoFhTu+PV6+aP1OKqtrAeib3oqR3evKMyczlQ7JCR6nhaLd5f93fD+/hJVF\npQDExfgYlpFSd9y+eypZ3drQMl6nckSaiso9xJRX1fDJquK64+UFJSzdtJuaWoffZwzqnPztnnB2\ntza0SYrzOu4h7SyrZH7Bjm//0liycRc1tQ6fwcBOyd+epO2S2gKj+Y/ZpyTG0imlRbNvV6SpqdxD\nzBXTc3l/WRHxMT6GdUlhVGBPd0TXNiRFwJ7u3opqFq7bybz87cwrKGHhup1UBP4S8crvzxzERaO7\neZpBpLEFW+7h3yphoGDbXt5fVsQVR3fn5pP7Eh8Teceok+JjOKp3O47q3Q6AiuoalmzcRXFppSd5\nZs5fx2/+uYSK6louP6q7JxlEvKRybwbPzi0kxmdccXSPiCz2g4mP8ZPVLdWz7Y/t157rX1jI795c\nRnlVDdcc18uzLCJe0Ji2JravsoZZuRs4eVAH2rf2/sRotIiL8fE/5w/nzGGduO/dldz/7kq8OgQp\n4oWgyt3MUszsJTNbYWbLzWzMAe+bmf3FzNaY2WIzG9E0ccPPG4s3sWtfFRfr2G+zi/H7+PPEYZw3\nsguPfLyG3/9ruQpeokawh2UeBt5xzp1jZnFA4gHvnwr0DnyNAv4e+B71ZswppE96S0Z19+4QRTTz\n+4y7fziYhFg/T36eT3lVDb+bMOiw
LvoSCQeHLHczSwaOAS4DcM5VAgeeJZsATHd1u0VzAnv6HZ1z\nmxs5b1hZtH4nizfs4ncTBuoSfg/5fMbtZwwgPtbHY5+spbyqlj+dMwS/Cl4iWDB77t2BYuBpMxsK\n5AHXO+f27rdMZ2D9fs83BF6L6nJ/ZnYhSXF+zhze2esoUc/MuOWUfrSI9fPQB6upqK7hwUnDiNVU\nChKhgvmXHQOMAP7unBsO7AVuacjGzGyKmeWaWW5xcXFDVhE2SvZW8sbiTZw1IoNWmkM9JJgZN5zQ\nh1tO7cebizfzk2cXUFFd43UskSYRTLlvADY45+YGnr9EXdnvbyPQZb/nGYHX/otzbqpzLts5l52W\nltaQvGHjxdz1VFbXcvEYnUgNNVf9oCd3jh9Yd+3B9Dz2VargJfIcstydc1uA9WbWN/DS8cCyAxZ7\nHbgkMGpmNLArmo+319Q6ZswtZFT3VPqkt/I6jhzEpUdkcu/Zg/lsdTE/+sc89gYmaROJFMEecPwp\n8KyZLQaGAXeb2VVmdlXg/beAtcAa4HHgJ42eNIx8uqqY9SX7tNce4iaN7MpDk4Yxv2AHFz85l93l\nVV5HEmk0QQ2FdM4tAg6cy+DR/d53wDWNmCusPTOnkLRW8Zw0oIPXUeQQJgzrTJzfx3UvLOTCx+cy\n/cc5YTFxm8ihaKhAI1tfUsbHK7dyfk7X770bkoSOUwd35LGLs1hZVMr5j8+hOHCjEpFwpvZpZDPm\nFuIz44Kcrl5HkXoY2y+dpy8bSeH2MiZNnc2WwC0MRcKVyr0RlVfVMGv+ek4akB4SN9iQ+jmyVzum\n/TiHrbsrmPjYbNaXlHkdSaTBVO6N6F+LN7OjTPPIhLOc7qnMmDyKnWWVTHpsNvnb9h76h0RCkMq9\nEU2fU0jPtCTG9GzrdRQ5DMO6pPD8lNGUV9cy8bHZrArcUlAknKjcG8niDTv5av1OLh7dTfPIRICB\nnZKZOWU0Bpw3dQ5LNu7yOpJIvajcG8mMOYUkxvk5KyvD6yjSSHqnt2LWlWNoEevngsfnsHDdDq8j\niQRN5d4IdpZV8tqiTZw5vDOtNY9MRMlsl8TMK0eTkhjHRU/MZe7a7V5HEgmKyr0RvJS3gYrqWp1I\njVAZbRKZdeUYOiQncOnT8/h89TavI4kcksr9MNXWOp6ZU8jIzDb079ja6zjSRDokJzDzyjFktk3i\nx9Pm8+HyIq8jiXwvlfth+mzNNgq3l3GR9tojXruW8bwwZTT9OrTiymfyeOvrqJ0bT8KAyv0wPTO7\nkHYt4zh1UEevo0gzSEmMY8bkUQztksK1zy3g1YUbvI4kclAq98OwYUcZH60o4ryRmkcmmrROiGX6\nj3MY1b0tN836ihfmrfM6ksj/o0Y6DM/Nrfuf+vxRmkcm2iTFx/D0j0ZyTO80bnnla/7xRb7XkUT+\ni8q9gSqqa5g5fz0n9E+nc0oLr+OIBxJi/Uy9JIuTB6ZzxxvLePSTb7yOJPItlXsDvf31FrbvrdQN\nOaJcfIyfRy4Ywfihnbjn7RU8+P4q6m5vIOKtoG7WIf/f9NkF9GiXxJE923kdRTwW6/fx4KRhxMf4\nePjD1fjMuP6E3l7Hkiincm+AJRt3sWDdTn47bgA+n+aREfD7jHvPHkKtgwc/WEXP9kmMG9LJ61gS\nxXRYpgGenVtIQqyPc0ZoHhn5Pz6fcfdZgxiZ2YafzfqKxRt2eh1JopjKvZ527avinws3ceawziQn\nah4Z+W/xMX4evSiLtFbxXDE9V3d0Es+o3Ovp5bwN7Kuq0RWp8p3atozniUuz2VNezZRnctlXWeN1\nJIlCKvd6qK11zJhTyIiuKQzqnOx1HAlh/Tq05uHzhvP1xl38/KWvNIJGmp3KvR6+/GY7a7ft1fBH\nCcoJA9L55Sn9eHPxZv7y4Rqv40iU0WiZenhmTgGpSXGcNljzyEhwrjymB6uKSnnwg1X0Tm+pfzvS\n
bLTnHqRNO/fx/rIiJo3sQnyM3+s4EibMjD+eNZisbm24adYi3a5Pmo3KPUjPz1uHAy7I0TwyUj/x\nMX4euziLtknxTJ6Wy9bdGkEjTU/lHoTK6lqen7ee4/u1p0tqotdxJAy1C4yg2V1exRXP5FFepRE0\n0rRU7kF4Z+kWtu2p0PBHOSz9O7bmoUnDWLxhJ794abFG0EiTCqrczazAzL42s0VmlnuQ9481s12B\n9xeZ2W2NH9U7z8wuoFvbRI7pneZ1FAlzJw3swM0n9eX1rzbx1481gkaaTn1GyxznnPu+OwN/5pwb\nd7iBQs3yzbuZX7CDX5/WX/PISKP4ybE9WbN1D/e/t4pe7Vtyiu7iJU1Ah2UOYcacQuJjfJybrXlk\npHH8ZwTN8K4p3DjzK5Zu0ggaaXzBlrsD3jOzPDOb8h3LjDGzr8zsbTMbeLAFzGyKmeWaWW5xcXGD\nAjen3eVVvLpwI+OHdiIlMc7rOBJBEmLrRtC0SYzlimm5bC3VCBppXMGW+1HOuRHAqcA1ZnbMAe8v\nALo554YC/wP882Arcc5Ndc5lO+ey09JC//j1qws2UlZZoytSpUm0b5XA45dms6Osiis1gkYaWVDl\n7pzbGPi+FXgVyDng/d3OuT2Bx28BsWYW1nexcM7xzJxChnZJYUhGitdxJEIN7JTMg5OGsnDdTm59\n5WuNoJFGc8hyN7MkM2v1n8fAScCSA5bpYGYWeJwTWO/2xo/bfGav3c6arXu4WMMfpYmdMqgjN5/U\nh1cXbuTvug+rNJJgRsukA68GujsGeM45946ZXQXgnHsUOAe42syqgX3AeS7Md0FmzCkkJTGWcUM0\nkkGa3jXH9WJV0R7ue3clvdJactLADl5HkjB3yHJ3zq0Fhh7k9Uf3e/wI8EjjRvPOll3lvLu0iMlH\ndSchVvPISNMzM/50zhAKt+/lhpmLeOmqIxjQqbXXsSSMaSjkQTw/bx21znHhKB2SkeaTEOvn8Uuy\naZ0QyxXTcykurfA6koQxlfsBqmpqeX7eOo7tk0bXtppHRppX+9YJPHFpNtv3VnDVjDwqqjWCRhpG\n5X6A95YWsbW0QsMfxTODOifzwMRh5BXu4FevLNEIGmkQlfsBps8uoEtqC37Qp73XUSSKnTa4Izee\n0IeXF2xg6qdrvY4jYUjlvp9VRaXMzS/hwlHd8GseGfHYdcf3YtyQjtzzzgo+WFbkdRwJMyr3/cyY\nU0hcjI+J2V28jiKCmXHfOUMZ3DmZ619YyIotu72OJGFE5R6wp6KaVxZsZNyQjqQmaR4ZCQ0t4vxM\nvTibpPgYJk/LZfsejaCR4KjcA15duJE9FdW6IlVCTofkBB6/JJviUo2gkeCp3AnMIzO7gMGdkxnW\nRfPISOgZ2iWF+88dyvyCHfzmVY2gkUNTuQPz8ktYVVQ3j0xgmgWRkHPG0E5cd3xvXszbwJOf53sd\nR0Jcfe7EFLGmzykkuUUsZwzt5HUUke91w/G9WbO1lLvfWk7PtJYc109DduXgon7Pfevuct5dsoVz\nszJoEad5ZCS0+XzGn88dxoBOrfnp8wtZVVTqdSQJUVFf7i/MX091reNCnUiVMNEirm4OmhZxfi6f\nNp+SvZVeR5IQFNXlXl1Ty3Nz13FMnzS6t0vyOo5I0Domt2DqxVkU7a4bQVNZXet1JAkxUV3uHywv\nYsvucg1/lLA0vGsb7jtnCPPyS7jtNY2gkf8W1SdUp88upHNKC8bqpJSEqQnDOrO6aA+PfLyG3umt\nuPyo7l5HkhARtXvua7aW8uU327lgVFfNIyNh7aYT+3DKwA784V/L+PfKrV7HkRARteU+Y8464vw+\nJo3UPDIS3nw+44FJQ+nboTU/fW4ha7ZqBI1Eabnvrajm5bwNnDa4A+1axnsdR+SwJcbF8MSl2cTH\n+rl8Wi47NIIm6kVlub+2aBOlFdW6IYdElM4pLXjs4iw27yzn6m
fzqKrRCJpoFnXl7pxj+uwCBnRs\nzYiubbyOI9Kosrq14d5zBjNnbQm3v75UI2iiWNSVe17hDlZsKeXiMZpHRiLTD4dncPWxPXlu7jqm\nfVngdRzxSNSV+/TZhbRKiGHCMM0jI5Hr5yf15cQB6dz15jI+XVXsdRzxQFSV+8ad+3h7yWbOycog\nMS6qh/hLhPP5jIcmDaNPeiuueW4B3xTv8TqSNLOoKve7/7Ucv8+YfHQPr6OINLmk+LoRNHF+H5On\n5bKzTCNooknUlPvnq7fxr683c82xveic0sLrOCLNIqNNIo9dnMXGHfu45rkFGkETRaKi3Cura7n9\n9SV0TU3kimO01y7RJTszlbvPGswXa7Zz1xvLvI4jzSSoA89mVgCUAjVAtXMu+4D3DXgYOA0oAy5z\nzi1o3KgNN+3LAr4p3suTl2aTEKs52yX6nJOVweqiUh77dC190lty8ZhMryNJE6vPWcXjnHPbvuO9\nU4Hega9RwN8D3z23dXc5D32wirH92nN8/3Sv44h45hen9GPN1j3c8cYyeqS15Mhe7byOJE2osQ7L\nTACmuzpzgBQz69hI6z4sf3x7BVU1jtvGDfA6ioin/D7j4fOH0yutJT95dgH52/Z6HUmaULDl7oD3\nzCzPzKYc5P3OwPr9nm8IvOapefklvLpwI1OO6UGmbsYhQsvACBq/z7h82nx27avyOpI0kWDL/Sjn\n3AjqDr9cY2bHNGRjZjbFzHLNLLe4uGkvrKiuqeW215bQKTmBnxzXs0m3JRJOuqQm8uhFWawvKePa\n5xZQrRE0ESmocnfObQx83wq8CuQcsMhGYP+5czMCrx24nqnOuWznXHZaWlrDEgfpuXnrWLGllN+M\nG6ALlkQOkNM9lT+cOZjPVm/j9/9a7nUcaQKHLHczSzKzVv95DJwELDlgsdeBS6zOaGCXc25zo6cN\n0vY9Fdz/7kqO6tWOUwd18CqGSEibOLILk4/qzj++LODZuYVex5FGFswubTrwamCSrRjgOefcO2Z2\nFYBz7lHgLeqGQa6hbijkj5ombnDue3clZZU13DF+gCYHE/ket57Wn2+K93D7a0vp3i6JI3pqBE2k\nMK+mBM3Ozna5ubmNvt5F63fyw799wRVH9+BXp/Vv9PWLRJrS8irO+tuXFO+p4J8/OVKDD0KcmeUd\neK3RwUTUFaq1tY7bX1tCWst4fjq2l9dxRMJCq4RYnrx0JAZMnp7L7nKNoIkEEVXuL+at56sNu/jV\naf1plRDrdRyRsNG1bSJ/uzCLgm17+elzCzWCJgJETLnvKqvi3ndWMjKzjeZqF2mAMT3b8rszB/HJ\nqmLufmuF13HkMEXMGMEH3l/JzrJK7hw/SidRRRro/JyurCoq5akv8umT3pLzcrp6HUkaKCL23Jdt\n2s0zcwq5eHQ3BnRq7XUckbD269P6c0yfNH7zzyXMWbvd6zjSQGFf7s45bn99CSmJcdx0Yl+v44iE\nvRi/j0cuGE63tolcPSOPddvLvI4kDRD25f7aok3ML9jBL0/pS3KiTqKKNIbWgRE0tQ4unzafUo2g\nCTthXe6l5VX84a3lDM1I5tysLof+AREJWma7JP5+4Qjyt+3luucXUlPrzTUx0jBhXe7/89Eatu2p\n4K4Jg/D5dBJVpLEd0asdd4wfyMcri7nnbc1BE07CdrTMmq2lPPV5PpOyuzC0S4rXcUQi1kWju7G6\nqJTHP8und3orJmbrr+RwEJZ77s457nh9GYlxfn5+sk6iijS1344bwNG92/HrV79mXn6J13EkCGFZ\n7u8s2cLna7Zx88l9adsy3us4IhEvxu/jkfNH0KVNIlfNyGN9iUbQhLqwK/d9lTX87s1l9O/Ymgt0\ngYVIs0lOjOWJS7Oprqll8rRc9lRUex1JvkfYlftfP17Dpl3l3DVhIDH+sIsvEtZ6pLXkbxdmsaZ4\nDze8oBE0oSys2rFg216mfr
qWHw7vzMjMVK/jiESlo3q34/YzBvDB8q386V3NQROqwmq0zF1vLiPW\nb9x6aj+vo4hEtUvGZLKqqJTHPllL7/atOCcrw+tIcoCw2XP/cHkRH63Yyg0n9KF96wSv44hEvdvP\nGMiRvdryq1e+ZsG6HV7HkQOERbmXV9Vw5xvL6NW+JZcdmel1HBEBYv0+/nrBCNq3jueGFxaxVydY\nQ0pYlPvjn65lXUkZd5wxkFidRBUJGSmJcTwwcRjrd5Tx+3/pCtZQEvJNuWFHGX/99xpOG9yBo3rr\n5r0ioSaneypTjunB8/PW8eHyIq/jSEDIl/sfAnsDvz59gMdJROS73HRiH/p1aMUvX/6a7XsqvI4j\nhHi5f7a6mLeXbOHa43rROaWF13FE5DvEx/h56Lxh7N5Xxa9e/RrnNP7dayFb7pXVtdzx+lK6tU1k\n8tE9vI4jIofQr0NrfnZSH95dWsTLCzZ6HSfqhWy5/+PLfL4p3ssdZwwkIdbvdRwRCcLko3uQ0z2V\nO15fqvlnPBaS5V60u5yHP1jNCf3bc1y/9l7HEZEg+X3Gn88dCsDNL35FraYn8ExIlvsf31pOVa3j\nt+N0ElUk3HRJTeS2MwYwN7+EJz/P9zpO1Aq5cp+7djv/XLSJq47pQbe2SV7HEZEGODcrg5MGpHPf\nuytZuaXU6zhRKaTKvbqmlttfX0rnlBZcfWwvr+OISAOZGX88azCtW8Rww8xFVFTXeB0p6gRd7mbm\nN7OFZvbmQd67zMyKzWxR4GtyQ8I8O3cdK7aU8ttx/WkRp5OoIuGsbct47jlrCMs37+ahD1Z7HSfq\n1GfP/Xrg+64vnumcGxb4eqK+QbbtqeDP763k6N7tOHlgh/r+uIiEoBMGpHPeyC48+sk3zC/Q7fma\nU1DlbmYZwOlAvUs7WPe9s5KyyhpuP2MgZtZUmxGRZvabcQPIaNOCm2Yt0t2bmlGwe+4PAb8Aar9n\nmbPNbLGZvWRm9bo9+qL1O5mZu57Lj+pOr/Yt6/OjIhLiWsbH8ODEYWzcsY/fv7nM6zhR45Dlbmbj\ngK3OubzvWewNINM5NwR4H5j2HeuaYma5ZpZbXFwMQG2t47bXltC+VTw/Pb53/T+BiIS87MxUrvxB\nT16Yv573l2lyseYQzJ77kcB4MysAXgDGmtmM/Rdwzm13zv1ntqAngKyDrcg5N9U5l+2cy05LSwNg\nVu56Fm/Yxa9P70/L+LC6MZSI1MONJ/Shf8fW3PrKYrZpcrEmd8hyd87d6pzLcM5lAucBHznnLtp/\nGTPruN/T8Xz/iddv7Syr5N53VpCTmcr4oZ3qEVtEwk1cjI+HJg1j975qbn1Fk4s1tQaPczezu8xs\nfODpdWa21My+Aq4DLgtmHQ+8v4pd+6q4c4JOoopEg74dWvHzk/vy/rIiXszb4HWciGZe/fYcOHS4\n23fa77lkTCZ3jB/oSQYRaX61tY7zH5/D0k27efv6o+mSmuh1pLBiZnnOuexDLefZFaqbdu6jTWIc\nN57Yx6sIIuIBn8/488S6ycV+NusrajS5WJPwrNzLKmv45Sn9SG4R61UEEfFIRptE7hg/kHkFJTzx\n2Vqv40Qkz8o9KS6Gc7IyvNq8iHjs7BGdOWVgB/783iqWb97tdZyI41m590hLwufTSVSRaGVm3H3W\nYFq3iOVGTS7W6EJqVkgRiS6pSXH86ZzBrNhSygPvr/I6TkRRuYuIp8b2S+f8nK5M/XQtc9du9zpO\nxFC5i4jnfnN6f7qmJvKzF7+itLzK6zgRQeUuIp5Lio/hgYlD2bRzH7/T5GKNQuUuIiEhq1sqVx/b\nk1m5G3hv6Rav44Q9lbuIhIzrj+/DwE6tufWVryku1eRih0PlLiIhIy7Gx4OThlFaUc2tryzW5GKH\nQeUuIiGlT3orfnFyXz5YvpVZueu9jhO2VO4iEnJ+fGR3xvRoy11vLGPd9jKv44QllbuIhByf
z7h/\n4lB8Ztw0a5EmF2sAlbuIhKTOKS24c8JAcgt3MPVTTS5WXyp3EQlZPxzemVMHdeCB91eydNMur+OE\nFZW7iIQsM+MPPxxMSmIcN838ivIqTS4WLJW7iIS0usnFhrCySJOL1YfKXURC3nF923PhqK48/tla\n5mhysaCo3EUkLPz69P50S03kZ7O+YrcmFzsklbuIhIXEuBgemDSMzbv2cefrmlzsUFTuIhI2RnRt\nwzXH9eLlBRt4Z4kmF/s+KncRCSvXHd+bQZ1b86tXv2ZrabnXcUKWyl1Ewkqs38eDE4exp6Kaq2cs\nYMlGjX8/GJW7iISd3umtuPfswazaUsq4//mcHz09j7zCEq9jhRTzakrN7Oxsl5ub68m2RSQy7NpX\nxfQvC3jqi3x2lFUxpkdbrh3biyN6tsXMvI7XJMwszzmXfcjlVO4iEu72VlTz/Lx1TP10LVtLKxjW\nJYVrj+vF8f3bR1zJq9xFJOqUV9XwUt4GHv3kGzbs2Ee/Dq245rhenDa4I35fZJR8sOUe9DF3M/Ob\n2UIze/Mg78Wb2UwzW2Nmc80ss35xRUQOX0Ksn4tGd+Pjm4/lz+cOpaqmlp8+v5ATH/iEF3PXU1VT\n63XEZlOfE6rXA8u/473LgR3OuV7Ag8C9hxtMRKShYv0+zs7K4L0bf8BfLxhBfKyfn7+0mGPv+zfP\nzC6IignIgip3M8sATgee+I5FJgDTAo9fAo63SDvQJSJhx+8zTh/SkbeuO4qnLssmvXU8v31tKUf/\n6WMe/3QteyuqvY7YZILdc38I+AXwXX/TdAbWAzjnqoFdQNsDFzKzKWaWa2a5xcXFDYgrIlJ/ZsbY\nfum8fPURPHfFKHq3b8kf3lrOkfd+xF8+XM2ufZE3V80hy93MxgFbnXN5h7sx59xU51y2cy47LS3t\ncFcnIlIvZsYRPdvx3BWjeeUnR5DVtQ0PvL+KI+/5iHvfWcG2PRVeR2w0wey5HwmMN7MC4AVgrJnN\nOGCZjUAXADOLAZIBzcspIiFrRNc2PHnZSN667mh+0DeNRz/5hqPu/Yg731jK5l37vI532Oo1FNLM\njgVuds6NO+D1a4DBzrmrzOw84Czn3MTvW5eGQopIKPmmeA9///c3/HPhRszgnKwMrvpBT7q1TfI6\n2n9p9KGQB9nAXWY2PvD0SaCtma0BbgJuaeh6RUS80DOtJfefO5SPbz6WSSO78PKCjRx3/7+54YWF\nrC4q9TpevekiJhGRg9i6u5zHP1vLs3PXUVZZwykDO3Dt2F4M6pzsaS5doSoi0gh27K3k6S/yefrL\nAkrLq/lBnzSuHduLkZmpnuRp8sMyIiLRoE1SHDed1JcvbhnLz0/uy5KNuzj30dk88N5KvNo5DobK\nXUQkCK0TYrnmuF58/suxTMzO4C8freHut5aHbMHHeB1ARCSctIjzc89ZQ2gR6+fxz/Ipr6rlzvED\n8YXYxGQqdxGRevL5jDvGDyQh1s9jn66lvKqGe84eElIzT6rcRUQawMy45dR+JMT6efjD1ZRX1/LA\nxKHE+kPjaLfKXUSkgcyMG09bfL6QAAAHHUlEQVTsQ0Ksn3vfWUFldQ1/OX848TF+r6PphKqIyOG6\n+tie3HHGAN5dWsSVz+SFxJTCKncRkUZw2ZHd+eNZg/lkVTE/enq+59MJq9xFRBrJ+TldeWDiUObm\nb+fSp+axu9y7qYRV7iIijeiHwzN45IIRLFq/k4uemMvOskpPcqjcRUQa2WmDOzL1kixWbCnlvKlz\nPJknXuUuItIExvZL56lLR1KwfS+THptN0e7yZt2+yl1EpIkc1bsd0388ii27ypn42Gw27Chrtm2r\n3EVEmlBO91RmTB7Fjr2VTHx0NgXb9jbLdlXuIiJNbHjXNjw/ZTTl1bVMfGx2s9z8Q+UuItIMBnZK\n5oUpo3HApKlzWLppV5NuT+UuItJM+qS3YtaVY0iI8XH+
1DksWr+zybalchcRaUbd2yUx88oxJCfG\nctETc5lfUNIk21G5i4g0sy6pibx45RG0bx3PJU/O44s12xp9Gyp3EREPdEhOYOaUMXRrm8iP/jGf\nj1dsbdT1q9xFRDyS1iqe568YTZ/0lkx5Jpd3lmxutHWr3EVEPNQmKY5nJ49mcOdkrnluIa8t2tgo\n61W5i4h4LLlFLM9cPoqRmW24YeYiZs1ff9jrVLmLiISApPgYnr4sh6N7p/GLlxczfXbBYa1P5S4i\nEiJaxPl5/JIsThyQzm2vLWXqp980eF0qdxGREBIf4+dvF45g3JCO3P3WCh7+YDXOuXqv55DlbmYJ\nZjbPzL4ys6VmdudBlrnMzIrNbFHga3K9k4iICACxfh8Pnzecc7IyePCDVdz7zsp6F3xMEMtUAGOd\nc3vMLBb43Mzeds7NOWC5mc65a+u1dREROSi/z/jT2UNIiPXx6CffUF5Vw23jBgT984csd1f362JP\n4Gls4Kv+fyOIiEi9+HzG7yYMIj7Gz5Of51NeVRP0zwaz546Z+YE8oBfwV+fc3IMsdraZHQOsAm50\nzh3+WB4RkShnZvzm9P60iPXzyMdrgv65oE6oOudqnHPDgAwgx8wGHbDIG0Cmc24I8D4w7TtCTjGz\nXDPLLS4uDjqkiEg0MzNuPrkvN5/UJ/ifqe9BejO7DShzzt3/He/7gRLnXPL3rSc7O9vl5ubWa9si\nItHOzPKcc9mHWi6Y0TJpZpYSeNwCOBFYccAyHfd7Oh5YXr+4IiLSmII55t4RmBbYI/cBs5xzb5rZ\nXUCuc+514DozGw9UAyXAZU0VWEREDq3eh2Uaiw7LiIjUX6MdlhERkfCjchcRiUAqdxGRCKRyFxGJ\nQCp3EZEI5NloGTMrBVZ6snHvtAMa/zbnoU2fOTroMzefbs65tEMtFNTcMk1kZTDDeSKJmeXqM0c+\nfeboEOqfWYdlREQikMpdRCQCeVnuUz3ctlf0maODPnN0COnP7NkJVRERaTo6LCMiEoE8KXczO8XM\nVprZGjO7xYsMzcnMupjZx2a2LHCT8eu9ztQczMxvZgvN7E2vszQHM0sxs5fMbIWZLTezMV5nampm\ndmPg3/QSM3vezBK8ztTYzOwpM9tqZkv2ey3VzN43s9WB7228zHgwzV7ugamD/wqcCgwAzjez4O/6\nGp6qgZ855wYAo4FrouAzA1xPdM3t/zDwjnOuHzCUCP/sZtYZuA7Ids4NAvzAed6mahL/AE454LVb\ngA+dc72BDwPPQ4oXe+45wBrn3FrnXCXwAjDBgxzNxjm32Tm3IPC4lLr/6Tt7m6ppmVkGcDrwhNdZ\nmoOZJQPHAE8COOcqnXM7vU3VLGKAFmYWAyQCmzzO0+icc59Sd5+K/U3g/24nOg04s1lDBcGLcu8M\n7H/z7A1EeNHtz8wygeHAwW4yHkkeAn4B1HodpJl0B4qBpwOHop4wsySvQzUl59xG4H5gHbAZ2OWc\ne8/bVM0m3Tm3OfB4C5DuZZiD0QnVZmRmLYGXgRucc7u9ztNUzGwcsNU5l+d1lmYUA4wA/u6cGw7s\nJQT/VG9MgePME6j7xdYJSDKzi7xN1fxc3ZDDkBt26EW5bwS67Pc8I/BaRDOzWOqK/Vnn3Cte52li\nRwLjzayAusNuY81shreRmtwGYINz7j9/kb1EXdlHshOAfOdcsXOuCngFOMLjTM2l6D/3jg583+px\nnv/Hi3KfD/Q2s+5mFkfdCZjXPcjRbMzMqDsWu9w594DXeZqac+5W51yGcy6Tuv++HznnInqPzjm3\nBVhvZn0DLx0PLPMwUnNYB4w2s8TAv/HjifCTyPt5Hbg08PhS4DUPsxxUs08c5pyrNrNrgXepO7v+\nlHNuaXPnaGZHAhcDX5vZosBrv3LOveVhJml8PwWeDey0rAV+5HGeJuWcm2tmLwELqBsRtpAQv2qz\nIczseeBYoJ2ZbQBu
B+4BZpnZ5UAhMNG7hAenK1RFRCKQTqiKiEQglbuISARSuYuIRCCVu4hIBFK5\ni4hEIJW7iEgEUrmLiEQglbuISAT6X/s00irHqiZcAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "temperatures = [4.4,5.1,6.1,6.2,6.1,6.1,5.7,5.2,4.7,4.1,3.9,3.5]\n", "s7 = pd.Series(temperatures, name=\"Temperature\")\n", "s7.plot()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are *many* options for plotting your data. It is not necessary to list them all here: if you need a particular type of plot (histograms, pie charts, etc.), just look for it in the excellent [Visualization](http://pandas.pydata.org/pandas-docs/stable/visualization.html) section of pandas' documentation, and look at the example code." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Handling time\n", "Many datasets have timestamps, and pandas is awesome at manipulating such data:\n", "* it can represent periods (such as 2016Q3) and frequencies (such as \"monthly\"),\n", "* it can convert periods to actual timestamps, and *vice versa*,\n", "* it can resample data and aggregate values any way you like,\n", "* it can handle timezones.\n", "\n", "## Time range\n", "Let's start by creating a time series using `pd.date_range()`. This returns a `DatetimeIndex` containing one datetime per hour for 12 hours starting on October 29th 2016 at 5:30pm." 
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DatetimeIndex(['2016-10-29 17:30:00', '2016-10-29 18:30:00',\n", " '2016-10-29 19:30:00', '2016-10-29 20:30:00',\n", " '2016-10-29 21:30:00', '2016-10-29 22:30:00',\n", " '2016-10-29 23:30:00', '2016-10-30 00:30:00',\n", " '2016-10-30 01:30:00', '2016-10-30 02:30:00',\n", " '2016-10-30 03:30:00', '2016-10-30 04:30:00'],\n", " dtype='datetime64[ns]', freq='H')" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dates = pd.date_range('2016/10/29 5:30pm', periods=12, freq='H')\n", "dates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This `DatetimeIndex` may be used as an index in a `Series`:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 17:30:00 4.4\n", "2016-10-29 18:30:00 5.1\n", "2016-10-29 19:30:00 6.1\n", "2016-10-29 20:30:00 6.2\n", "2016-10-29 21:30:00 6.1\n", "2016-10-29 22:30:00 6.1\n", "2016-10-29 23:30:00 5.7\n", "2016-10-30 00:30:00 5.2\n", "2016-10-30 01:30:00 4.7\n", "2016-10-30 02:30:00 4.1\n", "2016-10-30 03:30:00 3.9\n", "2016-10-30 04:30:00 3.5\n", "Freq: H, dtype: float64" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series = pd.Series(temperatures, dates)\n", "temp_series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot this series:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAW4AAAFbCAYAAAD1FWSRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XmYXHWd7/H3BwIaiYbVRsMSRwRl\nzACmkZnLOKbdCILiODMu88y4YrzXK6OO9wquI+NGdPDCqKCM4HJdWkbRYUDABYLDVZZOWAKGuAYh\njwlKWAyiGP3eP87pTnVT1V3dOed3+nf683qeetJ1TnV9zvdXXd9UnTrnV4oIzMwsHzs1vQFmZjY9\nbtxmZplx4zYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM/PquNO99947Fi9ePO3f\nu//++9ltt92q36CGs5znPOfNnbyZZq1evfqXEbFPXzeOiMovS5cujZm44oorZvR7sz3Lec5z3tzJ\nm2kWMBJ99ljvKjEzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZqeXMSUtj\n8SkX91z35iXbeEWP9RtOO66uTTKzBPyK28wsM27cZmaZceM2M8tMX/u4Je0OfBJ4MhDAqyLie3Vu\nWI7avs85dX1tH0+zmer3w8kzgUsj4q8l7Qo8osZtMjOzSUzZuCUtBP4CeAVARDwIPFjvZpmZWS8q\npoGd5AbS4cA5wPeBw4DVwBsi4v4Jt1sBrAAYGBhYOjw8PO2N2bp1KwsWLJj2781EHVlrN97bc93A\nfNj8QPd1SxYtdN4syJtMyr9N5+WdN9OsoaGh1REx2M9t+2ncg8DVwNERcY2kM4H7IuKdvX5ncHAw\nRkZGprPNAKxatYply5ZN+/dmoo6sqfbJnr62+xucuvYBO686Kf82nZd33kyzJPXduPs5quQO4I6I\nuKa8/mXgKdPeKjMzq8SUjTsiNgG3SzqkXPRMit0mZmbWgH6PKjkJ+Hx5RMlPgFfWt0lmZjaZvhp3\nRNwA9LXvxczM6uUzJ83MMuPGbWaWGTduM7PMuHGbmWXGX6RgVvKkVpYLv+I2M8uMG7eZWWbcuM3M\nMuPGbWaWGTduM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzLhx\nm5llxo3bzCwzbtxmZpnxFymYNcRf3GAz5VfcZmaZceM2M8tMX7tKJG0AfgX8HtgWEYN1blRV/FbU\nzNpoOvu4hyLil7VtiZmZ9cW7SszMMqOImPpG0k+Bu4EAPhER53S5zQpgBcDAwMDS4eHhaW/M1q1b\nWbBgwbR/r5e1G+/tuW5gPmx+oPu6JYsWOs95rcubTNXPvbmcN9OsoaGh1f3uhu63cS+KiI2SHg18\nEzgpIr7T6/aDg4MxMjLS9waPWrVqFcuWLZv27/Uy1T7u09d231M0033cznPebM6bTNXPvbmcN9Ms\nSX037r52lUTExvLfO4GvAk+d9laZmVklpmzcknaT9MjRn4HnADfXvWFmZtZdP0eVDABflTR6+y9E\nxKW1bpWZmfU0ZeOOiJ8AhyXYFjMz64MPBzQzy4wbt5lZZty4zcwy48ZtZpYZz8dtNkd40rX28Ctu\nM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzCQ/c9Jnb5mZ7Ri/\n4jYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8uMv3PSzGrhs6Tr\n0/crbkk7S7pe0kV1bpCZmU1uOrtK3gCsq2tDzMysP301bkn7AccBn6x3c8zMbCqKiKlvJH0Z+ADw\nSOB/RcTxXW6zAlgBMDAwsHR4eLjrfa3deG/PnIH5sPmB7uuWL
Fo45XY2meU85zmv2bzJbN26lQUL\nFlR+v1VmDQ0NrY6IwX5uO2XjlnQ88NyIeJ2kZfRo3J0GBwdjZGSk67qpPrA4fW33z0tn8oFFyizn\nOc95zeZNZtWqVSxbtqzy+60yS1LfjbufXSVHA8+XtAEYBp4h6XPT3iozM6vElIcDRsRbgbcCdLzi\n/ruat8vMbFpmcvhhroce+gQcM7PMTOsEnIhYBayqZUvMzKwvfsVtZpYZN24zs8y4cZuZZcaN28ws\nM27cZmaZceM2M8uMG7eZWWbcuM3MMuPGbWaWGTduM7PMuHGbmWXGXxZsZjZNTX8Rsl9xm5llxo3b\nzCwzbtxmZplx4zYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8vM\nlI1b0sMlXSvpRkm3SDo1xYaZmVl3/cwO+FvgGRGxVdIuwFWSLomIq2veNjMz62LKxh0RAWwtr+5S\nXqLOjTIzs95U9OUpbiTtDKwGDgI+FhEnd7nNCmAFwMDAwNLh4eGu97V24709cwbmw+YHuq9bsmjh\nlNvZZJbznOe8/PJmU21DQ0OrI2Kwn/y+GvfYjaXdga8CJ0XEzb1uNzg4GCMjI13XTTUB+elru78J\nmMkE5CmznOc85+WXN5tqk9R3457WUSURcQ9wBbB8Or9nZmbV6eeokn3KV9pImg88G7i17g0zM7Pu\n+jmq5DHAZ8r93DsB50fERfVulpmZ9dLPUSU3AUck2BYzM+uDz5w0M8uMG7eZWWbcuM3MMuPGbWaW\nGTduM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzLhxm5llxo3b\nzCwzbtxmZplx4zYzy4wbt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZmbJxS9pf\n0hWSvi/pFklvSLFhZmbW3bw+brMNeHNErJH0SGC1pG9GxPdr3jYzM+tiylfcEfHziFhT/vwrYB2w\nqO4NMzOz7qa1j1vSYuAI4Jo6NsbMzKamiOjvhtIC4ErgfRFxQZf1K4AVAAMDA0uHh4e73s/ajff2\nzBiYD5sf6L5uyaKFfW1nU1nOc57z8subTbUNDQ2tjojBfvL7atySdgEuAi6LiA9PdfvBwcEYGRnp\num7xKRf3/L03L9nG6Wu773bfcNpxU25nk1nOc57z8subTbVJ6rtx93NUiYBzgXX9NG0zM6tXP/u4\njwb+HniGpBvKy3Nr3i4zM+thysMBI+IqQAm2xczM+uAzJ83MMuPGbWaWGTduM7PMuHGbmWXGjdvM\nLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzLhxm5llxo3bzCwzbtxmZplx4zYzy4wb\nt5lZZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8uMG7eZWWbcuM3MMjNl45Z0\nnqQ7Jd2cYoPMzGxy/bzi/jSwvObtMDOzPk3ZuCPiO8CWBNtiZmZ9UERMfSNpMXBRRDx5ktusAFYA\nDAwMLB0eHu56u7Ub7+2ZMzAfNj/Qfd2SRQun3M4ms5znPOfllzebahsaGlodEYP95FfWuDsNDg7G\nyMhI13WLT7m45++9eck2Tl87r+u6Dacd1090Y1nOc57z8subTbVJ6rtx+6gSM7PMuHGbmWWmn8MB\nvwh8DzhE0h2SXl3/ZpmZWS/dd8R0iIiXptgQMzPrj3eVmJllxo3bzCwzbtxmZplx4zYzy4wbt5lZ\nZty4zcwy48ZtZpYZN24zs8y4cZuZZcaN28wsM27cZmaZceM2M8uMG7eZWWbcuM3MMuPGbWaWGTdu\nM7PMuHGbmWXGjdvMLDNu3GZmmXHjNjPLjBu3mVlm3LjNzDLjxm1mlhk3bjOzzPTVuCUtl7Re0o8k\nnVL3RpmZWW9TNm5JOwMfA
44FDgVeKunQujfMzMy66+cV91OBH0XETyLiQWAYOKHezTIzs14UEZPf\nQPprYHlEnFhe/3vgqIh4/YTbrQBWlFcPAdbPYHv2Bn45g9+biZRZznOe8+ZO3kyzDoyIffq54bwZ\n3HlXEXEOcM6O3IekkYgYrGiTZk2W85znvLmTlyKrn10lG4H9O67vVy4zM7MG9NO4rwOeIOlxknYF\nXgJcWO9mmZlZL1PuKomIbZJeD1wG7AycFxG31LQ9O7SrZRZnOc95zps7ebVnTfnhpJmZzS4+c9LM\nLDNu3GZmmXHjNjPLTGXHcc9mkp5IcbbnonLRRuDCiFjX3FZVp+31pdbEeEoa6MyLiM01ZonijOjO\n+q6Nmj7wSp1XZqYcz2RZY5lNfTgp6RjgBYx/MP8jIi6tOOdk4KUUp+rfUS7ej+KwxuGIOK3KvDIz\nSW1lVqvrS52XejwlHQ58HFjI9vMj9gPuAV4XEWsqznsOcBbwwwl5B5V538g8L9l4pn7sxmU30bgl\nnQEcDHyW8U+OlwE/jIg3VJj1A+CPI+J3E5bvCtwSEU+oKqu832S1lXltr6/t43kD8NqIuGbC8j8F\nPhERh1Wctw44NiI2TFj+OODrEfGkzPOSjWfqx26ciEh+AX7QY7konoxVZt1KMQfAxOUHAutzrm2O\n1Nf28exZA8XkbpXnAfO6LN+1LXmpxjP1Y9d5aWof928kHRkR101YfiTwm4qz3gh8W9IPgdvLZQdQ\nvFV7fc/fmrmUtUH762v7eF4i6WKKdxSjeftTvKOoY9fTecB1koYn5L0EOLcFeSnHM/VjN6apXSVP\nAc4GHsn2t7/7A/cC/zMiVlectxMP/XDkuoj4fZU5ZVbS2srM1tbX9vEs846l+4ehX68p71Dg+V3y\nvt+SvGTjmfqxG8ttonGPhUv7Mv7T2E015TTxqXaS2sqsVteXOq+J8WyCpD0BImJLG/ParLHDASUt\nBJ5Ox5ND0mURcU/FOT0/1ZZU+afaZWaS2sqsVteXOi/1eJa1vZXiVdsAEMCdwH8Ap9XwfDgA+CDw\nDIp3LZL0KOBy4JSY8CFihnnJxjP1Y9epkRNwJL0MWAMsAx5RXoaA1eW6Kp0JPCsijo2IE8vLcuDZ\n5bpKJa4NWl5f28cTOB+4GxiKiD0jYi+K+u4p11XtS8BXgcdExBMi4iDgMcDXKA6BzD0v5Ximfuy2\nq/OTz0k+cV0P7N5l+R70OIpgB7JSf6qdrLY5Ul/rx3Mm63akvpmsyygv2Ximfuw6L03tKhHF24qJ\n/lCuq1LqT7VT1gbtr6/t43mbpLcAn4nyjLvyTLxXdORXabWks4DPML6+lwPXtyAv5XimfuzGNHVU\nycuBdwHfYPwhV88G3hMRn64470l0/+S38k+1U9dWZra2vjkwnnsApzB+P+lmii8rWRkVf5BXnkj0\nasbXdwfwn8C5EfHbzPOSjWfqx25cdhONG8aKPobxT47LIuLuRjaoQm2uDdLX1/bxNJu2OvfDzLYL\n8O7Jrud+aXt9bR9P4CmTXa8h7/jJrrcgL9l4pn7sGp/WVdI5k12v2MSTNSo/eaNT4tqg5fW1fTyB\n/zHF9aodOcX13PNSjmfSx67xry6TtDQ6zn6beD1nba4N0tfX9vE061fjjbtukuZRfDjyl8Bjy8Ub\nKQ6SPzcmzAKXm7bXl1oT41meyLGch+7Dr+uEpqTzjTeQl2w8Uz92o5o6AWehpNMk3Sppi6S7JK0r\nl+1ecdz/BQ4H3g08t7ycChwGfK7irNS1QcvrmwPjmfqEppMpTnwRcG15EfBFSae0IC/ZeDZwcth2\nde5An2RH/mXAycC+Hcv2LZd9o+KsnidpTLYuh9rmSH1tH8/UJxj9ANily/JdqWla3sR5KU/uS/rY\ndV6a+nBycUSsjI6JgiJiU0SspJj3uEpbJP2NihnfgGL2N0kvpjhdtWopa4P219f28Ux9gtE
f2L4L\nqNNjynW556Ucz9SP3ZimzpxMecbRS4CVwFmS7qYY0N0pJrl5ScVZkP5sqrbX1/bxfB+wRlLXE4xq\nyEs933jqvJTjmfqxG9PUmZOdZxw9ulxc+xlHkvYCiIi76rj/MqOR2srs1tXX9vEsc1Kf0JR6vvHU\necnGs6mTw1p/VAmApKcCERHXqZjUfTmwLiIuaXjTKtH2+lJrejwl7Vnzf0g7AUTEH8pT0p8MbKjx\nBVPSvC75tY5nE1lNHVUiSS8q9yVK0jMl/auk13XuW6wo65+AfwXOlvQB4KPAbsBbJb29yqwyL1lt\nZV7b62v7eB6t4iiZWyQdJembFJNc3S7pz2rIewHwc4o5zU8A/gv4EHCTpOe1IC/ZeKZ+7MZlN7Sr\n5CyKt727AvcBD6N463scsDmq/Zb3tRSHdz0M2ATsFxH3SZoPXBMRf1JVVpmXrLYyr+31tX08r6U4\nbnwBxcRLL4iIq1R8ZdtHIuLoivOuB44F5gM3AkdGxHpJBwJfiYjBzPOSjWfqx65TUx9OPi0ilkja\nheLJ8ZiIeFDSFymOi6zStnJf2q8l/Tgi7gOIiAck1fGpdsraoP31tX08d4mItQCSfhERV5V5a8r/\nLCo3eoSOpJ9FxPpy2W11vINpIC/leCZ/7EY1dTjgNoAozkK7LiIeLK9vo/pDhB6U9Ijy56WjC1Wc\n8VTHEzFlbdD++to+np3PwbdOWLdrDXl0NMxXdSzbuSV5Kccz+WPXLTilTZIWAETxtVAAqPhC2Acr\nzvqLiPh1mdX5xNuFYjL3qqWsDdpfX9vH852j/1FExNdGF0p6PPDZGvJWUDaViLi2Y/n+wGktyEs5\nnqkfuzGz6qgSSbsBu0XEnU1vS9XaXBukr6/t42k2mcande0UEffX+USUdNFk1+tUd23Q/vpS56Ue\nT0krJrteQ967J7vegrxk45n6sWu8cUtaM9n1ir1miuuVSlwbtLy+to8nDz1NutbTpkk/33jqvJTj\nmfSxm1W7SlKRtHdE/LLp7aiapD0BUp1sYGbNaPwVd90kHSvpp5KuknSEpFuAayTdIemZNWfvIelR\nNWccIGlY0i+Aa4BrJd1ZLltcZ3YbSXqipEskXSzp8ZI+LekeSdeq+BLhOjKPkXS2pAvLy9mSlk/9\nm5Vvx7tqut9jJL164t+jpFd1/40dypISnrDVJf/yujOguRNwtgAXAF8ELo8aN0LSDcBLKSYKugg4\nLiKuLp+En4+Ip1Sc91iKT8tPoDgwf2O56jzgfVHxRPySvgecAXx5dO6H8lCrvwHeGBF/WmXeFNuy\nNiKWVHyf+1OcabcIuAT40OgYSvpaRLyg4rzvlHkLKB7Hk4EvAcdTjGel/9lLOgM4mOIohDvKxfsB\nL6OY9rTSE4ym2JafRcQBFd/n+4E/pzjm/nnAGRHxkXLdmhqefylP7rtp4iKKx3L0WPVKT9YaF9RQ\n414PfISioS4Gvgx8MSKuriFr7I9D0u0RsX/Huhsi4vCK8y4H/jkiVkl6IfA04B0Ux3k+OiIq/dBC\n0g8j4gnTXbcDeS/stQr4eETsU3HeN4GvAFdTnKW2FHheRNwl6fqIOKLivLH7lPSjiDioY10djeYH\nEXFwl+WimNO56sfvvl6rgPkRUelJeSrORD0iIrap+OKLLwDrI+JNNT1+a3ucsDUPWFNlM5V0IcV/\nDu8FHqAYw/+i+I+KiLitqqyJmjpz8v6I+CjwUUkHUEyXeVb5wA5HxNsqzLpH0muBRwF3S3oTcD7w\nLGBrhTmj9oqIVQARcYGkt0fE/cA7JN1aQ97q8lXGZ9g+teT+FMccX19D3peAz9N9HuKH15C3T0R8\nvPz5JEl/B3xH0vN7bMOO2rnj5w9PWFfHSRW/kXRkRFw3YfmRwG9qyLuH4rTzzRNXSKpjmtx55clS\nRMQ9KuYnOUfSv1PPeI6dsCVp3AlbqvjM14h4vqS/BM4
B/iUiLpT0uzob9qimGvfYJ64R8TPgg8AH\nVXw33Ysrzno5xSvePwDPoXiVfxlwG/UcJfCLsrlcAbwQ2ABjr6Dq2Mf2MopXoqeyfWrJOyjmTji3\nhrybKP5Ib564QtKzasjbRdLDI+I3ABHxOUmbKB7D3WrI+5ikBRGxNSLOGl0o6SDgWzXkvYJiQqtH\nsn1Xyf7AveW6qn2W4gsoHtK4KV4NV+3Hkp4eEVcClLvzXi3pvcBf1ZC3qePxq/2ErYj4qor5uN8j\n6dXUfMbkqKZ2lXw4Iv4xeXAC5TuIfwEOBW4A/ndE/FzF3M7LIuIrjW7gDpL0NOC28j/ciesGI2Kk\n4rw3UbzFvXLC8iOAD0bEs6vMa0rZWMbmdI6Ob/zJmco5OyLigS7rFkXExof+Vi3bUfsJW5IOA/6s\n4x1ibebk4YCjJL0rIv656e3YUZKOofhA61udb9MkvSoizmtuy/LUMZ7fjogNHctrGU+l/5Z352WY\n1WnWHQ5Y1yFJPZyYMKuW2spP7d8OLAEul3RSx+o6vhoq6eFdqfNUzME9Op7frns8lf5b3p2XYdZD\nRI3fRDyTC/Cziu/vvh6XX1FM4ZltbeV9rqX4AAiKQx6/Dvyf8vr1NeS9H/gOxSGIPwZO6li3poa8\nDyTOSz2eqb/l3XkZZk28NPLh5FSHJFUcl/RT9MS1QfpP7Z/H9sO73g18QdIfRcSbqOc03+MT56Ue\nz9TfFO68PLPGaeqokpTNNPWn6KkPt0r9qX3qxpY6L/V4pv6mcOflmTVOU0eVvBe4MMbPzzu6bmVE\nnJx8oyqSurbUn9qrmCHvQ/HQozzeC7wtIqr+ztDUecmPglD6b3l3XoZZ43KbaNyWrwb+o5gVh5OZ\nzSZu3GZmmZl1hwOamdnkmvpwMjlJ+1CcVPF74CcRUcc8JY1pe32pNTGeSjyfuvPyzIKGX3FL2kfF\nHNl/ovILYWvIOFTSt4DvUcxX/W/AWhXzLC+sI7PMrb22MqfV9aXOSz2eSjyfuvPyzHqIOg8Sn+TA\n9UMpJuz5EcXEL9cAPwU+DSysOOtq4JDy56cCnyl/fg3FHNbZ1jZH6mv7eH6PYmK1nTuW7UwxY+bV\nzpu9ealrG5dd551PUnCyJwdw44Trazp+XpdzbXOkvraP5w9nss55zeelrq3z0tSukvkRMfotEddS\nzAtBRPwb8McVZ/1Y0jslHS3pdIoZ+1Ax0Xod9aesDdpfX9vHc7WksyQdJemx5eUoFXOs1zGfuvPy\nzBqnqRNwLqAo7HKKOav3iIhXlU+OmyPikAqzdgfeRvGW+0bgtIj4Vbm/8klR8bfupKytzGt7fW0f\nz10p5lM/gfEncVwInBsRv3Xe7MxLXdu47IYad9InR0ptrg0aaWytHk+zmWj9CTgqvjj3RIpDuy6J\niO92rHtHRLy3sY2rQNvrSy31eEp6BMV0sUHxPawvppgT5VaK7y6t9DBE51WXl7q2To3s45a0s6TX\nSnqPpP82Yd07Ko77BPB04C7gI5I6v0ew1xffzlji2qDl9bV9PCmOjhkAHgdcTPFdkx+imF3ubOfN\n6ryUWePV+cnnJJ+4fpJiZr43AquBD3esq3SOZeCmjp/nUXyx5wXAw6hnfuVktc2R+to+njeU/4ri\nW8nVcf0m583evNS1dV6aOqrkqRHxtxFxBnAUsEDSBZIeRvXz2I5N/RkR2yJiBcWRApcDdZzIkbI2\naH99bR/P0awAvl7+O3q9tv2Yzssza1RTjTvlk2NE0vLOBVF8z+SngMUVZ0H6J37b62v7eI6oPBM0\nIsa+ik3S4ym+pcl5szcvdW1jmjqq5HPA5yLi0gnLTwTOjohdkm9URdpcG6Svr+3jORlJioRPUOfl\nk9X6o0q6kXRO+cqtldpeX2qpx9N5+ealypo107pKOidh3GDCrNS1Qcvra/t4Oi/rvCRZs6Zxk3Zw\n70yYBen/UNteX9v
H03n55iXJmjW7SiRdGhHLp75lftpcG6Svr+3jaTaVWfOKu64noqSFkk6TdKuk\nLZLukrSuXLZ7HZkT1dlk2l5f6rzU4+m8fPOafO41deZkyoLPB+4GlkXEnhGxFzBULju/4qwmHsxW\n19f28XRe1nmpa9tuR87emekFuAw4Gdi3Y9m+5bJvVJy1fibrcqhtjtTX9vF0XqZ5qWvrvDS1q2Rx\nRKyMiE2jCyJiU0SsBA6sOOs2SW+RNDC6QNKApJOB2yvOgrS1Qfvra/t4Oi/fvNS1jWmqcacs+MXA\nXsCV5VvtLcAqYE/gRRVnQfoHs+31tX08nZdvXuraxjR15uQewCkUE5A/uly8mWIC8pWR6JuS69Dm\n2iB9fW0fT7OZmDWHAzZB0isj4lNNb0dd2l5faqnH03n55tWdNesad+LB/VlEHJAiq8xL/Yfa9vra\nPp7OyzSv7qzZ2LgrLVjSTb1WAQdHxMOqyupjWyp/MNteX+q81OPpvHzzmnzuzavrjiczRcEDPdbN\n1ABwDMWxlROzvvvQm++YxLVBy+tr+3g6L+u81LWNaaRxk7bgi4AFEXHDxBWSVlWcBekfzLbX1/bx\ndF6+ealr237/DR1Vci7wqYi4qsu6L0TE3ybfqIq0uTZIX1/bx9NsJmbdPu4UJK2IiNRTgybT9vpS\nSz2ezss3L1XWrJlkSlLKif//e8Ks1LVBy+tr+3g6L+u8JFmzpnGTdnDr+JLZyaT+Q217fW0fT+fl\nm5ckazY17pSD+7yEWZD+D7Xt9bV9PJ2Xb16SrNnUuGspWNJRkh5V/jxf0qnA2ZJWSlpYR2YXtT2Y\nba8vdV7q8XRevnlNPveamo87ZcHnAb8ufz4TWAisLJdVftZdAw9mq+tr+3g6L+u81LVtV+ecsb0u\nwC3AvPLnc4AzgD8H/gm4oOKsdR0/r5mw7oaca5sj9bV9PJ2XaV7q2jovTe0q2SkitpU/D0bEGyPi\nqog4FfijirNulvTK8ucbJQ0CSDoY+F3FWZC2Nmh/fW0fT+flm5e6tu3q/F9hkv+p/h14Zfnzpyie\nkAAHA9dVnLUQ+DTwY+CackB/AlwJHJZzbXOkvraPp/MyzUtdW+elqTMnF1LsE3oa8EvgKRST4t8O\n/ENE3FhD5qOAx1Gc5n9HRGyuOqPMSV5bmdvK+to+ns7LPy91bdDwmZNNFDwhf0FEbK3pvhutrdyG\n1tTX9vF0Xrvy6s6adae8Jx7c1NOQpv5DbXt9bR9P52WaV3dWU7MDTub7QJXzcf9jr1XAgqpy+lRp\nbdD++lLnpR5P5+Wb1+Rzr6n5uFMW/H7gQ8C2LusqP6qmgQez1fW1fTydl3Ve6trGNPWKO2XBa4Cv\nRcTqiSsknVhxFqR/MNteX9uGAUfSAAACoUlEQVTH03n55qWubbs6D1mZ5DCa7wJLe6y7veKsQ4B9\neqwbyLm2OVJf28fTeZnmpa6t89LU4YCHAFsi4hdd1g1EA0cMVKXNtUH6+to+nmYz0ciZkxGxvtsT\nsVxX9RN/oaTTJN0qaYukuyStK5ftXmUWpK0N2l9f28fTefnmpa6tU1OTTKUs+HyK7ytcFhF7RsRe\nwFC57PyKs5p4MFtdX9vH03lZ56Wubbs698NMsm/oMuBkYN+OZfuWy75Rcdb6mazLobY5Ul/bx9N5\nmealrq3z0tQkU4sjYmVEbBpdEBGbImIlcGDFWbdJeoukgdEFkgYknUxx2nTVUtYG7a+v7ePpvHzz\nUtc2pqnGnbLgFwN7AVeWb7W3AKuAPYEXVZwF6R/MttfX9vF0Xr55qWsb09RRJXsApwAnAI8uF28G\nLgRWRsSW5BtVkTbXBunra/t4ms3ErJurpA6SnggsAq6OiPs7li+PiEub27JqtL2+1FKPp/PyzWvs\nuVfnDvQpduw/EXgmsNuE5csrzvkHYD3wNWADcELHujVVZqWubS7U1/bxdF6+eU089
8buv847nw0F\nA2uBBeXPi4ER4A3l9etzrm2O1Nf28XRepnmpa+u8NDVXyWsoTmPeKmkx8GVJiyPiTIrJg6q0U5RT\nf0bEBknLyrwDa8iCtLVB++tr+3g6L9+81LVtD67zzifL7SwYWAYcK+nDVF/wZkmHj14pc48H9gaW\nVJwFaWuD9tfX9vF0Xr55qWvbrs6X85O8xbgcOHzCsnnAZ4HfV5y1Hx0nb0xYd3TOtc2R+to+ns7L\nNC91bZ2Xpg4H3A/YFh0nVXSsOzoi/l/yjapIm2uD9PW1fTzNZmJOHA5oZtYmTe3jNjOzGXLjNjPL\njBu3mVlm3LjNzDLz/wEJ5lbocofr5wAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "temp_series.plot(kind=\"bar\")\n", "\n", "plt.grid(True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Resampling\n", "Pandas lets us resample a time series very simply. Just call the `resample()` method and specify a new frequency:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DatetimeIndexResampler [freq=<2 * Hours>, axis=0, closed=left, label=left, convention=start, base=0]" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_freq_2H = temp_series.resample(\"2H\")\n", "temp_series_freq_2H" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The resampling operation is actually a deferred operation, which is why we did not get a `Series` object, but a `DatetimeIndexResampler` object instead. 
To actually perform the resampling operation, we can simply call the `mean()` method: Pandas will compute the mean of every pair of consecutive hours:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "temp_series_freq_2H = temp_series_freq_2H.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot the result:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW4AAAFbCAYAAAD1FWSRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAHHNJREFUeJzt3XmUZnV95/H3RxozCtqIVNDI0sQF\nNaMstkvGTEQZFVwzJio6o6hoZ2binoySjGMgOg4dz3jweCKmVQRHDTHGKEEFNUQdIovN4sLmFhA4\nARqQUTNxQT/zx70F1dVV9dxq+un7+z58XufUoZ57H4oPt3/16fvc5Xdlm4iIqONuYweIiIjVSXFH\nRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiIYlLcERHFpLgjIopZM40futdee3ndunXT+NERETPp\nwgsvvMn23JD3TqW4161bx+bNm6fxoyMiZpKkq4e+N4dKIiKKSXFHRBST4o6IKCbFHRFRTIo7IqKY\nFHdERDEp7oiIYlLcERHFTOUGnKhj3bGfmurPv+qEp0/150fcFWWPOyKimEHFLWkPSR+TdIWkyyX9\n+rSDRUTE0oYeKnkncKbt35F0d+CeU8wUERErmFjcktYCvwm8BMD2T4GfTjdWREQsZ8ihkgOALcAH\nJF0s6X2Sdlv8JkkbJG2WtHnLli07PGhERHSGFPca4FDgJNuHAP8MHLv4TbY32V5ve/3c3KApZSMi\nYjsMKe5rgWttn9+//hhdkUdExAgmFrft64FrJB3YLzocuGyqqSIiYllDryp5FfDh/oqS7wIvnV6k\niOFyA1HcFQ0qbtuXAOunnCUiIgbInZMREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQx\nKe6IiGLy6LI7KXfuRcTOlj3uiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQx\nKe6IiGJS3BERxaS4IyKKSXFHRBST4o6IKCbFHRFRTIo7IqKYFHdERDGDHqQg6Srgh8DPgdtsr59m\nqIiIWN5qnoDzRNs3TS1JREQMkkMlERHFDC1uA5+VdKGkDUu9QdIGSZslbd6yZcuOSxgREVsZWty/\nYftQ4Ejg9yT95uI32N5ke73t9XNzczs0ZERE3GHQMW7b1/X/vFHS3wCPAb40zWARdwXrjv3UVH/+\nVSc8fao/P8YxcY9b0m6S7jX/PfAU4BvTDhYREUsbsse9N/A3kubf/xHbZ041VURELGticdv+LnDQ\nTsgSERED5HLAiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQxKe6IiGJS3BER\nxaS4IyKKSXFHRBST4o6IKCbFHRFRzGqe8j4103wKSJ4AEhGzJnvcERHFpLgjIopJcUdEFJPijogo\nJsUdEVFMijsiopgUd0REMSnuiIhiUt
wREcUMLm5Ju0i6WNIZ0wwUERErW80e92uAy6cVJCIihhlU\n3JL2AZ4OvG+6cSIiYpKhe9wnAm8AfjHFLBERMcDE2QElPQO40faFkg5b4X0bgA0A++233w4LGBHt\nmubMnpDZPZczZI/78cCzJF0FnAY8SdKHFr/J9ibb622vn5ub28ExIyJi3sTitv2HtvexvQ44Cjjb\n9n+cerKIiFhSruOOiChmVU/Asf0F4AtTSRIREYNkjzsiopgUd0REMSnuiIhiUtwREcWkuCMiiklx\nR0QUk+KOiCgmxR0RUUyKOyKimBR3REQxKe6IiGJS3BERxaxqkqmIiFkyzQdBTPMhENnjjogoJsUd\nEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyKOyKimBR3REQxE4tb\n0r+SdIGkr0q6VNLxOyNYREQsbcjsgD8BnmT7R5J2Bc6R9Bnb5005W0RELGFicds28KP+5a79l6cZ\nKiIiljfoGLekXSRdAtwIfM72+Uu8Z4OkzZI2b9myZUfnjIiI3qDitv1z2wcD+wCPkfSvl3jPJtvr\nba+fm5vb0TkjIqK3qqtKbN8K/D1wxHTiRETEJEOuKpmTtEf//T2AJwNXTDtYREQsbchVJfcHTpW0\nC13Rf9T2GdONFRERyxlyVcnXgEN2QpaIiBggd05GRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiI\nYlLcERHFpLgjIopJcUdEFJPijogoJsUdEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiiklxR0QU\nk+KOiCgmxR0RUUyKOyKimBR3REQxKe6IiGJS3BERxaS4IyKKSXFHRBQzsbgl7Svp7yVdJulSSa/Z\nGcEiImJpawa85zbg921fJOlewIWSPmf7silni4iIJUzc47b9T7Yv6r//IXA58IBpB4uIiKWt6hi3\npHXAIcD5S6zbIGmzpM1btmzZMekiImIbg4tb0u7AXwOvtf2Dxettb7K93vb6ubm5HZkxIiIWGFTc\nknalK+0P2/74dCNFRMRKhlxVIuD9wOW23zH9SBERsZIhe9yPB14EPEnSJf3X06acKyIiljHxckDb\n5wDaCVkiImKA3DkZEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiiklxR0QUk+KOiCgmxR0RUUyK\nOyKimBR3REQxKe6IiGJS3BERxaS4IyKKSXFHRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiIYlLc\nERHFpLgjIopJcUdEFJPijogoZmJxSzpZ0o2SvrEzAkVExMqG7HGfAhwx5RwRETHQxOK2/SXglp2Q\nJSIiBsgx7oiIYnZYcUvaIGmzpM1btmzZUT82IiIW2WHFbXuT7fW218/Nze2oHxsREYvkUElERDFD\nLgf8C+Bc4EBJ10o6ZvqxIiJiOWsmvcH2C3ZGkIiIGCaHSiIiiklxR0QUk+KOiCgmxR0RUUyKOyKi\nmBR3REQxKe6IiGJS3BERxaS4IyKKSXFHRBST4o6IKCbFHRFRTIo7IqKYFHdERDEp7oiIYlLcERHF\npLgjIopJcUdEFJPijogoJsUdEVFMijsiopgUd0REMSnuiIhiUtwREcWkuCMiihlU3JKOkHSlpG9L\nOnbaoSIiYnkTi1vSLsCfAUcCDwdeIOnh0w4WERFLG7LH/Rjg27a/a/unwGnAs6cbKyIiliPbK79B\n+h3gCNsv71+/CHis7Vcuet8GYEP/8kDgyh0fF4C9gJum9LN3huQfV/KPq3L+aWff3/bckDeu2VH/\nRdubgE076uctR9Jm2+un/d+ZluQfV/KPq3L+lrIPOVRyHbDvgtf79MsiImIEQ4r7K8CDJR0g6e7A\nUcDp040VERHLmXioxPZtkl4JnAXsApxs+9KpJ1ve1A/HTFnyjyv5x1U5fzPZJ56cjIiItuTOyYiI\nYl
LcERHFpLgjIorZYddxx7YkPZTuLtMH9IuuA063ffl4qYarnn8WSNqbBdvf9g1j5lkNSaK783rh\n+LnARU6stbztmz45KempwG+x9R/8J22fOV6qYSS9EXgB3RQB1/aL96G7nPI02yeMlW2I6vmh/Pg5\nGHgPsJY77pvYB7gV+C+2Lxor2xCSngK8G/gWW+d/EF3+z46VbZIK277Z4pZ0IvAQ4INsXRwvBr5l\n+zVjZRtC0jeBX7P9s0XL7w5cavvB4yQbZgbyVx8/lwC/a/v8RcsfB/y57YPGSTaMpMuBI21ftWj5\nAcCnbT9slGADVNj2LR8qeZrthyxeKOkvgW8CTf/iAb8AfgW4etHy+/frWlc9f/Xxs9vi4gCwfZ6k\n3cYItEpruOMvzIWuA3bdyVlWq/lt33Jx/1jSo21/ZdHyRwM/HiPQKr0W+DtJ3wKu6ZftR/dR8ZXL\n/lvtqJ6/+vj5jKRP0X1imN/++9J9Ymj+UA9wMvAVSaexdf6jgPePlmqY5rd9y4dKDgVOAu7FHX9z\n7wv8X+D3bF84VrahJN2NbU/OfMX2z8dLNVzl/DMyfo5k6ZPDnx4v1XD9vP3PYtv8l42XapjWt32z\nxT1P0v3Y+szu9WPmWY0ZOKteOj/UHj+zQtKeALZvGTvLrGj5UAmS1gJPYMEvnqSzbN86YqxBVjqr\nLqnps+pQPz+UHz9rgT+k2+vbGzBwI/BJ4ITW/x8k7Qf8KfAkuk85knRv4Gzg2MUnLVtSYds3u8ct\n6cXAHwOfZevieDJwvO0PjpVtiMpn1WEm8lcfP2fRldyp858S+k8PLwGeZPspI8abSNK5wInAx+YP\nrfWPQXwu8Frbjxsz30oqbPuWi/tKuift3Lpo+X2A85e6YqAl/Um9h9m+bdHyuwOX2X7QOMmGmYH8\n1cfPlbYPXO26Vkj61nKXjK60rgUVtn3Lh0pE9xFlsV/061pX+aw61M9fffxcLekNdHt9N8Dtd/K9\nhDv+PFp2oaR3A6ey9fg5Grh4tFTDNL/tW97jPhp4M91H3YWXoz0ZeIvtU0aKNpikh7H0menmz6pD\n7fzVx0//yeBYtj7OegPdQ0w2tn6ir/9kdgxbj59rgb8F3m/7J2Nlm6TCtm+2uOH2DfhUti6Os2x/\nf7xUUUXGT8yqpot7Vkg6zvZxy71uXfX81Uk6dOH8GItft07SM2yfsdzrlrW67UtM6ypp00qvC1h8\ns0fzN38sUjr/DIyf/zzhdesePeF1y5rc9iX2uCU9auGdbotfR6wk4ydmTYnirkjSGrqTM/+ebrIm\n6KcVpTs587Pl/t0WVM8/C/obQY5g22P0o98AMkTl+dxb3/bNHiqRtFbSCZKukHSLpJslXd4v22Ps\nfAP8b+Bg4Djgaf3X8cBBwIfGizVY6fzVx09/A9FFwGHAPfuvJ9JdZvfiEaMNom4+99PoLr28oP8S\n8BeSjh0z2yQVtn2ze9wr3L10NHB4C3cvrUTSN5e7yWOlda2YgfzVx0/1G4jKzudeYds3u8cNrLO9\nceGkQLavt70R2H/EXEPdIum56mbYA7rZ9iQ9H6hwOVr1/NXHT/UbiObnc1+swnzuzW/7lu+cbP7u\npQmOAjYC75b0fbo/8D3o9gKPGjPYQNXzVx8//wO4SNKSNxCNlmq4yvO5N7/tWz5UsvjuJYDraeju\npaEk3RfA9s1jZ9keFfPPwvipfgORas/n3vS2b7a4Z8EyZ9U/afuK8VINVz3/LFDDTxqfRKo9n3vL\n277p4lbtp3SXfkp69fxQfvwsfNL4tXSHqpp60vhKNDtPeW9y2zdb3Kr/lO6yZ9VhJvJXHz/NP2l8\nJSo8n3uFbd/yycnqT+mu/pT06vmrj5/mnzQ+QZ7yPkUtF3f1p3RXPqsO9fNXHz/NP2l8gsrzuTe/\n7Vs+VDILT+kue1YdauefkfHT9JPGJ1Ht+dyb3vbNFvc85SndcSdk
/MQsavnOSeD2u90u7L9K/tJJ\nOmOl162rnH9Gxs+GlV63TtJxK71uWavbvvniBpB00UqvC3jFhNetK51/BsbP4tusm7jtehUqz+fe\n5LZv/lDJLJG0l+2bxs6xWpL2BKhwt2HEXUGJPW4ASfeW9Kj+VtTmSTpS0j9KOkfSIZIuBc6XdK2k\nw8fON4mk/SSdJmkLcD5wgaQb+2Xrxk131yDpqZJOknR6/3WSpCPGzjVUn/+YxeNF0svGSbT9JJ09\ndoaFmt3jlvQh4LW2b+rvgHsv3fW3Dwb+wPZfjRpwgv4i/hfQTcx0BvD0/jrQhwEftn3oqAEnkHQu\ncCLwsfmrSCTtAjyX7s/lcWPmm0TSvsDb6U5MfgZ4+/zNRJI+Yfu3xsw3yQzcQPQ24Dfo5rV+JnCi\n7Xf16y5qefxL+triRXR/FlcC2H7kTg+1SMvF/XXbj+i//zLwQttXSdoL+LsW7l5aycLBKeka2/su\nWHeJ7YPHSzeZpG8td3fkSutaIelzwF8D59E9yedRwDNt3yzpYtuHjBpwAi0z53k//8c3C2z/rwOH\n2L5N3YMrPgJcaft1rW9/SacDPwDeCvwLXXH/H7q/iLC9+Ka0na7lQyV3k3Tv/vtfAN8D6I8Rt3zj\n0LxbJf2upP8KfF/S6yQ9QNLRwI/GDjfAhZLeLemxkn6l/3qspHcDF48dboA52++xfYntV9HNm/El\nSQ9k6bmWW/NjSUs9VLfKDURrbN8G0D+Q4JnAvSX9FXD3UZNNYPtZdH/pbwIO6m/b/5ntq1sobWh7\nj/t5wBuBPwMOpLtj73S6RwjdbPv3R4w3Uf9R/U10f+kcT3fY5Bi6W8j/wI0/d6+fk+QYtr4J4Vrg\nb+meOfmTsbIN0Z9TeJTtHy9Y9u/oJg/azfb9Rws3QPUbiPpLRt9u+4uLlr8V+CPbLe80AtDf3v4W\n4IF0Y2mfkSPdrtniBpD0ILpLzx7CHXMffML2WaMGi+ZJeh1w0RLFcQjwp7afPE6y1al6A5GkewDY\n/pcl1j3A9nXb/lttknQQ8Ou23zN2lnlNF/eskvRm238ydo5J+pPC+wCfX/gRUdLLbJ88XrK7BjX+\npPFJKudvPXvzH1eWIunNY2e4k14+doBJ+qsC/hvwCOBsSa9asLrCJFOlL0dTgSeNr6Ry/grZS+5x\nS/qe7f3GzrESST9YbhVwD9tNn2CtfFUAgKT/CTyegpejASWeNL6SyvkrZG+2PCYV387Msp1uBR7t\nJR53JKnCw2q3uipA0jOBTRWuCug9gzv+4jkO+IikX7X9Ohq5bXmC5p80PkHl/M1nb7a4qV98HwT2\nB5Z6Tt1HdnKW7fEdSU+YP7nX34RzTH9VwG+PG22Q6n/xNP+k8Qkq528+e7OHSvqCON32BUus22j7\njSPEusuoflXAjFyO1vSTxiepnL/17M0Wd8SdUf0vnoiVpLgjIopp/uNiRERsreWTkzNB0hzdTSw/\nB75ru8I8Jbernn8WqPh86JXzt5q9+T1uSXPq5rN+pKTdx84zlKSHS/o8cC7dfNbvBb4u6ZT+rqym\nVc8/r/D4KT0feuX8JbLbbvILeDjweeDbwE/pNuA/AqcAa8fONyD/ecCB/fePAU7tv38F3RzXo2ec\n8fzVx8+5wPOBXRYs2wU4Cjhv7HyznL9C9pb3uE+mmwXtQXTz4F5h+wDgH4D3j5psmHvYnp94/QK6\nW8ex/V7g18YMNlD1/NXHz162/9L9Qyygu5be9mnAfUfMNVTl/M1nb7m4qxfHdyT9d0mPl/S/gEsA\nJO1K29t9XvX81cdP9fnQK+dvPnuzlwNK+jjdRjobeA5wH9sv64vjG7YPHDXgBP38Hn9E95H9q8AJ\ntn/YHx9+mO3zRg04wQzkrz5+lpoP/Tq6OekrzIdeNn+F7C0Xd+niiHFl/MQsa7a4q1P3YN2X011K\n9xnbX16w7k223zpauAGq569O
0j3pps818C66k2W/DVwB/Ikbvyyzcv4K2Zs9VilpF3XPbHyLpH+z\naN2bxsq1Cn8OPAG4GXiXpHcsWPeccSKtSun8MzB+TgH2Bg4APkX3rMm3081Od9J4sQY7hbr5T6Hx\n7M3ucUt6H90E5hcALwK+aPv1/boK8yl/zfYj++/X0D2sdi+6Z0+e5/bns66ev/r4ucT2wZIE/BNw\nf9vuX391/s+mVZXzV8je7B438BjbL7R9IvBYYHdJH5f0SzQyJ+4Et08davs22xvorsw4G6hwI0j1\n/NXHDwDu9qw+3f9z/nWbe1tLqJy/5ewtF3f14tgs6YiFC9w9Z/IDwLpREq1O9fyzMH52B7B9+6PW\nJD0Q+OFoqYarnL/57C0fKvkQ8CHbZy5a/nLgJNu7jpMsKpjl8SNJbvUXd4DK+VvJ3mxxzyJJm/o9\nv5Kq56+u+vavnL+17C0fKtmGpE1jZ7iT1o8d4E4qnT/jZ3SV8zeVvVRx09jG2w43jh3gTqqeP+Nn\nXJXzN5W91KESSWfaPmLyOyO2lfETs6LUHnelXzpJayWdIOkKSbdIulnS5f2yPcbON0n1/EvJ+Nl5\nKuevkL3Z4q6w8Sb4KPB94DDbe9q+L/DEftlHR002TOn8GT+jq5y/+ezNHiqRdBbdNben2r6+X3Y/\n4GjgcNtPGTPfJJKuXG4GupXWtWIG8mf8jKhy/grZm93jBtbZ3jj/Swdg+3rbG4H9R8w11NWS3iBp\n7/kFkvaW9EbgmhFzDVU9f8bPuCrnbz57y8Xd/Mab4Pl0T8v4Yv9R/RbgC8CewPPGDDZQ9fwZP+Oq\nnL/57C0fKrkPcCzdZOa/3C++gW4y841u7KnL0ZaMn5hlzRb3LJP0UtsfGDvH9qqev7rq279y/lay\nlyzuVjbe9pL0Pdv7jZ1je81A/oyfEVXO30r2qsXdxMZbiaSvLbcKeIjtX9qZeVarev6VZPxMX+X8\nFbKvGTvAciZsvL2XWdeSvYGn0l37uZCAL2/79uaUzp/xM7rK+ZvP3mxxU2DjTXAGsLvtSxavkPSF\nnR9n1arnz/gZV+X8zWdv9lCJpPcDH7B9zhLrPmL7hSPEiiIyfmKWNXsdt+1jlvql69eV/KWT1Mx8\nvtujUv6Mn/ZUzt9a9maLeymtbbzt8J/GDnAnlc6f8TO6yvmbyl6quGls422HMg+pXUb1/Bk/46qc\nv6ns1Yq7qY23HZ45doA7qXr+jJ9xVc7fVPZqxd3UxluJpMdKunf//T0kHQ+cJGmjpLUjx5uoev5l\nZPzsJJXzV8jebHFX2HgTnAz8v/77dwJrgY39sgp37ZXOn/Ezusr5m8/e8nXcJwMH9d+/k26jbQQO\np9t4zxkp11B3s31b//1624f2358jaZvrQxtUPX/Gz7gq528+e7N73Gy78V5r+xzbxwO/Omawgb4h\n6aX991+VtB5A0kOAn40Xa7Dq+TN+xlU5f/PZWy7u5jfeBC8HniDpO8DDgXMlfRd4b7+uddXzZ/yM\nq3L+5rO3fOfkWrqPuP8WuAk4lG4C/GuAV9v+6ojxBuuPsx5Ad1jqWts3jBxpVarmz/hpQ+X8LWdv\ntrjntbzxtpek3W3/aOwc26tS/oyf9lTO30r25ot7Ka1svO1VYVrRlcxA/oyfEVXO30r2lq8qWcll\nwOgbbyWSXr/cKmD3nZlle1TPP0HGz5RVzl8he7PFXWHjTfA24O3AbUusa/mk8LzS+TN+Rlc5f/PZ\nmy1uCmy8CS4CPmH7wsUrJDVxZnqC6vkzfsZVOX/z2Zs9xi3py8Crltl419jed4RYg0k6ELjF9pYl\n1u3d+kmyGcif8TOiyvkrZG+5uJvfeNGujJ+YZc1+ZLR95VK/dP265n/pJK2VdIKkKyTdIulmSZf3\ny/YYO98k1fNn/Iyrcv4K2Zst7gobb4KP0j3v8DDbe9q+L/DEftlHR002TOn8GT+jq5y/+ewtHy
o5\nCzgbONX29f2y+wFHA4fbfsqY+SaRdKXtA1e7rhUzkD/jZ0SV81fI3uweN7DO9sb5XzoA29fb3gjs\nP2Kuoa6W9AZJe88vkLS3pDfS3Xbduur5M37GVTl/89lbLu7mN94EzwfuC3yx/6h+C/AFYE/geWMG\nG6h6/oyfcVXO33z2lg+V3Ac4Fng28Mv94huA04GNtm8ZK1u0L+MnZlmzxT0LJD0UeABwnu1/XrD8\nCNtnjpdsmOr5q6u+/Svnbz17y4dKkPRQSYdL2m3R8iPGyjSUpFcDnwReBVwq6dkLVr9tnFTDVc8P\nGT9jqpy/RHbbTX4BrwauBD4BXAU8e8G6i8bONyD/14Hd++/XAZuB1/SvLx47310gf8ZP8s9s9pbn\nKnkF8CjbP5K0DviYpHW230k3UVDr7uZ+6lDbV0k6jO7/YX+Sf2fI+BlX5fzNZ2/5UMlWGw84DDhS\n0jtoZONNcIOkg+df9P8vzwD2Ah4xWqrhqufP+BlX5fzNZ2/25KSks4HX275kwbI1dE/v/g+2dxkt\n3ACS9gFu84LriBese7ztfxgh1mAzkD/jZ0SV81fI3nJxN7/xol0ZPzHLmi3uiIhYWsvHuCMiYgkp\n7oiIYlLcERHFpLgjIor5/4Xj0noYGwK1AAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "temp_series_freq_2H.plot(kind=\"bar\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note how the values have automatically been aggregated into 2-hour periods. If we look at the 6-8pm period, for example, we had a value of `5.1` at 6:30pm, and `6.1` at 7:30pm. After resampling, we just have one value of `5.6`, which is the mean of `5.1` and `6.1`. 
We could have used any other aggregation function instead of the mean. For example, we can keep the minimum value of each period:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 16:00:00 4.4\n", "2016-10-29 18:00:00 5.1\n", "2016-10-29 20:00:00 6.1\n", "2016-10-29 22:00:00 5.7\n", "2016-10-30 00:00:00 4.7\n", "2016-10-30 02:00:00 3.9\n", "2016-10-30 04:00:00 3.5\n", "Freq: 2H, dtype: float64" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_freq_2H = temp_series.resample(\"2H\").min()\n", "temp_series_freq_2H" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or, equivalently, we could use the `apply()` method instead:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 16:00:00 4.4\n", "2016-10-29 18:00:00 5.1\n", "2016-10-29 20:00:00 6.1\n", "2016-10-29 22:00:00 5.7\n", "2016-10-30 00:00:00 4.7\n", "2016-10-30 02:00:00 3.9\n", "2016-10-30 04:00:00 3.5\n", "Freq: 2H, dtype: float64" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_freq_2H = temp_series.resample(\"2H\").apply(np.min)\n", "temp_series_freq_2H" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Upsampling and interpolation\n", "This was an example of downsampling. We can also upsample (i.e., 
increase the frequency), but this leaves gaps (`NaN` values) in our data:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 17:30:00 4.4\n", "2016-10-29 17:45:00 NaN\n", "2016-10-29 18:00:00 NaN\n", "2016-10-29 18:15:00 NaN\n", "2016-10-29 18:30:00 5.1\n", "2016-10-29 18:45:00 NaN\n", "2016-10-29 19:00:00 NaN\n", "2016-10-29 19:15:00 NaN\n", "2016-10-29 19:30:00 6.1\n", "2016-10-29 19:45:00 NaN\n", "Freq: 15T, dtype: float64" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_freq_15min = temp_series.resample(\"15Min\").mean()\n", "temp_series_freq_15min.head(n=10) # `head` displays the first n rows" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One solution is to fill the gaps by interpolating. We just call the `interpolate()` method. The default is to use linear interpolation, but we can also select another method, such as cubic interpolation:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "2016-10-29 17:30:00 4.400000\n", "2016-10-29 17:45:00 4.452911\n", "2016-10-29 18:00:00 4.605113\n", "2016-10-29 18:15:00 4.829758\n", "2016-10-29 18:30:00 5.100000\n", "2016-10-29 18:45:00 5.388992\n", "2016-10-29 19:00:00 5.669887\n", "2016-10-29 19:15:00 5.915839\n", "2016-10-29 19:30:00 6.100000\n", "2016-10-29 19:45:00 6.203621\n", "Freq: 15T, dtype: float64" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_freq_15min = temp_series.resample(\"15Min\").interpolate(method=\"cubic\")\n", "temp_series_freq_15min.head(n=10)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAX0AAAD7CAYAAACG50QgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xd4FFX3wPHv2fQGARJqgNA7hCI9\n0pQiCAgW7KiIFMtPBctr91XBF7uCiIAUsYGCiojYEEEEEghFaSJIh5BAKElIu78/ZsGIoWU3md3s\n+TzPPsnuzM45s9mcuXNn5o4YY1BKKeUbHHYnoJRSqvho0VdKKR+iRV8ppXyIFn2llPIhWvSVUsqH\naNFXSikfokVfKaV8iBZ9pZTyIVr0lVLKh/jbncCZoqKiTGxsrN1pKKWUV0lMTDxkjIk+33weV/Rj\nY2NJSEiwOw2llPIqIvLXhcyn3TtKKeVDtOgrpZQP0aKvlFI+xOP69JVSBcvOzmb37t1kZmbanYqy\nUXBwMDExMQQEBBTq/Vr0lfISu3fvJiIigtjYWETE7nSUDYwxpKSksHv3bmrUqFGoZWj3jlJeIjMz\nk3LlymnB92EiQrly5Vza29OWvrfJzoQ/F8Om+bDlG8jJhJBICCkLIWUg1PkzpCyEl4fq7aF8Q9BC\nUSJowVeufge06HuDjMOw9Vur0G/9DrJPQGAE1LkcwitARqo1T3oqHNlp/Z55BEye9f7wClCzC9Tq\nCjU7Q0QFO9dGKWUjLfqe7K/l8NNY2LEU8nIgvCI0uw7q94bYePAPOvt78/Lg6B7Y/hNs+wH++BbW\nfQRAbnQj9kW141DFeKo1v5yypcKKaYWUt/Pz86NJkybk5OTQoEEDpk+fTmho6AW/f8iQITzwwAM0\nbNjwguafNm0aCQkJvPXWW+ecr2fPnvz666907NiR+fPnFzhP586deemll2jVqtUF51sSadH3VOvn\nwLzhEFYe2t0N9ftAlZbguMDDMA4HRFYlq8kNbI6+kqTKKRzcmkCpPUtotD+RlgenEbNxMmk/hLLI\n/xL2VOiCX93LaVyzCg0rlSI4wK9o1095pZCQEJKSkgC48cYbmThxIg888MAFvTc3N5fJkycXSV6j\nR48mPT2dd955p0iWfy65ubn4+XnP/4sWfU9jDPzyJnz7BFTvAINmWX30F/RWw67UDNbsOszaXWkk\n7TrMhr1HycqxunmiwssQV/VWTla9D1MpkHL7f4HNX9H+4E+E7/2Jk3ue45fvG/E8rdgZ3Znq1WvS\nLCaSuGqR1CgXhsOh/cme4pkvf+P3vUfdusyGlUvx1JWNLnj++Ph41q1bB8D777/PG2+8QVZWFm3a\ntGHChAn4+fkRHh7OXXfdxXfffcf48eN5/PHHT7e2P/zwQ1544QWMMfTu3ZsXX3wRgPfee48xY8YQ\nGRlJs2bNCAo6xx6tU7du3Vi8ePF555s9ezYjRozgyJEjTJkyhfj4eDIzMxk+fDgJCQn4+/vzyiuv\n0KVLl3/tZfTp04dRo0bRuXPnf61Xx44dL/hzs5sWfU+Slwvf/AdWTIRGV0H/iRAQfNbZj6RnkbTr\nyOkCv3Z3GqknsgAIDnDQpEppbmlbnbhqkcRVjaRKZMg/DwI1qA5drrfi7lpBztrPab15AV1OTIHU\nKfyeEsviVU15LK8pWwIb0qhqFHFVrWU1qxpJVPj5/xlVyZSTk8PXX39Nz5492bhxIx9//DHLli0j\nICCAESNGMGvWLG655RZOnDhBmzZtePnll//x/r179/Lwww+TmJhImTJl6N69O/PmzaNNmzY89dRT\nJCYmUrp0abp06ULz5s0B+OKLL0hISODZZ591Ke+VK1eyYMECnnnmmdNFW0RYv349mzZtonv37mzZ\nsuWcyznbenkDLfqeIjsDPhsKG7+AtiOh+3P/6Mo5mZPLxn3HS
Np52Cr0u9PYfugEYJ2YU6d8ON3q\nlyeuWiTNYiKpVzGCAL8L7Qryg+rtCaveHsxYSN4EmxfQYOt3NNi9gBF5X3DSEcKa/U34ensjns1t\nwl+mIjFlQk5vBOKqRtKocmlCAr1nN9ebXUyL3J0yMjKIi4sDrJb+HXfcwaRJk0hMTOSSSy45PU/5\n8uUB6xjAwIED/7WcVatW0blzZ6KjrUEhb7zxRpYsWQLwj9evu+660wW4b9++9O3b16X8BwwYAEDL\nli3ZsWMHAEuXLuWee+4BoH79+lSvXv28Rf9s6+UNtOh7gvRU+PB62LUCeryAaTuCHSnpJO06TNLO\nIyTtTmPj3qNk5VrdNOUjgoirGsk1rWKIi4mkSUxpIoILd3Xev4hA+QZQvgES/yBkHoUdPxP0x/e0\n3fY9bbNXgj8cC67EH351WPlnVZZsiGFCXg3SHKWpXzHi9J5A86qR1IoOd71byBjITLMOTB/dC9np\nkJttHdzOzYbcrL9/9w+Cik2hUlMICHHPZ6JOy9+nf4oxhltvvZUxY8b8a/7g4GCP6u8+1VXk5+dH\nTk7OOef19/cnLy/v9PP858Z72npdDC36djv8F7kzB8CRnSyo9wKzN7Zi7aJvScvIBiA00I8mVUpz\nW4dYq0VdLZKKpYKL73zt4FLW2UL1e1vPU7bBth+I2LGU5vvW0vz4Eu4KtCYdDazAlvSarEiqyi+r\nolhAGFkBpalUoSKxVWOoH1uVZtWjKF/K2WWVc9La4GWk/vPn8YNwdDek7YG03Vaxzzp+cXmLH1Ro\nCJVbWAfAq7SA6Abgp195d+vWrRv9+vXj/vvvp3z58qSmpnLs2DGqV69+1ve0bt2ae++9l0OHDlGm\nTBk+/PBD7rnnHlq3bs19991HSkoKpUqVYvbs2TRr1qxI84+Pj2fWrFl07dqVLVu2sHPnTurVq8fR\no0eZMGECeXl57Nmzh5UrVxZpHsVF/wNsdPjATszETvjlnWRI1iMkrqtO3QqZ9Gpc8XRruU75cPwv\ntJumOJSrZT1a32k9zzgC+9fBvrWU2ptEq31raZn5KxJo/n7PQecjEY6aEA5ICBGkE8rZrypMIZKD\njigOSjkOSB0OBkRxUKznGRJCDn7Ohz+5Yv3MwY8Qk0m9vG00yNtKg4N/0GD/HCJWTwcgk0D+KBNP\nvetfJKB8naL7jHxMw4YNee655+jevTt5eXkEBAQwfvz4cxb9SpUqMXbsWLp06XL6QG6/fv0AePrp\np2nXrh2RkZGnu5Lg3H368fHxbNq0iePHjxMTE8OUKVPo0aPHBeU/YsQIhg8fTpMmTfD392fatGkE\nBQXRoUMHatSoQcOGDWnQoAEtWrS4yE/GM4kx5vxzFaNWrVoZX7mJypY3+lMtZSlzW82kRqPWNKlS\nmrCgErAdPnkcjh+wLhLLOAIZh8k+nkJy8n5SDx0g/dgRMhxhnPArRbpfKefP0qd/HvcrTY7DTQeJ\njSEqew/VMjZS+dh62hxdSLBkkxN3C0HdHoWIiu6JUww2btxIgwYN7E5DeYCCvgsikmiMOe9FCCWg\nwninjPWfUzf1R+aWG8L1V/ayOx33Cgq3HvkEAJWdj+LXHOgDwJe/JHHk6+e5PmkmeRs+xtFuBHS4\nD4JL25KZUsXNg/oNfEhmGnlfPsjvedWp0e8Ru7PxKVe2j6P24In051UW5baAn1+G15tZ10Zk65DF\nquRzqeiLSKSIzBGRTSKyUUTanTFdROQNEflDRNaJSMnoFHNR7rfPEJyVwgcVHiSu+nnvY6zcrF2t\ncrwxciBjQ0fTP2cMByMawqLH4c2W1pAXSpVgrrb0XwcWGmPqA82AjWdM7wXUcT6GAm+7GM/77VyB\nI3Eq03J60KP7FXZn47NqRofz2YgOBMY0p/XOkcxt8jYmIARm9IOV71qniSpVAhW66ItIaeBSYAqA\nMSbLGHPkjNn6ATOM5VcgU
kQqFTpbb5dzEvPlvRyUKBZE307H2lF2Z+TTyoYFMnNIa65qXoX7V5Xm\nsajXyK3VDRaMgi/utk4pVaqEcaWlXwNIBt4TkTUiMllEzhyusQqwK9/z3c7X/kFEhopIgogkJCcn\nu5CSh1v6GpK8iYdPDua2Lo11bHQPEOTvxyvXNuOBy+vywdo0bjh2Hxlt74c178O03nB0n90pKuVW\nrhR9f6AF8LYxpjlwAijUUUljzCRjTCtjTKtTl1+XOMmbMT+/xJLAS9lepgO9GvvuDo+nERHu7VaH\n1wfFsWbXUa7Y0JkDPSfBgd9hUmfYtcruFD2Gn58fcXFxNG7cmGuuuYb09PSLev+QIUP4/fffL3j+\nadOmcffdd593vp49exIZGUmfPn3+8frgwYOpUaMGcXFxxMXF/etq4ovx5JNP8t133xXqvUeOHGHC\nhAmFju1OrhT93cBuY8wK5/M5WBuB/PYAVfM9j3G+5lvy8uDL/yPHL4QHjl7PXZfWwk9HrPQ4/eKq\n8MGdbUjLyKbHojKs6znbGtZh2hWweqbd6XmEU8MwbNiwgcDAQCZOnHjB7z01tPKFjqV/MUaPHs3M\nmQX/jcaNG0dSUhJJSUn/uNjrYj377LNcdtllhXqvJxX9Qp+nb4zZLyK7RKSeMWYz0A04cxP+BXC3\niHwEtAHSjDG+t7+8ejrs/IX3yjyImPIMaPGvHi7lIVrFlmXuiPbcNm0VAz9L49UrP6DP5v9YffwH\nfoMeL1z4PQ2K0tePwP717l1mxSbQa+wFz+6NQysXZNq0acybN48TJ06wdetWRo0aRVZWFjNnziQo\nKIgFCxZQtmxZBg8eTJ8+fbj66quJjY3l1ltv5csvvyQ7O5vZs2dTv359nn76acLDwxk1ahQAjRs3\nZv78+TzyyCNs27aNuLg4Lr/8csaNG8e4ceP45JNPOHnyJFdddRXPPPMMJ06c4Nprr2X37t3k5uby\nxBNPcN111xVqvc7G1W/vPcAsEVkHxAEviMgwERnmnL4A+BP4A3gXGOFiPO9zbD98+xTHK7XnhX0t\nuKNjDb1BiYerXi6MucM70Kp6We6e9xevVhyDaTMMVrwN8//P2nPzcaeGVm7SpMk/hlZOSkrCz8+P\nWbNmAX8PQbx27dp/jDl/amjlH374gaSkJFatWsW8efPYt28fTz31FMuWLWPp0qX/6Ar64osvePLJ\nJy8618cee4ymTZty//33c/JkwQfnN2zYwGeffcaqVat47LHHCA0NZc2aNbRr144ZM2YU+J6oqChW\nr17N8OHDeemll86Zw9ixY6lVqxZJSUmMGzeORYsWsXXrVlauXElSUhKJiYksWbKEhQsXUrlyZdau\nXcuGDRvo2bPnRa/v+bh0Ra4xJgk487LfifmmG2CkKzG83jePQU4mLwUOJyI4gBvbVLM7I3UBSocG\nMP321jw+bz2v/7iD7U2v45UOYfgve9m69/CVb9jb4r+IFrk7edvQymPGjKFixYpkZWUxdOhQXnzx\nxQI3HF26dCEiIoKIiAhKly7NlVdeCUCTJk1O782cKf8wzZ999tlF5bVo0SIWLVp0+l4Bx48fZ+vW\nrcTHx/Pggw/y8MMP06dPH+Lj4y9quRdCh2EoSslbYMOnHGl5N9N/8WN4p+ruGwJZFblAfwcvDmxK\njahwXly4iT3VujGzrSH011eswt/3TeteBD7E24ZWrlTJOmEiKCiI22677awt8vxdSA6H4/Rzh8Nx\n1iGYCxqm+VzDMednjOHRRx/lrrvu+te01atXs2DBAh5//HG6detWqL2bc/GAzskSbNnr4B/MWxmX\nE+Dn4LYONezOSF0kEWF451pMuLEFG/Yepce6eFIueQCSZsHnI627jvm4bt26MWfOHA4ePAhAamoq\nf/311znf07p1a3766ScOHTpEbm4uH374IZ06daJNmzb89NNPpKSknO4rd8W+fdYhRGMM8+b
No3Hj\nxi4t73xiY2NZvXo1YBXv7du3AxAREcGxY8dOz9ejRw+mTp3K8ePWkOF79uzh4MGD7N27l9DQUG66\n6SZGjx59elnupC39onJkF6z7iBPNbmPGynSuvSSG6Ai9vaC3uqJJJSqVDubOGQl0XtWWr5r+H9XW\nvma1+Pu/7XMt/vw8eWjlG2+8keTkZIwxxMXFXdTZRoUxcOBAZsyYQaNGjWjTpg1169YFoFy5cnTo\n0IHGjRvTq1cvxo0bx8aNG2nXzhq5Jjw8nPfff58//viD0aNH43A4CAgI4O233T+IgQ6tXFQWPAQJ\nU3mryRxeWXmCxaO6UK1cqN1ZKRftSk3njumr+DP5BHMb/0KTLW9Ck2us+xkX8Q1adGhldYorQytr\n905ROJ4Mq6eT1eha3l6TSZ+mlbXglxBVy4YyZ3h72tUqx5Xr2rG46ghYPxvmDoXcc99+TylPoEW/\nKPw6AXJO8knwQE5k5TKsUy27M1JuVCo4gPcGX8KNbaoxeGtHPis3FDZ8Cp+P0NM5lcfTPn13yzgC\nqyaT26Afr67Oo3O9aBpWLmV3VsrN/P0cPNe/MTWiwnhwAaSXyeSmdTMgpAz0HGvdYL4IGGN0zCYf\n52qXvBZ9d1s1GU4e5esyN5ByIktb+SWYiDAkvibVyoZy30cO/AMPM2jFRAgtB50ecnu84OBgUlJS\nKFeunBZ+H2WMISUlheDg4EIvQ4u+O2Wlw68TyKt9OS8mBdC8WjhtapS1OytVxLo3qsgnd7VnyDQ/\nQsxR+v34vNXiP3XzeDeJiYlh9+7dlOiRaNV5BQcHExMTU+j3a9F3p9UzID2FZZVuZdeGDJ7o3VBb\nZD6iSUxp5t0Tz5D3AglNPc5lC0YjIWWgydVuixEQEECNGnqth3KNHsh1l5ws+OUNTPX2PL++NLXL\nh3NZgwp2Z6WKUaXSIXw8vCNzYp9lRV59cj8bSu7mRXanpdQ/aNF3l3Ufw9E9rKsxhE37jzGsUy0c\nOnyyzwkP8mfC4A782Px1NuZWJeejm8jY9ovdaSl1mhZ9d8jLhWWvQaVmPL+pEpVLB9O3WWW7s1I2\n8XMIj17Vht+6TGVvbhly37+alG2JdqelFKBF3z02fgEpf7Ct/lBW7jjMkPiaBPrrR+vrruvSkn39\nPuJ4XhBm5gB2b7vwO0YpVVS0MrnKGPj5ZShXh7Hb6xAZGsCg1lXP/z7lE9q3bM7xa2fjTw7m/QEc\n3LvT7pSUj9Oi76o/vof969nfdDjfbjrE4PaxhAbqSVHqb7UbteLQlTMol5dK2uR+pKam2J2S8mFa\n9F218h0Ir8DL+5sSEuDHre1i7c5IeaDaLbux87K3ic39i10T+nPs+LHzv0mpIqBF3xWp22Hrtxxt\neCNz1yZzfetqlAkLtDsr5aHqxw9kS7uxNMtZx29vDSLzZJbdKSkfpEXfFQlTQRy8m94JgCHxeuGM\nOrdGPYeyvvHDtM1cyvI3byM7R2/CooqXFv3Cys6ANTPJqtOLd9dm0L95FSpHhtidlfICTa7+Dxtq\n3EGX4/P57u37yc3zrHtaqJJNi35h/TYXMg7zeUBvMrPzGNappt0ZKS/S+JaX2VixH71SpvPVlGdc\nHjlRqQulRb+wVr5LXlQ9nv+9HJc3rEDt8hF2Z6S8iQgN7pzK1jLx9Nn9Gp/PesvujJSP0KJfGHsS\nYe9qfi13FUcychjeWYdPVoXg50/t4Z/wV3hTrtj6FF9+NtPujJQP0KJfGCsnYwLDeWJ7Y9rUKEuL\namXszkh5KQkMpfrIzzkYHEu3tQ/y9cIv7E5JlXAuFX0R2SEi60UkSUT+dTdzEeksImnO6Uki8qQr\n8TxCeips+JRtlXqz7ahDW/nKZY7QMlQY8RXHAsrRdvlwvv9psd0pqRLMHS39LsaYuHPchf1n5/Q4\nY8yzbohnrzUzIfckY5I70KBSKTrVjbY7I1UCBJSuROR
dX2H8Amn0w2CWrtIB2lTR0O6di5GXC6um\nkBp1Cd+nRjG8cy29SYpym6DomgQNnke4I4sq829g1YZNdqekSiBXi74BFolIoogMPcs87URkrYh8\nLSKNCppBRIaKSIKIJHj0reD++A6O/MXkk92oVjaUKxpXtDsjVcKEVWtG3qCPqSSphM2+jvXbdtmd\nkiphXC36HY0xLYBewEgRufSM6auB6saYZsCbwLyCFmKMmWSMaWWMaRUd7cHdJasmkxVSnknJDRl6\naU38/XRHSblfqXrxpPd/j7qyi5Mzr2XrHg9uCCmv41LVMsbscf48CMwFWp8x/agx5rjz9wVAgIhE\nuRLTNs5xdr4K6EFkeBhXtyz8jYmVOp+ycX040v0NWvE7eydfz87ko3anpEqIQhd9EQkTkYhTvwPd\ngQ1nzFNRnJ3eItLaGc87x5VNmIIRB2MPtuH2jrEEB/jZnZEq4aLa38SBjv+lk1nFbxNv5kBaut0p\nqRLAlZZ+BWCpiKwFVgJfGWMWisgwERnmnOdqYINznjeAQcYbrzfPzoA177MmrCPpQeW5qW11uzNS\nPqLCZfeyr8UD9MpdzPK3hnD4+Em7U1JertB3+zDG/Ak0K+D1ifl+fwvw/uvLN3wGGYcZlxXPjfHV\nKRUcYHdGyodUuvJJ9mSm0f/3KXw8/h6uuG88EfodVIWkRyIvxKp3ORAUS6KjEbd3iLU7G+VrRKhy\nzcvsrnkt12V8zOcTHiYzW4dkVoWjRf989qyGvWt4J70zV7esSvlSwXZnpHyRCDE3TWR3lV7cdHQK\nn0x8luzcPLuzUl5Ii/75JL5HliOYT3M6MjReh09WNnL4EXP7THZHx3PTodf5cPLL5OlY/OoiadE/\nl8w0zPo5fJHbno5NahEbFWZ3RsrX+QUQM3Q2eyNbcMPeF/hg5kQdi19dFC3657LuEyQ7nelZXRne\nSQdWUx4iIIQqw+dxMLw+1/z5BB9/8r7dGSkvokX/bIwhb9UUfqcWkbVb07hKabszUuo0CS5FpRHz\nORxSlSt/f5C5n39qd0rKS2jRP5tdK3Ekb2R6dlcdPll5JAkrR/SIrzkeWJ7LVo9k4Tdf2p2S8gJa\n9M8ib9UUjhPKjko9aVeznN3pKFUgv1IVKTN8IekBZWj/y1AWL15kd0rKw2nRL0h6Kua3uXyW04Hb\nOjfW4ZOVRwssG0PpuxaS6R9Osx9vY+XyxXanpDyYFv0CmKRZ+OVlsaRUH7o3rGB3OkqdV3B0dULv\nXECOXzC1F97E2tXL7U5JeSgt+mcyhozlU0jIq0v3rt1wOLSVr7xDeMU6BNw+nzxHAFU+v47NG/Tu\nW+rftOifacfPhB7bzvyAnvSPq2J3NkpdlMiYBpibP8chUGbOQHZsWWd3SsrDaNE/w+ElEzliwqge\nfz2B/vrxKO8TXbMpGdfPJZAcgj+4ir3b9baL6m9a1fI7fpBS2xfypXTm2nZ17c5GqUKrUq8lh6+e\nTQgZOGb04dCuzXanpDyEFv18Un6egh+55DQfTFhQoUedVsoj1Gjcjr19PyYkL528qVeQtlsLv9Ki\n/7e8PGT1NH41jeh3WWe7s1HKLRq0iGd77w8JyMske+oVnNi3xe6UlM206DulrPuastn72VVzEGXD\nAu1ORym3iWvdiY3dZ+HIzeTkuz05eUALvy/Tou90aPFEDplStOt9i92pKOV27Tt0ZnWXmZjcLDIm\n9ST7gHb1+Cot+sCR/Tuoffhn1kZdSUxUpN3pKFUkLuvclaXt3yMnJ5uMST3JO6iF3xdp0Qd+m/8m\nAtToMcLuVJQqUv16XM43rSZzMieXE5N6Yg5utDslVcx8vuj/vusgdXfNZlN4G2rWbWx3OkoVuRv6\ndOezpu+QkZ1H+rtXwEE9j9+X+HTRz8rJ46sPJxAtaVS74gG701GqWIgIQwf0ZEa98RzPyiP93V5a\n+H2ITxf9t37YSo/
j8zheqhbhDbvbnY5SxUZEuH9QbybGvs7xrDwyJ2vh9xUuFX0R2SEi60UkSUQS\nCpguIvKGiPwhIutEpIUr8dxpw540fv1pAU0d2wmPHwk6fLLyMX4O4dGb+/Jy5Vc4ejKPzCm9tfD7\nAHe09LsYY+KMMa0KmNYLqON8DAXedkM8l2Xl5DFq9lqGBn6DCSoNzQbZnZJStgj0d/D0bf15Pvp/\nHM3M4eRULfwlXVF37/QDZhjLr0CkiFQq4pjn9eYPW0nbv4OurERa3gKBYXanpJRtQgL9+O+QATwR\nOYa0jByypvaGZD2ds6RytegbYJGIJIrI0AKmVwF25Xu+2/naP4jIUBFJEJGE5ORkF1M6t3W7jzBh\n8Taer7IcBwZaF5S2Ur6lVHAAL9w5kIfCnudoRjbZWvhLLFeLfkdjTAusbpyRInJpYRZijJlkjGll\njGkVHR3tYkpndzInl1Gz11IlzND5xAKo3xsiqxVZPKW8SbnwIMbcNZB7gp4jLSObHC38JZJLRd8Y\ns8f58yAwF2h9xix7gKr5nsc4X7PFG99vZcuB40xqtg1H5hFoM9yuVJTySJVKh/DC0IEM83uGtIxs\ncqddCanb7U5LuVGhi76IhIlIxKnfge7AhjNm+wK4xXkWT1sgzRizr9DZumDtriO8vXgb17SoQv2/\nPoCKTaB6eztSUcqj1YgK479DBnAnj3PixAlyp/eDY/vtTku5iSst/QrAUhFZC6wEvjLGLBSRYSIy\nzDnPAuBP4A/gXcCWcQ4ys61unfIRwTzVJAWSN1qtfD1NU6kCNahUisduu5o7cx8hK+0AudP7Q3qq\n3WkpNyj0nUKMMX8CzQp4fWK+3w0wsrAx3OX177ey9eBx3rvtEsIT74XQKGg80O60lPJoLauX4e5b\nBjFsegaTD70I71+D362fQ1C43akpF5T4K3LX7DzMOz9t47pWVekSfRy2LIRWt0NAsN2pKeXx4utE\nc/2gW7gn+x5k72pyP7oRck7anZZyQYku+qe6dSqUCuaxPg1gxSRw+MMld9idmlJeo2fjilw2YAgP\nZd+J3/bF5M0ZArk5dqelCqlEF/1Xv9vCtuQTjB3YlFJkwJr3odFVEFHR7tSU8ipXt4yh0RXDeTb7\nZhybvsB8eR8YY3daqhBKbNFfvfMw7y75k+tbV6VT3WhI+gCyjkHbYed/s1LqX27rUIPIrvfxes5V\nSNL7mEWPa+H3QiWy6J/q1qlUOoT/XNEA8vJg5TsQ0xqqtLQ7PaW81j1da3OszWim5XRHlr8Fy8fb\nnZK6SCWy6L/y7Rb+TD7B2IFNiAgOgI2fQ+qf2spXykUiwmN9GrKp2WMsyG1N3qInYMsiu9NSF6HE\nFf3Ev1J59+c/uaFNNeLrRFuMooIlAAAd/UlEQVQHnH54HqLrQ8P+dqenlNcTEZ4f2Ixv6z7N73nV\nyP5ksI7M6UVKVNHPzM5l9Ox1VD7VrQOw7mNI2QpdHgOHn70JKlVC+DmEF69vx+SY5zmS7c+J6VfD\niRS701IXoEQV/Ze+2cyfh07wv6ubEh7kb51PvHgsVIqDBlfanZ5SJUqgv4MXBvfklXJP4398P0em\nD4KcLLvTUudRYor+qh2pTFm2nZvaVqND7SjrxdUzIG0ndHtCh1xQqgiEBvrzyJ0383rYvUQeXMnB\nT+7RM3o8XIko+hlZuYyevZYqkSE82svZrZOVDkvGQfUOUKubvQkqVYKVDgng9hEPMyvgaspv+Yi9\ni16zOyV1DiWi6I/7ZjM7UtL539VNCQtyDie0chIcPwBdtZWvVFGLCg+iy4g3WOJoTYXlz7I3cb7d\nKamz8Pqiv3J7Ku/9sp1b2lWnfS1nt05mGix7DWpfDtXb2ZugUj6icpkwqt7xPtuoSsSXd3Lgz3V2\np6QK4NVFPz0rh9Fz1hJTJoSHe9b/e8Ly8ZBxGLo+bl9ySvmgGlUqINd/RBYBZM+8l
rQjOhyzp/Hq\nov+/hZv5KyWd/w1s9ne3zokUq+g37AeV4+xNUCkfVKdeQw70nESlvP1sevd28nLz7E5J5eO1Rf/X\nP1OY9ssOBrePpV2tcn9PWPoKZKdb5+UrpWzRsG1P1tYeQZsTP/LjJ3pg15N4ZdFPz8rhoTnrqF4u\nlId61vt7wtG9sGoyNB0E0fXOvgClVJFrfsOzbA5tQbtNY1m1arnd6Sgnryz6L369iZ2p6fxvYFNC\nA/Pd/GvJOMjLhc4P25ecUgoA8fOn2h0zyHIEU/qroexO1it2PYHXFf1fth1i+vK/uK1DLG1q5uvW\nSd1uXYzV8lYoE2tbfkqpv4WUq0pG77eoy07WTbmbzOxcu1PyeV5V9E+ctLp1YsuF8lCP+v+c+MN/\nrbtixY+yJzmlVIEqterLjrq3c0XmAua8r0Mx282riv7Yrzex50gG465pRkhgvsHT1rwPGz6FDv8H\npSrZl6BSqkCx177IvrCG9N0xhvk//Wp3Oj7Na4r+L38cYuavf3F7hxpcElv27wn71sFXD0KNS6HT\nQ/YlqJQ6O/9Ayt/+Af4OqPzD3WzYdcjujHyWVxT94ydzGD1nHTWiwhjVPd9ZORlH4JNbIKQMDJyq\nQycr5cH8ytUgt/drtJCtrJ7+EIdP6IicdvCKoj9mwUb2pmUw7uqmf3frGAOfj4S0XXDNNAiPtjVH\npdT5RbS6jkP1ruem7M+YNG0KuXk6Imdxc6noi4ifiKwRkX+NriQig0UkWUSSnI8hhYmxdOshZq3Y\nyZCONWiVv1vnlzdg03y4/Fmo1rbwK6GUKlZRA1/haERNbjs4lncWrrI7HZ/jakv/PmDjOaZ/bIyJ\ncz4mX+zCj2Vm8/Cn66gZHcaD+bt1diyD756xhlpoO+Lis1ZK2ScwlNI3TqOs4wS1lz/C97/vtzsj\nn1Looi8iMUBv4KKL+YV6YcEm9qVlMO7qZgQHOLt1jh2AObdB2RrQ9y0dNlkpLySVmmK6PUV3v0SW\nfvIyf6WcsDsln+FKS/814CHgXKMpDRSRdSIyR0Sqnm0mERkqIgkikpCcnAzAki3JfLhyJ3fG16Rl\n9TLWjLk5MOd2yDwK186A4FIupK+UslNA+5FkVOvEQ0zjuemfk5GlF24Vh0IVfRHpAxw0xiSeY7Yv\ngVhjTFPgW2D62WY0xkwyxrQyxrSKjo7maGY2j3y6jlrRYdx/ed2/Z/zhv/DXUrjyNajQqDCpK6U8\nhcNByDWT8AsM474jL/Lk3NUYvdVikStsS78D0FdEdgAfAV1F5P38MxhjUowxJ51PJwMtL3ThL3y1\nkf1HM3npGme3zpFdsPBR68YoLW+DZoMKmbZSyqNEVCRwwHgaO3ZQa/3rzFqx0+6MSrxCFX1jzKPG\nmBhjTCwwCPjBGHNT/nlEJP+lsX059wHf045l5vDRql0MvbQWzf13wJw74PVmsOIdaHYD9BxbmJSV\nUp6qfm9My9sY6j+fb+Z/zJqdh+3OqETzP/8sF05EngUSjDFfAPeKSF8gB0gFBl/IMvYcTue+MhsZ\nvX8CrFgKgRHQdji0GQaRZz0soJTyYtLjBfK2/8zLhydy8/t1mXVvL6LCg+xOq0QST+tDa1I5xKwf\nGgilYqDtMGhxCwSXtjstpVRR25tE3uTLWJTTnOlVnmXmkDb4+3nF9aMeQUQSjTGtzjefx32iDocD\nBkyG+5Kg/T1a8JXyFZXjcHR7gp6OlcTs/IyXFm2xO6MSyeOKvn+F+tD0GvALsDsVpVRxa3cP1LiU\n54JmsmjJzyzcoBduuZvHFX3Ri62U8l0OB1z1DoHBobwX+iZPzF7Bn8nH7c6qRPG4oq+U8nGlKiMD\nJ1MtdydPOSYzbGYCJ07m2J1ViaFFXynleWp1Rbr8hz5mCa1SvuDhT9fphVtuokVfKeWZ4kdB7cv4\nb+AMdqxfxnvLdtidUYmgRV8p5ZkcDhjwLo6IC
kwLe4u3Fqxi5fZUu7Pyelr0lVKeK7Qscu0MyuWl\n8FbIJO6elcDBo5l2Z+XVtOgrpTxbTEuk5xja565iUNanjPxgNdm55xrcV52LFn2llOe7ZAg0Hsj9\njk/w37mUMQs22Z2R19Kir5TyfCJw5RtIVG3eDZ3A/GWr+XLtXruz8kpa9JVS3iEoHK6dSZicZGbE\nmzz96Uq2HDhmd1ZeR4u+Usp7lK+PDJhE3ZwtTPB7mXtmLOdYZrbdWXkVLfpKKe/S4Eqk33jamHWM\nOjaWhz5J1Au3LoIWfaWU94m7Aa54icsdifTY+gyTftpqd0ZeQ4u+Uso7tb4T0+1p+vv9QqnvH+KX\nrcl2Z+QVtOgrpbyWxN9PVvv7ud7vR/784H72HUm3OyWPp0VfKeXVAi9/iiNN7uAm8yWLJ43iZE6u\n3Sl5NC36SinvJkLkVS+xu/oArk+fxQ9Tn7I7I4+mRV8p5f0cDmJunczGst3otfdNvnljJGt2HLQ7\nK4+kRV8pVTI4/Kgz7APWRfWmR+r7yNSeDHv9E+au2a1dPvlo0VdKlRj+gcE0vfsDMvpPpUFgMq8e\nvpsVc16lw5jveeXbLRzQEToRT7uooVWrViYhIcHuNJRS3i5tD2becGT7T6wJac+QI7eQJqXp1aQS\ng9tXp0W1MiXqntwikmiMaXW++Vxq6YuIn4isEZH5BUwLEpGPReQPEVkhIrGuxFJKqYtSugpy8zzo\n/jzNsxJYUeZJnm20j8WbDzLw7eX0fWsZcxJ3k5ntW10/rnbv3AdsPMu0O4DDxpjawKvAiy7GUkqp\ni+NwQPu74c4f8A8rxw1bH2B18wW80iOKzOxcRs1eS/uxP/DSN5vZl5Zhd7bFotBFX0RigN7A5LPM\n0g+Y7vx9DtBNStK+lFLKe1RsAkMXQ9sRBCTNYMCSXiyq+h5f9AugZbVIxi/+g44v/sjIWatZuT21\nRI/lU+g+fRGZA4wBIoBRxpg+Z0zfAPQ0xux2Pt8GtDHGHCpgWUOBoQDVqlVr+ddffxUqJ6WUOq8j\nO2Hlu7B6OmSmQeUWpDS5g3dTmvJh4n7SMrJpWKkUg9vH0jeuMsEBfnZnfEEutE+/UEVfRPoAVxhj\nRohIZ1ws+vnpgVylVLE4eRzWfggrJkLKHxBRiewWt/OVX1cmrk5n0/5jlAkNYFDratzUtjpVIkPs\nzvicirrojwFuBnKAYKAU8Jkx5qZ883wDPG2MWS4i/sB+INqcJ6AWfaVUscrLg23fw68TYNsPAJiK\nTdkTfSmzUuvz7p+R5OGge8OK3No+lrY1y3rkWT9FWvTPCNSZglv6I4EmxphhIjIIGGCMufZ8y9Oi\nr5SyTfIW2DQfti6CXSvA5JEbUpaNoa2ZmVqfrzMaUrliJW5tH0v/uCqEBHpO148tRV9EngUSjDFf\niEgwMBNoDqQCg4wxf55veVr0lVIeIT3VavlvXQRbv4WMVPJwsMmvLotONmS1fwsaXtKZG9vVomrZ\nULuzLb6i725a9JVSHicvF/YkwtZFmG0/wJ7VCIajJpTleY04WKEDDTr2p2WzONu6frToK6VUUUlP\nhe0/kb7xW3K2fEeprAMA7PSrRtn2gwlvfRNEVCjWlLToK6VUcTCGkwc2s2HJXPhtLi1lM0b8kDrd\nofmNUKcH+AcWeRpa9JVSqpht2JPGM9Pm0T3re24JXU5QxkEILQdNB1kbgAqNiiy2Fn2llLLB/rRM\n7pi+ii37DjOxXRrdMr+FTQsgLxvqXQG9XoTIam6PWywDrimllPqniqWD+eSudnSqX4k7finLMyEP\nk/vAJuj6BPy5GMa3gaWvQk6WLflp0VdKKTcLC/LnnZtbcUfHGry3bAdD52znRJv/g5EroVZX+O5p\neCcediwt9ty06CulVBHwcwhP9GnIf/s35sfNB7lm4nL2SRQMmgXXfwzZ6TCtN8wdBseTiy0vLfpK\nKVWEbm5bn
amDL2Fnajr9xy9jw540qNcTRqyAjg/A+jnwVitInA7FcIxVi75SShWxzvXKM2d4O/wd\nDq6ZuJxvfz8AgaFw2VMwfJk19POX91rdPkVc+LXoK6VUMahfsRRzR7anboVwhs5MYPLPf1rj9kfX\ng1u+gFa3w7LXYOGjRVr4tegrpVQxKR8RzEdD29GzUUWe+2ojT3y+gZzcPOsOX71fgTbDYcXb8NUD\n1uifRcC/SJaqlFKqQCGBfoy/oQX/+2YzE3/axs7UDMbf0JyI4ADoOQb8g6wWf24WXPkGONw7kqe2\n9JVSqpg5HMIjveozdkATfvnjEFe/vZzdh9NBBC57Gjo9Amvet87syc1xb2y3Lk0ppdQFG9S6GtNu\na83etAz6j/+FpF1HrMLf5VHo9iSs/wQ+vR1ys90WU4u+UkrZqGOdKOaOaE9IoINBk5bz9fp91oT4\nB6HHC/D75/DJLZBz0i3xtOgrpZTNapePYO6IDjSsVIrhs1Yz8adt1pk97UbCFS/B5gVWV48bzurR\noq+UUh4gKjyID+5sy5XNKjP260088ul6snPzoPWd1rg9v30G62e7HEeLvlJKeYjgAD9evy6Oe7rW\n5uOEXdw6dSVp6dnQ8X6o2ga+GgVpe1yKoUVfKaU8iMMhPNi9Hi9d04xVO1IZ8PYydh4+CVdNhLwc\n+HyES+fwa9FXSikPdHXLGGbe0YZDx7PoP2EZicciocdz1vDMCVMKvVwt+kop5aHa1izH3BHtKRXs\nz/XvruAL/x5Q+zJY9AQc2lqoZWrRV0opD1YzOpy5IzoQFxPJvR8lMbXsg5iAYJh7V6Eu3NKir5RS\nHq5MWCAzh7RmQPMqPLvkMDPK3gt7Eq07cF0kHXtHKaW8QJC/Hy9f24zYqDCe+hZqRXamw09jkTqX\nQ+W4C15OoVv6IhIsIitFZK2I/CYizxQwz2ARSRaRJOdjSGHjKaWUrxMR7u1Wh9cHxXH/sZs4ZEqR\nNedOyM684GW40r1zEuhqjGkGxAE9RaRtAfN9bIyJcz4muxBPKaUU0C+uCm/f2Y2nZQSBqVvYN/c/\nF/zeQhd9YznufBrgfBT9vb6UUkrRKrYsD40cwecBvajw29QLfp9LB3JFxE9EkoCDwLfGmBUFzDZQ\nRNaJyBwRqXqW5QwVkQQRSUhOLr4bBCullDerXi6MziMnklD2igt+jxg3DOAjIpHAXOAeY8yGfK+X\nA44bY06KyF3AdcaYrudaVqtWrUxCQoLLOSmllC8RkURjTKvzzeeWUzaNMUeAH4GeZ7yeYow5NR7o\nZKClO+IppZQqHFfO3ol2tvARkRDgcmDTGfNUyve0L7CxsPGUUkq5zpXz9CsB00XED2vj8YkxZr6I\nPAskGGO+AO4Vkb5ADpAKDHY1YaWUUoXnlj59d9I+faWUunjF2qevlFLKO2jRV0opH6JFXymlfIjH\n9emLSDLwVyHfHgUccmM6nh7Xzti6zr4R29fi2hnb1bjVjTHR55vJ44q+K0Qk4UIOZJSUuHbG1nX2\njdi+FtfO2MUVV7t3lFLKh2jRV0opH1LSiv4kH4trZ2xdZ9+I7Wtx7YxdLHFLVJ++UkqpcytpLX2l\nlFLnoEVfKaV8iFcVfRGJEZHSNsU+7/mvJSyuK4PxuRq7nE1xy9sR1xk7xMbYttQBEQmyKW6YHXGd\nsaOdP8WuHLyi6ItIqIi8DHyDNbLnzc7Xi/yDc8Z+FZgvIo+ISFfn635FHDdYRN4GfhSRZ/PFLdK/\nmYiEi8g7wJDiLkTO2K8CX4nIcyLSpRjjvgwsEJFXRKSX8/Xi+H6Fi8hbwGQR6VlcjRoRiRCRF0Wk\nrDEmrzgLf76/83gRuUJEShVj3NeAqSIysLg38iIyDFgnIk2MMcauja1XFH3gCSDaGNMImAHcCdZ9\neosh9n+ASKwbxKwHZopIkDEmt4jj3g6UBzoB27G+qMHGmLyiCigiZYBXsNa
1BdC4qGIVELsu1t3X\ncrHWPRnrsy+OuJ9gDTN+FbADGArF9v16DQgEPgOuBx4p6oAi0gxYCNyPdXOjYiMi3YHlQCawFBgC\n9CqGuH2AZUA28CFwF8V0U6d8jYdg4DDwGEBR/i+fi0cXfRHxF5FgIASY53y5ArDw1A1aimpr6Ywd\n6ow33hhz2BjzFdaXdVxRxBaRwDNeWu68+9h7WP8oLzjnc2sLNF/ck8BbQFMgHYgv6q6WfLFPAJOM\nMaOMMb8DC4B9IhJTxHFTgP8zxtxnjNkFlMLauwpyzuf279epv5+IRAGVgQeMMZ9ibXAricid7o6Z\nPy5wFHjeGBMIdBSRDs7WfpHtveaLfQx4yRjzqDFmGrAZqHfGPO6Me+rvtx24wxgz2hgzD2u4g6Pu\njncWDudnWwYYDpQRkRuc+RVpj0GByRR3wPMRkXoi8j8AY0yOMSYTa+t4hYgsB0YDZYGVzt2kPHd9\nWQqInY61db5WREqLSDWs1slVIhLrrtgiUkdEpgIvi0gb58tBWGNxnDLaGbeWc9ewKOJmGWPWGWPS\nsDayzYA4V+NcYOy9wLx86xUK1DfG7C7iuMeMMVucu/5PA8OAhs5cqrj5+1VfRCZi3VyolDHmEJCH\nc88V685zc4E+IlLWHTHPEnc71u1NAcYAbwMUxd5rAbGXAx+KSIBzlk1AOWd8t+1ZFRD3N2NMglh3\n/PsaaOucdq2IhLsrbv7YInKfiEQYY3Kdn204VqNiAjBMRGKxGrTFyqOKvoj0xtrNHSUi+XdzXwCe\nAnYDTY0xo4CpwEvgni/LOWI/hnWXsHeAr7F2h2dj7Za6HFtEhgOfA4nAAeAeEWnujNFbRBo54+x2\nzvefIoo7EuhwaroxZjGwC+ji7tb2WWJfaozJzrdeZbFagUUdt51z8gngS2NMjDFmmDP2RHDb96sG\n8D6wDWtj+raIxGHtNfYQkTLO+0mvw2qVtnA15lnijheRNsaYDABjzKtAoLO/2a0KiP2WM3aWMSbb\nOVtHYEsRxm16Kq5zcirwgTGmJjAFaA/0L6LYpz7vU9+xQOBHY8znWHt4SUDD4u7b96iij/WPeCNQ\nF3hYRCKcr+fw9wh06c7XJgI54r4j8QXGNsbsBG7D2uh0Msb8DOwDNoBbdkkPAPcZY8Zjtbr8gVrO\nIv+lM5cKznkXUvgRSM8XNwir0OY/c2cWEA00FpF7RaRpMcQ+tbvbEPjN+doNYvW9uztu4Km4xpKY\nb95PgV3ivrOY6gOHjDHjsPqTN2MVm0xgLfCoM4/tQCzWRqgo4m7FakzE5pvnPuBxABG5NN/3zd2x\n/8gf29naL4Nzr0NE2ojzvttFFLeWs9U9E8AYswjreN0xN8Q8V+w+Yh003g98KiLrsfZwdgOJxd23\n71FF3xiTAGwyxvyBVeBO7XYarAJwKVZLeADwEbDKGOOWf46zxRYRf2NMDrDFGHPIudXui3Wg0R2t\nwC+BxSIS6NwFPIh1ABfgSawDm0+JyBDgRayWijucGffAqbjO9cUYsxFrI/QRcCuQVQyxT3UxdASi\nRWQu1sY4u8AluRY3/2d9moi0wtq7XH/qs3CDDUCmiNR3tnK/xurCqot1+X1/ERkgIm2xNrTu6t8+\nM+4CZ9z4UzMYY74BjopIFvB/WF1ORRn7Uuf0UxvUliKyCKtxVZRx2+efydmIqYF7h1EuKHYQ0AdI\nw9rA3GGM6YPVVTzajbEvjDGm2B9YW3dHvudy5u9YfV9HgEvyTeuEdSbPYuC6Yo4djLXl3gLc6M64\nZ8z3PdYexannFbE2Mh8Vc1wBegB7gBvc/VmfJ3YwVus3Ebi2GONGAW8WNq5zGaXPeH7qO1Uba+9i\nSL5p9wNPOX/vD4zFOkPspiKO+3/AE87fw7C6MLe78D9V2NgdsTYw3wODiiMuVkO3BtYxq18LE7cQ\nsR/AKu5nvie8MLFdfRR/QOsLtgl4GRh
+lnn8nD+fwOoDA+s0wgAbY/sBlYswrgPr2MFXzlgOoI0r\n61zIuOKM6+fKl9KV2M5pV9kUt70L6/wMVvfJC6eKCeCfb/oQrONQ7ZzP2wIbXPlOuxB3Xb7pHYs5\n9nrn72HAg8W9zlgHTwfb+Hn7ufo3d+n7UqzB4Fqs3Z3SWFvbzUB8AfPlb33nYO0SvYLVAiyw1VbE\nsV8FgooyrnPe+lh9yTcAq7HO2Q4ozDq7GPdRXNvYuBL7scL+U7gY9z/5/3ELEbsP8ANQBauRsB/r\n2Aw49zqAalgtvgVYZ3IMwjqYGGpTXJdami7GDvO2uHb+nd35KPoA+b5YwL1Y5ySfev4BsASoVMD7\nooB3gTVAB2+K7ULcYVi7ux9jnc3iFXF9eJ0j8/1+FTAu3/MXgNkFvEewztiZh9X/29pb4uo6F3/s\nongU3YKtc2/HY7Wmrsc6kHITMA3rIFIQ8DqQgHNXnn/2w/pT+GJvS2w3xK0LDPWWuD68zmWdsb8G\n7sA6GNzXGTvwVCyskw96nvpO5Xu/YF1h7hVxdZ2LP3ZRPorqatbLsU7DOoB12l83YDBWy2sb1gHR\nX7EOWr2F1erC5Dt1yVgXRy3zlthuirvFGHNRN1KwK66dsW1e5x5YZ3cdxmrJdQMuxyoMTYCu+WK9\nifPsDJPvLCBjSfaGuLrOxR+7qBXVSIqHgJeNMdMBRKQ6EGOsqxufw2qlBRtjdjtPj0t0zucwrp+z\naldsX4trZ2w713k71kHA353LvAVINsZkizWI2CgR+c1YQzr8BNR3niLq6umudsW1M7YvrnORc7no\ni4gYY+3LnGKMWSMiW/L9kyXjvJzfGGNEJNX5D9oQq09soXPaRf1D2hXb1+LqOv8j9hbntGisazla\nYl3R2tAY85rz3O+nRGQV1gG81RdbCOyKq+tc/LFtYVzoG8LZr5Xv+dnOh/4f+U7NwuoHuxTYSL4D\nb94Q29fi6joXHBvrat4rnb83xjrwfzvW2VadgZnAvd4SV9e5+GPb9Sj8G+EerLEjngX6nvrA+OfB\nslPnQE8D2jpfuxzr0udACn8apC2xfS2urvPZYxfwnheBm/M9P+u8nhZX17n4Y9v5KNSBXBHphHWm\nxO3A78ATYg3PaoxzF1pE6hnrUvcArMvK40TkW6xzqcVYgy6d9JbYvhZX1/mcseue8Z5LsPYs9px6\nzVx8V5Itce2M7Yvr7BEuZgvB3xcfDASeyff6cP6+2q0K1pABn2IdUIvDOh96EYW8ytLO2L4WV9f5\ngmNHYQ2M9g3wC9Dfm+LqOhd/bE95XMiH5A+MAqrme+1qnEMU5HttLdbVje2Bx86Ydl8h/0C2xPa1\nuLrOLsUuzJgxtsTVdS7+2J74ON+H1QTrEvUDwIdnTNvEP/u2+gDzz5gnsNCJ2RTb1+LqOhc6dqGG\nqbArrq5z8cf21Mf5+vQPAW9gjVMSK9b9LU95AHhOrNsZgnXno80iEiAiDudpUK6cwmRXbF+La2ds\nb17nwg71bFdcO2P74jp7rFPDgZ59BpEQY0yGiNwFXG+M6Zxv2jSs+6p+B1wDpBlj3HZ/T7ti+1pc\nO2PrOus6l9R19lgXsZsUgnWRy735XisN9AbmAP8tqt0Ru2L7WlxdZ11nXeeiie1Jj4v90HoAK5y/\nN8E5mBAu9K16emxfi6vrrOus61yyHxd1nr6xbqt2WEROYt0dxuF8vcgvP7Yrtq/FtTO2rrOuc1HG\ntTu2x7iILaQDeA7rxtx3FueWya7YvhZX11nXuaTGtTu2Jz3OeyA3PxHpBfxgCnGlo6vsiu1rce2M\nretcvHSdfdNFFX2llFLerUhuoqKUUsozadFXSikfokVfKaV8iBZ9pZTyIVr0lVLKh2jRV0opH6JF\nXymlfMj/A19hNXiBHe8QAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, 
"output_type": "display_data" } ], "source": [ "temp_series.plot(label=\"Period: 1 hour\")\n", "temp_series_freq_15min.plot(label=\"Period: 15 minutes\")\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Timezones\n", "By default datetimes are *naive*: they are not aware of timezones, so 2016-10-30 02:30 might mean October 30th 2016 at 2:30am in Paris or in New York. We can make datetimes timezone *aware* by calling the `tz_localize()` method:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 17:30:00-04:00 4.4\n", "2016-10-29 18:30:00-04:00 5.1\n", "2016-10-29 19:30:00-04:00 6.1\n", "2016-10-29 20:30:00-04:00 6.2\n", "2016-10-29 21:30:00-04:00 6.1\n", "2016-10-29 22:30:00-04:00 6.1\n", "2016-10-29 23:30:00-04:00 5.7\n", "2016-10-30 00:30:00-04:00 5.2\n", "2016-10-30 01:30:00-04:00 4.7\n", "2016-10-30 02:30:00-04:00 4.1\n", "2016-10-30 03:30:00-04:00 3.9\n", "2016-10-30 04:30:00-04:00 3.5\n", "Freq: H, dtype: float64" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_ny = temp_series.tz_localize(\"America/New_York\")\n", "temp_series_ny" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `-04:00` is now appended to all the datetimes. 
This means that these datetimes refer to [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) - 4 hours.\n", "\n", "We can convert these datetimes to Paris time like this:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 23:30:00+02:00 4.4\n", "2016-10-30 00:30:00+02:00 5.1\n", "2016-10-30 01:30:00+02:00 6.1\n", "2016-10-30 02:30:00+02:00 6.2\n", "2016-10-30 02:30:00+01:00 6.1\n", "2016-10-30 03:30:00+01:00 6.1\n", "2016-10-30 04:30:00+01:00 5.7\n", "2016-10-30 05:30:00+01:00 5.2\n", "2016-10-30 06:30:00+01:00 4.7\n", "2016-10-30 07:30:00+01:00 4.1\n", "2016-10-30 08:30:00+01:00 3.9\n", "2016-10-30 09:30:00+01:00 3.5\n", "Freq: H, dtype: float64" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_paris = temp_series_ny.tz_convert(\"Europe/Paris\")\n", "temp_series_paris" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may have noticed that the UTC offset changes from `+02:00` to `+01:00`: this is because France switches to winter time at 3am that particular night (time goes back to 2am). Notice that 2:30am occurs twice! 
Let's go back to a naive representation (if you log some data hourly using local time, without storing the timezone, you might get something like this):" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 23:30:00 4.4\n", "2016-10-30 00:30:00 5.1\n", "2016-10-30 01:30:00 6.1\n", "2016-10-30 02:30:00 6.2\n", "2016-10-30 02:30:00 6.1\n", "2016-10-30 03:30:00 6.1\n", "2016-10-30 04:30:00 5.7\n", "2016-10-30 05:30:00 5.2\n", "2016-10-30 06:30:00 4.7\n", "2016-10-30 07:30:00 4.1\n", "2016-10-30 08:30:00 3.9\n", "2016-10-30 09:30:00 3.5\n", "Freq: H, dtype: float64" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_paris_naive = temp_series_paris.tz_localize(None)\n", "temp_series_paris_naive" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now `02:30` is really ambiguous. If we try to localize these naive datetimes to the Paris timezone, we get an error:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "Cannot infer dst time from Timestamp('2016-10-30 02:30:00'), try using the 'ambiguous' argument\n" ] } ], "source": [ "try:\n", "    temp_series_paris_naive.tz_localize(\"Europe/Paris\")\n", "except Exception as e:\n", "    print(type(e))\n", "    print(e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fortunately, using the `ambiguous` argument, we can tell pandas to infer the right DST (Daylight Saving Time) offset based on the order of the ambiguous timestamps:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-10-29 23:30:00+02:00 4.4\n", "2016-10-30 00:30:00+02:00 5.1\n", "2016-10-30 01:30:00+02:00 6.1\n", "2016-10-30 02:30:00+02:00 6.2\n", "2016-10-30 02:30:00+01:00 6.1\n", "2016-10-30 03:30:00+01:00 6.1\n", "2016-10-30 04:30:00+01:00 5.7\n", "2016-10-30 05:30:00+01:00 5.2\n", 
"2016-10-30 06:30:00+01:00 4.7\n", "2016-10-30 07:30:00+01:00 4.1\n", "2016-10-30 08:30:00+01:00 3.9\n", "2016-10-30 09:30:00+01:00 3.5\n", "Freq: H, dtype: float64" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp_series_paris_naive.tz_localize(\"Europe/Paris\", ambiguous=\"infer\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Periods\n", "The `pd.period_range()` function returns a `PeriodIndex` instead of a `DatetimeIndex`. For example, let's get all quarters in 2016 and 2017:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PeriodIndex(['2016Q1', '2016Q2', '2016Q3', '2016Q4', '2017Q1', '2017Q2',\n", " '2017Q3', '2017Q4'],\n", " dtype='period[Q-DEC]', freq='Q-DEC')" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarters = pd.period_range('2016Q1', periods=8, freq='Q')\n", "quarters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Adding a number `N` to a `PeriodIndex` shifts the periods by `N` times the `PeriodIndex`'s frequency:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PeriodIndex(['2016Q4', '2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1',\n", " '2018Q2', '2018Q3'],\n", " dtype='period[Q-DEC]', freq='Q-DEC')" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarters + 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `asfreq()` method lets us change the frequency of the `PeriodIndex`. All periods are lengthened or shortened accordingly. 
For example, let's convert all the quarterly periods to monthly periods (zooming in):" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PeriodIndex(['2016-03', '2016-06', '2016-09', '2016-12', '2017-03', '2017-06',\n", " '2017-09', '2017-12'],\n", " dtype='period[M]', freq='M')" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarters.asfreq(\"M\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, `asfreq()` zooms in on the end of each period. We can tell it to zoom in on the start of each period instead:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PeriodIndex(['2016-01', '2016-04', '2016-07', '2016-10', '2017-01', '2017-04',\n", " '2017-07', '2017-10'],\n", " dtype='period[M]', freq='M')" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarters.asfreq(\"M\", how=\"start\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can zoom out:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PeriodIndex(['2016', '2016', '2016', '2016', '2017', '2017', '2017', '2017'], dtype='period[A-DEC]', freq='A-DEC')" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarters.asfreq(\"A\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course, we can create a `Series` with a `PeriodIndex`:" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016Q1 300\n", "2016Q2 320\n", "2016Q3 290\n", "2016Q4 390\n", "2017Q1 320\n", "2017Q2 360\n", "2017Q3 310\n", "2017Q4 410\n", "Freq: Q-DEC, dtype: int64" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quarterly_revenue = pd.Series([300, 320, 290, 390, 320, 360, 310, 410], 
index = quarters)\n", "quarterly_revenue" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAEHCAYAAACp9y31AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3Xd4HOW1+PHv0arLsmSruEiyZCPZ\nwjbuRca4X4IhEHrHhARCKpCQ3AvJTSPlEtIgIfmRECCXXkIJnVC02BhcccOyd23JTXLbVbd62ff3\nh1a+jpva7s6W83kePezOzs6cNbNHo3fOnFeMMSillIocUVYHoJRSKrA08SulVITRxK+UUhFGE79S\nSkUYTfxKKRVhNPErpVSE0cSvlFIRRhO/UkpFGE38SikVYaKtDgAgPT3d5OXlWR2GUkqFlE8//bTS\nGJPR1/cFReLPy8tj/fr1VoehlFIhRUT29ud9OtSjlFIRRhO/UkpFGE38SikVYTTxK6VUhNHEr5RS\nEUYTv1JKRRhN/EopFYL2VjX2+72a+JVSKsQYY7j+kTX9fr8mfqWUCjFl7kYqapr7/X5N/EopFWLs\nDteA3q+JXymlQozd6WLcsOR+v18Tv1JKhZAjLe2s21PNwsI+92Y7ShO/UkqFkI9LK2nvNCwal9nv\nbWjiV0qpEGJ3uEmOj2Z67pB+b0MTv1JKhQhjDHani/kFGcTY+p++NfErpVSIKDlQj+tIKwvH9X98\nHzTxK6VUyPjQ2VXGuXAA4/ugiV8ppUKG3elmUnYKGclxA9qOJn6llAoBNY1tbNxXM+CzfdDEr5RS\nIWHFTjceA4sLNfErpVREsDtcpCXFMikrZcDb6nXiFxGbiGwUkTe8z0eLyBoRKRWR50Uk1rs8zvu8\n1Pt63oCjVEqpCNbpMSzf4WbB2AyiomTA2+vLGf8dwPZjnt8H3G+MyQdqgJu9y28GarzL7/eup5RS\nqp82lddS09TOIh8M80AvE7+IZAOfBx7xPhdgMfCid5XHgUu8jy/2Psf7+hLv+koppfrhQ6eLKIH5\nBQOr3+/W2zP+B4D/Ajze52lArTGmw/u8AsjyPs4CygG8r9d51/83InKriKwXkfVut7uf4SulVPgr\ndriYnjuElMQYn2yvx8QvIhcCLmPMpz7Zo5cx5mFjzAxjzIyMDN/8FlNKqXBzuL6FkgP1PhvmAYju\nxTpzgS+IyAVAPDAY+AOQKiLR3rP6bGC/d/39QA5QISLRQApQ5bOIlVIqgix3do2IDKQb5/F6POM3\nxnzfGJNtjMkDrgGKjTHXA3bgCu9qXwRe9T5+zfsc7+vFxhjjs4iVUiqCFDtcjEiJp3B4/ydeOd5A\n6vjvAu4UkVK6xvAf9S5/FEjzLr8TuHtgISqlVGRq6/CwsrSSheMy8WWNTG+Geo4yxnwIfOh9vAuY\ndZJ1WoArfRCbUkpFtPV7q2lo7WDRALtxHk/v3FVKqSBld7iItUUxNz/dp9vVxK+UUkHK7nQze8xQ\nkuL6NDjTI038SikVhMqrmyh1NfikG+fxNPErpVQQ6p50xdfj+6CJXymlglKxw0VeWiJjMgb5fNua\n+JVSKsi0tHfySVmVX4Z5QBO/UkoFnVW7qmjt8Pi0TcOxNPErpVSQsTtcJMTYmD16qF+2r4lfKaWC\niDGGYoeLuflpxMfY/LIPTfwqpNU2tVkdglI+VeZupKKm2W/j+6CJX4Uwx6F6pv/iff5VcsjqUJTy\nGbvDW8bpp/F90MSvQti7JYfp9BgeW7nb6lCU8hm708W4YclkpSb4bR+a+F
XIsjtdiMCa3dXsOHzE\n6nCUGrAjLe2s21PNwkL/Tk6liV+FpKqGVjaV13JjUS6x0VE8tXqv1SEpNWAfl1bS3mlY7MfxfdDE\nr0LUip1ujIHLpmVz4aQRvLxhPw2tHT2/UakgZne4SY6PZlruEL/uRxO/Ckl2h5v0QbGclZXCsqJc\nGlo7eGXj/p7fqFSQMsZgd7qYX5BBjM2/qVkTvwo5HZ0elu9ws2BsJlFRwpScVCZmDeapVXvRWT5V\nqCo5UI/rSKtfq3m6aeJXIWdTeS11ze0s8l4AExGWFeXiPHyEdXtqLI5Oqf7p7sa5YKx/L+yCJn4V\nguxOF7YoYV7B/31BvjA5i8Hx0TypF3lViLI73UzKTiEjOc7v+9LEr0JOscPN9NwhpCTEHF2WEGvj\nyhk5vLP1IK4jLRZGp1Tf1TS2sXFfDYv8XM3TTRO/CimH6lrYfrD+pF+Q62ePor3T8PzacgsiU6r/\nVux04zH+vVv3WJr4VUg5OivRSW5wGZMxiHkF6Tyzdh8dnZ5Ah6ZUv9kdLtKSYpmUlRKQ/WniVyGl\n2OFiZEo844Yln/T1G4pyOVjXwgfefidKBbtOj+mqUhuXQVSUBGSfmvhVyGjt6OTj0koWFmYicvIv\nyJLCTEamxOudvCpkbCqvpaapPWDj+9CLxC8i8SKyVkQ2i0iJiNzjXb5ERDaIyCYRWSki+d7lcSLy\nvIiUisgaEcnz70dQkWL9nhoa2zpP+wWJtkVx3exRfLSzkl3uhgBGp1T/fOitUptf4P8yzm69OeNv\nBRYbYyYDU4ClIlIEPARcb4yZAjwD/NC7/s1AjTEmH7gfuM/3YatIVOxwEWuLYm5+2mnXu2pmDjE2\n4anV+wIUmVL9V+xwMX3UEFISY3pe2Ud6TPymS/epU4z3x3h/BnuXpwAHvI8vBh73Pn4RWCKn+rtc\nqT6wO13MHjOUxNjo066XmRzP0okj+Men5TS1af8eFbwO17dQcqDe7904j9erMX4RsYnIJsAFvGeM\nWQPcArwlIhXAMuBX3tWzgHIAY0wHUAeccIomIreKyHoRWe92uwf+SVRY21vVyC53Y6/HQZcV5XKk\npYPXNx/oeWWlLLLc2ZX7Ajm+D71M/MaYTu+QTjYwS0QmAt8BLjDGZAN/B37flx0bYx42xswwxszI\nyAjsbzsVerpnJVrcyzrnmXlDGDcsmSe0f48KYsUOFyNS4ikcfvIqNX/pU1WPMaYWsAPnA5O9Z/4A\nzwNnex/vB3IARCSarmGgKp9EqyKW3elmdHoSeelJvVpfRFg2J5eSA/VsKq/1c3RK9V1bh4eVpZUs\nHHfqKjV/6U1VT4aIpHofJwDnAtuBFBEZ612texnAa8AXvY+vAIqNnnKpAWhu62TVrioWjuvbX4aX\nTM1iUJz271HBaf3eahpaO1jUx+PaF3pzxj8CsIvIFmAdXWP8bwBfAV4Skc10jfH/p3f9R4E0ESkF\n7gTu9n3YKpJ8UlZJW4en18M83QbFRXPZtCze2HKQ6sY2P0WnVP/Yj1appQd836cvjwCMMVuAqSdZ\n/grwykmWtwBX+iQ6peiq5kmIsTFr9NA+v/eGolyeWLWXF9aX87UFZ/ghOqX6x+50M3vMUJLiekzD\nPqd37qqgZozB7nAzNz+duGhbn98/dlgys0cP5ek1e+n06IijCg7l1U2UuhpYGOBqnm6a+FVQ2+lq\nYH9tc5+HeY61bE4u5dXNrNihZcMqONidfatS8zVN/CqodZdx9vXC7rHOmzCcjOQ4vcirgobd4SIv\nLZHRvaxS8zVN/Cqo2Z0uCocnMzI1od/biLFFce2sUdidLsqrm3wYnVJ919LeySdlVZYN84AmfhXE\n6lvaWb+nxieTU1w7K4coEZ5ao2f9ylqryqpo7UeVmi9p4ldBa+XOSjo8xie3s49ISeDcM4fxwrpy\nWto7fRCdUv0zkCo1X9HEr4KW3eFicH
w000al+mR7y+bkUtPUzlufHfTJ9pTqK2MMxQ4Xc/PTiI/p\ne5War2jiV0HJ4zHYnW7mj80g2uabw/TsM9IYk5GkF3mVZcrcDVTUNAdsbt1T0cSvglLJgXoqG1p9\n2rVQRFhWlMvGfbVs3V/ns+0q1Vt2R1dJsZUXdkETvwpSdqcLEVjg4z4ml03LJiHGxpOr9KxfBZ7d\n6WLcsGSyBlCl5gua+FVQKna4mJSdSvqgOJ9uNyUhhkumjuTVzfupa2r36baVOp0jLe2s3V1t+TAP\naOJXQaiqoZXNFbV+61p4Q1EuLe0eXtxQ4ZftK3UyH5d2V6lZP/+IJn4VdFbsdGOM/2YlmjAyhWmj\nUnlq9V482r9HBYjd4SY5PpppuUOsDkUTvwo+xQ436YNiOSsrxW/7uHFOHrsrG/mkTOcIUv5njMHu\ndDF/bAYxPqpSGwjrI1DqGB2dHlbscLNgbCZRUf6blej8s4YzNCmWJ1bt8ds+lOpWcqAe1xHfVqkN\nhCZ+FVQ2lddS19zOokL/joPGRdu4emYO728/zIHaZr/uS6kPvd04F4y1fnwfNPGrIFPscGGLEuYV\n+P8Lct2sURjg2bX7/L4vFdnsTjeTs1PISPZtlVp/aeJXQcXudDM9dwgpCTF+31fO0EQWj8vk2bXl\ntHV4/L4/FZlqGtvYuK/G8pu2jqWJXwWNQ3UtbD9YH9Bx0GVzcqlsaOVfJYcCtk8VWVbsdOMxBEX9\nfjdN/CpoWDEr0fyCDEYNTdQ7eZXf2B0u0pJimeTHKrW+0sSvgobd4WJkSjxjhw0K2D6jooQbikax\ndk81jkP1AduvigydHsPyHW4WjMvwa5VaX2niV0GhtaOTj0srWViYiUhgvyBXTs8hNjqKp7Rrp/Kx\nTeW11DS1B00ZZzdN/CoorN9TQ2NbJ4st+IIMSYrlokkjeWXDfo60aP8e5TsfOruq1OYHoEqtLzTx\nq6BQ7HARa4vi7Pw0S/a/bE4ujW2d/HPjfkv2r8JTscPF9FFDSEn0f5VaX/SY+EUkXkTWishmESkR\nkXu8y0VEfikiO0Rku4jcfszyP4pIqYhsEZFp/v4QKvTZnS5mjxlKYmy0JfufkpPKpOwUnly9F2O0\nf48auMP1LZQcqGehn29G7I/enPG3AouNMZOBKcBSESkCbgJygEJjzJnAc971zwcKvD+3Ag/5OmgV\nXvZWNbLL3Wjp5NPQ1bVzx+EG1uyutjQOFR6WO7smXbH6uD6ZHhO/6dLgfRrj/THA14GfGWM83vVc\n3nUuBp7wvm81kCoiI3wfugoXdkfXoWP1BbCLJo0kJSFGp2bspeU73PzijW3a4fQUih0uRqTEM25Y\nstWhnKBXY/wiYhORTYALeM8YswY4A7haRNaLyNsiUuBdPQsoP+btFd5lx2/zVu9717vd7oF9ChXS\n7E43o9OTyEtPsjSOhFgbV07P5l9bD+Gqb7E0lmC3r6qJbz29gUdW7uZlvS5ygrYODytLK1k4LvBV\nar3Rq8RvjOk0xkwBsoFZIjIRiANajDEzgL8Bj/Vlx8aYh40xM4wxMzIygm8MTAVGc1snq3ZVWX62\n3+36olw6PIbn1pX3vHKEau/0cPtzG0HgzBGDue8dBw2tHVaHFVTW762mobUjKId5oI9VPcaYWsAO\nLKXrTP5l70uvAJO8j/fTNfbfLdu7TKkTfFJWSVuHx+/dOHtrdHoS8wrSeWbNPjo6tX/Pydz/3g42\nldfyq8smce9lZ+E+0sqDxTutDiuo2Lur1M6wpkqtJ72p6skQkVTv4wTgXMAB/BNY5F1tAbDD+/g1\n4EZvdU8RUGeMOejzyFVYsDtdJMbamDV6qNWhHHXjnDwO1bfw/vbDVocSdD4preSh5WVcMzOHz08a\nwZScVC6fls1jK3ezu7LR6vCCht3pZvaYoSTFWVOl1pPenPGPAOwisgVYR9cY/xvAr4DLReQz4F7g\nFu
/6bwG7gFK6hoC+4fOoVVgwxmB3uJmbn05ctM3qcI5aXJhJVmqCXuQ9TnVjG99+fhNj0pP48UXj\njy6/a+k4Ym1R/PLNbRZGFzzKq5sodTUEzfDlyfT468gYswWYepLltcDnT7LcAN/0SXQqrO10NbC/\ntplvLsq3OpR/Y4sSrps9it/8y0mpq4H8zMD1DgpWxhj+8x+bqW1q5+9fmvlv91tkDo7ntiUF/Opt\nR1dfmiCZbMQq3c0Gg6kb5/H0zl1lme4yzoXjgi9RXDUjhxib8PQaPesHePyTPXzgcPH9CwqZMPLE\nLpNfmptHXloiP3u9hPYIvzZid7jIS0tktMVVaqejiV9Zxu50UTg8mZGpCVaHcoKM5DjOnziCFz+t\noKktsitWth2o53/ecrC4MJObzs476Tpx0TZ++PnxlLkbeSKCW1y3tHfySVlVUJ/tgyZ+ZZH6lnbW\n76kJ6i/IjXNyOdLSwWubDlgdimWa2jq47dkNpCbG8JsrJp22Jn3JmZnMH5vBA+/voKqhNYBRBo9V\nZVW0dniCenwfNPEri6zcWUmHxwT1F2R67hAKhyfzxKrI7d/z8ze2sauykfuvnkLaoNPPFysi/PjC\nM2lu6+S37+447brhyu50kRATXFVqJ6OJX1nC7nAxOD6aaaNSrQ7llESEZXNy2Xawng37aq0OJ+De\n3HKQZ9eW87UFZzA3P71X78nPTObGOXk8t24fW/fX+TnC4GKModjhYm5+OvExwVOldjKa+FXAeTwG\nu9PN/LEZRNuC+xC8ZEoWg+KiI26SloqaJu5+eQuTc1K589yxfXrvHf9RwJDEWO55vSSi/lIqczdQ\nUdMcNDcjnk5wf+tUWCo5UE9lQ2tQD/N0S4qL5vJpWby55WDEjFt3dHq447lNGAMPXjOVmD7+ck5J\niOF7nxvHuj01vLElcu7dtDu6eo4tDIHjWhO/Cji704UILAjCMs6TuaEol7ZODy+sr7A6lID44wc7\n+XRvDb+8dCKj0hL7tY2rZ+YwYeRg7n1rO81tnT6OMDh1V6llBWGV2vE08auAK3a4mJSdSnoPFwuD\nRcGwZOaMSeOp1XvpDPMWxKt3VfEneylXTM/m4iknNNXtNVuU8JOLJnCgroW/LC/zYYTB6UhLO2t3\nV4fE2T5o4lcBVtXQyuaKWhaFyNl+t2Vzctlf28yHTlfPK4eomsY2vvP8JnLTkrjnCxMGvL1Zo4dy\n4aQR/GV5GRU1TT6IMHh9XNpdpRYax7UmfhVQK3a6MSY4ZyU6nXPHDyMzOS5s+/cYY7jrpS1UNrTy\nx2um+qy52A8uOBMRuPcth0+2F6zsDjfJ8dFMzx1idSi9oolfBVSxw036oFgmnuS2/2AWY4vi2lmj\nWL7Dzd6q8OtC+dSafby77TB3LS3krGzf/b8ZmZrA1xfk8+ZnB1m9q8pn2w0mxhjsTldIVKl1C40o\nVVjo6PSwYoebBWMziYoKvlmJenLtrFFEifDMmn1Wh+JTzkNH+MUb21gwNoMvzx3t8+3fOn8MWakJ\n3PP6trC8RlJyoB7XkdCoUuumiV8FzKbyWuqa20NumKfb8JR4zpswjOfXl9PSHh6VKi3tndz27AaS\n42P47ZWT/fILOSHWxg8uOJPtB+t5dm14/dIEjl73CcZmg6eiiV8FTLHDhS1KOKegd3eBBqMbinKp\nbWoPm/r0X7y5jR2HG/j9VZPJSPZfldUFZw1n9uih/O5dJ3VN7X7bjxWKHS4mZ6eETJUaaOJXAWR3\nupmeO4SUhBirQ+m3OWPSOCMjKSwu8r6z9RBPrd7HrfPHMN/PPfRFuso765rbuf/98OnjU93Yxsby\n2pAp4+ymiV8FxKG6FrYfrA/ZYZ5uIsKyolw2l9eypSJ0+/ccqG3mrpe2cFZWCt/73LiA7HP8yMFc\nO2sUT67ey47DRwKyT3/7KESr1DTxq4A4OitRiJ0Zncxl07NJiLGF
bP+eTo/h289voqPTwx+vnUps\ndODSwHc/N46kWBs/f2NbWPTxKXa4SEuK5ays0KpS08SvAsLucDEyJZ6xw0J/GsPB8TFcMjWLVzcd\nCMnx6j/bS1m7u5qfXzIx4LNEDU2K5TvnjuWjnZW8ty20J7Pv9JiuqSbHZYRclZomfuV3rR2dfFxa\nyaLCzNNO5BFKlhXl0trh4R+fllsdSp+s31PNA+/v4NKpWVw2LduSGG4oyqUgcxC/eHN7SFdHbSqv\npbYpNKvUNPErv1u3u4bGts6wGObpNn7kYGbkDuGp1XvxhEhtel1TO3c8t4nsIYn87OKBt2Torxhb\nFD+5aAL7qpt47OPdlsUxUHZvldq8/NAp4+ymiV/5nd3pIjY6irPz06wOxaeWzcllT1UTK0srrQ6l\nR8YY7n55C4frW/jjtVNJjre2suqcgnTOHT+MPxWXcri+xdJY+svudDF91BBSEkOvSk0Tv/I7u9NF\n0Zg0EmN90/8lWCydOJy0pNiQKO18bl05b289xPfOG8eUnOCY9eyHnz+Tjk7DfW+HXh+fw/UtlByo\nD+o5o09HE7/yq71VjexyN4ZM18K+iIu2cfXMHD7Yfpj9tc1Wh3NKOw8f4Z7XSzgnP51b542xOpyj\nctOSuGXeaF7euJ8N+2qsDqdPuu/WDYXZtk6mx8QvIvEislZENotIiYjcc9zrfxSRhmOex4nI8yJS\nKiJrRCTP92GrUGF3hE8Z58lcN3sUAM+sCc6z/q6WDBtJio3m91f5pyXDQHxjUT6ZyXHc81pJyFwr\nga5unCNS4hk3LNnqUPqlN2f8rcBiY8xkYAqwVESKAERkBnB8H9KbgRpjTD5wP3CfD+NVIcbudDMm\nPYm8AJcNBkr2kEQWFw7j+XXltHYEX4XKr9524Dh0hN9eOZnMwfFWh3OCQXHR3H1+IZsr6nhpQ2jM\ncNbW4WFliFep9Zj4TZfuM/oY748RERvwG+C/jnvLxcDj3scvAkskVP911IA0tXWwaldVyN3O3lfL\n5uRS2dDGO1sPWR3Kv3l/22H+95M9fHnu6KAei75kShZTR6Vy3ztOjrQE/30R6/dW09DaEdJ/xfZq\njF9EbCKyCXAB7xlj1gDfAl4zxhzfrSoLKAcwxnQAdcAJ5RwicquIrBeR9W63eyCfQQWpVWVVtHV4\nQnYctLfm5aeTm5YYVHfyHqpr4T9f3Mz4EYO56/zAtGToryjvNI2VDa38yV5qdTg9sjtcxNqiOPuM\n0K1S61XiN8Z0GmOmANnALBGZD1wJPNjfHRtjHjbGzDDGzMjICO/EEKnsTheJsTZmjR5qdSh+FRUl\n3DA7l3V7ath+sN7qcOj0GL7z/CZa2j08eN1U4qJtVofUoyk5qVwxPZvHVu5md2VwT3Rjd7qZPWao\nz2Yps0KfqnqMMbWAHVgE5AOlIrIHSBSR7l/V+4EcABGJBlKA8Jx6R52SMQa7w83c/PSQSDwDdeWM\nbOKio4LirP8vy8tYtauKey6ewBkZodMi47+WjiPWFsUv39xmdSinVF7dRKmrIaSHeaB3VT0ZIpLq\nfZwAnAt8aowZbozJM8bkAU3ei7kArwFf9D6+Aig24dCNSfXJTlcD+2ubQ/4L0lupibF8YfJIXtm4\nn3oLx6k37Kvh9+/t4KLJI7lyujUtGforMzme25YU8P52F8t3BOfw79Fmg0F8zaQ3enPGPwKwi8gW\nYB1dY/xvnGb9R4E0718AdwJ3DzxMFWqOlnGG+fj+sZbNyaWprZNXNuy3ZP/1Le3c/uxGRqTE88tL\nJ4ZkxcmX5uaRl5bIz14vob3TY3U4J7A7XIxOTwp4cztf601VzxZjzFRjzCRjzERjzM9Oss6gYx63\nGGOuNMbkG2NmGWN2+TpoFfyKHS4KhyczIiXB6lACZlJ2KpOzU3hy9d6Atxw2xvCDlz/jYF0Lf7hm\nKoMtbsnQX3HRNn504XjK3I08
scr6YbNjtbR38klZVUhNsXgqeueu8rn6lnbW760J+T+H++OGolxK\nXQ2s3lUd0P3+49MK3thykDvPHcv03ONvrQktiwszmT82gwfe30FVQ6vV4Ry1qqyK1g5PWAxfauJX\nPrdyZyWdHhOS7WoH6qLJI0lNjAnoRd4ydwM/fa2EOWPS+NqCMwK2X38REX584Xia2zr57btOq8M5\nyu50kRBjY/aY0K9S08SvfK7Y4WJwfDRTg6QZWCDFx9i4akYO/yo5FJCuk60dndz+7EbioqO4/+op\n2IKsJUN/5WcO4otn5/HcunK27q+zOhyMMRQ7XGFTpaaJX/mUx2P40Olm/tgMom2ReXhdP3sUHR7D\ns2v3+X1fv37HScmBen5zxWSGpwRfS4aBuH1JAUMTY7nn9RLLp2ksczdQUdMcNsUKkfnNVH5TcqCe\nyobWiBzm6ZablsSCsRk8u3afXytT7A4Xj67czRfn5PIf44f5bT9WSUmI4XvnjWPdnhre2HJ8g4DA\nsju6ykvDYXwfNPErHyt2uBCB+WPD48yov5YV5XK4vpX3/TSvrKu+he/9YzOFw5P5/gVn+mUfweCq\nGTlMGDmYe9/aTnObdU3w7M6uKrWRqeFRpaaJX/mU3eliUnYq6YPirA7FUosKM8lKTfDLJC0ej+G7\n/9hMY1sHD147lfiY0B9zPhWbt4/PgboWHlpeZkkMR1raWbu7OqyaDWriVz5T1dDK5opaFofRF6S/\nbFHCdbNH8UlZFaWuIz7d9t8+2sVHOyv5yUUTKAjRfvB9MWv0UC6aPJK/Li+joqYp4Pv/uLSSjjCr\nUtPEr3xm+Q43xkTW3bqnc/XMHGJtUTy12ncXeTeX1/Kbfzm54KzhXDMzx2fbDXbfP78QEbj3rcBP\n02h3uEmOj2baqPCpUtPEr3zG7nSTPiiOiSNTrA4lKKQPiuOCs4bz0qcVNLZ2DHh7R1rauf25jQwb\nHM+9l04KyZYM/TUyNYGvL8jnzc8OsqoscD0fjTHYna6wq1ILn0+iLNXR6WHFDjcLx2UE3fR+Vlo2\nJ5cjrR28uunAgLf141dLKK9u4oFrppCSGJotGQbiqwvGkJWawD2vl9AZoGkaSw7U4zrSGnbDl5r4\nlU9sLK+lrrk9bMrdfGXaqCGcOWIwT6zaM6Ba9Jc3VPDKxv3csWQsM/NC/87R/oiPsfGDC87EcehI\nQO6RgP+bVH1BGPTnOZYmfuUTdocLW5RwTkG61aEEFRFhWVEujkNH2LCvpl/b2FPZyI/+uZVZo4fy\nrcX5Pb8hjF1w1nBmjx7K7951Utfk//bXxQ4Xk7NTwq5KTRO/8gm7082M3CGkJETeEERPLp4ykuS4\n6H51m2zr8HD7cxuJtkXxQBi1ZOgvka7yzrrmdu5/f4df91Xd2MbG8tqwbDaoiV8N2MG6ZrYfrA/L\nL4gvJMVFc/n0bN767CCVfew2+bt3nWypqOO+yyeFzc1DAzV+5GCunTWKJ1fvZcdh35bKHuujnd4q\ntTAcvtTErwbsQ2d43c7uDzdNrH76AAAX00lEQVQU5dLeaXh+XXmv37Nih5u/rtjF9bNHsXTicD9G\nF3q++7lxJMXa+Nnr2/zWx6fY4SJ9UCxnZYVflZomfjVgdoeLkSnxjB0WOvO7Blp+5iDOPiONZ9bs\n61VFSmVDK3e+sJmxwwbxowvHByDC0DI0KZY7zx3LytJK3vNDW4xOj2H5DjcLxmaGZZWaJn41IK0d\nnawsrWRRYWZE1ZX3x7KiXPbXNh+dlvJUPB7Dd1/YzJGWdh68dlpYt2QYiOuLcinIHMQv3txOS7tv\n+/hsKq+ltqk9bG9G1MSvBmTd7hqa2jp1mKcX/mP8MIYNjuOJHvr3PPbxbpbvcPPDC8czbnj4t2To\nrxhbFD+5aAL7qpt4dOVun267u0ptXoEmfqVOYHe6iI2O4uz8NKtDCXoxtiium5XLih1u9lQ2nn
Sd\nrfvruO8dB58bP4wbZo8KcISh55yCdD43fhh/tpf6dOIbu9PF9DCuUtPErwbE7nBRNCaNxNhoq0MJ\nCdfMyiE6Snh6zYln/Y2tHdz27EbSkuK47/LIaskwEP/9+TPp6DTc97Zv+vgcrm+h5EB9WP8Vq4lf\n9dueykZ2VTayKMzuavSnYYPjOW/CcF5YX3HCuPRPXithT1UjD1wzhSFJsRZFGHpy05K4Zd5oXt64\nv983yR2r+27dcB3fB038agCOfkHC+MzIH24oyqWuuZ3XN/9f/55XN+3nxU8ruG1RPkVjdNisr765\nKJ/M5Djuea0EzwD7+NgdbkamxDMujFtea+JX/VbsdDMmPYm89CSrQwkpRWOGkp85iKe8F3n3VTXx\nw1e2Mj13CLcvKbA4utCUFBfN3ecXsrmijpc2VPR7O20dHlaWVrIwzKvUekz8IhIvImtFZLOIlIjI\nPd7lT4uIU0S2ishjIhLjXS4i8kcRKRWRLSIyzd8fQgVeU1sHq3dVhdWsRIHS3b9nc0Udn+6t5vbn\nNoLAH66ZElatfwPtkilZTB2Vyn3vODnS0r8+Puv3VNPQ2hH2f8X25ihrBRYbYyYDU4ClIlIEPA0U\nAmcBCcAt3vXPBwq8P7cCD/k6aGW9VWVVtHV4wnoc1J8um5ZFYqyNWx5fz6byWn512SSyhyRaHVZI\ni4oSfnrRBCobWvmTvbRf27A7XcTaopgb5lVqPSZ+06XB+zTG+2OMMW95XzPAWiDbu87FwBPel1YD\nqSIywh/BW+1AbTNffXI9v3vX6bfbxoNVscNFYqyNWaMjs0XwQCXHx3Dp1Cxqmtq5ZmYOn58Ull+R\ngJuck8oV07N5bOVudp+iZPZ07E43s8cMDfsqtV79XSkiNhHZBLiA94wxa455LQZYBrzjXZQFHNuQ\npMK77Pht3ioi60Vkvdvt7m/8ljDG8PKGCs57YAXvb3fxYHEp970TOcnfGMOHTjdz89OJi9a7Svvr\n9iUF3L6kgB9fpC0ZfOm/lo4jLtrGL97Y1qf3lVc3UepqCPthHuhl4jfGdBpjptB1Vj9LRCYe8/L/\nA1YYYz7qy46NMQ8bY2YYY2ZkZITOcEF1YxvfeHoDd76wmcLhyRR/dwHXzx7FX5aX8YcPdlodXkDs\ndDWwv7Y5Ir4g/jRscDx3njs27M8uAy0zOZ7bFufzgcN1tPKsN+zedcNpUvVT6dMRZ4ypFRE7sBTY\nKiI/ATKArx6z2n7g2Fmgs73LQt4H2w9z10ufUd/czt3nF/KVeWOwRQk/v3girR0eHnh/J7HRUXxj\nYXhPllHsCP86ZxXabpqbx7Nr9/HzN7YxNz+dmF5cNLc7XIyOkCq13lT1ZIhIqvdxAnAu4BCRW4Dz\ngGuNMZ5j3vIacKO3uqcIqDPGHPRD7AHT0NrB3S9t4ebH15M+KJZXvzWXry044+ikGFFRwn2XT+IL\nk0fy63ecPu8bEmzsDheFw5MZkaL94VVwiou28aMLx1PmbuTxT/b0uH5zWyeflFWxMEJuRuzNGf8I\n4HERsdH1i+IFY8wbItIB7AVWeetdXzbG/Ax4C7gAKAWagC/5JfIAWbu7mu/+YxP7a5r5xsIzuOM/\nCk46rm2LEn5/1WTaOjz8/I1txEVHcUNRrgUR+1d9Szvr99Zw6/wxVoei1GktLsxkwdgM/vDBTi6Z\nmnXa6RNX76qitcMTEcM80IvEb4zZAkw9yfKTvtdb5fPNgYdmrdaOTn7/7g4e/mgXOUMSeeGrc5jR\nwyTX0bYo/njtVL721Kf88J9biY2O4qoZOad9T6j5aEclnR4TMV8QFbpEhB9dOJ6lD6zgd+86ufey\nSadc1+50kRATOVVqerfISZQcqOMLD37MX1fs4tpZo3j7jnk9Jv1usdFR/L/rpzGvIJ27XtrCq5vC\n4vLGUXani8Hx0UzNSbU6FKV6lJ85iC+encdz68rZur/upO
sYYyh2uCKqSk0T/zE6Oj382V7KJX/+\nmOqmNv7+pZn8z6VnkRTXt6qL+BgbDy+bway8odz5wmbe/iykL3Ec5fF0lXHOH5uhd5iqkHH7kgKG\nJsZyz+slJy25LnM3UFHTHFF/xeq312tPZSNX/XUVv/mXk8+NH867354/oHLFhFgbj900k8nZKdz+\n3EY+2O776eECbeuBOiobWiPqC6JCX0pCDN87bxzr9tTw+pYTT8Lsjq77iCLlwi5o4scYw1Or93L+\nHz6i1NXAH66Zwp+um+qTtrhJcdH875dnceaIwXz9qQ2s2BFaN6odz+5wIwLzx0bOF0SFh6tm5DBh\n5GDufWs7zW3/3g7b7uyqUhuZGjlVahGd+A/Xt3DT39fxw39uZUbeEN79zgIunpLl0658g+NjeOLL\nsxiTkcStT65nVVmVz7YdaHani0nZqaetjlAqGNmihJ9+YQIH61p4aHnZ0eVHWtpZu7uaRRH2V2zE\nJv7XNx/gc/evYM3uKn5+8QSe+PIshqfE+2VfqYmxPH3LbHKGJHLz4+v4dG+1X/bjT1UNrWyuqGWx\n3q2rQtTMvKFcNHkkf11eRkVNEwAfl1bS4TERdxd6xCX+2qY2bn92I7c9u5HR6Um8dfs8ls3J83vv\n7bRBcTx9y2wyk+O46bF1bKmo9ev+fG35DjfG6N26KrR9//xCRODet7qmabQ73AyOj2baqMiqUouo\nxL98h5vzHljBW58d5HufG8uLX5vDmIxBAdt/5uB4nvlKESmJMSx7dC3bDtQHbN8DZXe6SR8Ux8SR\nKVaHolS/jUxN4BsL83nzs4N8UlaJ3emKyCq1iPi0TW0d/PCfn/HFx9YyOD6Gf35zLt9aXGDJ/+yR\nqQk8+5UiEmNt3PDoGnYePhLwGPqqo9PDih1uFo7LICoqfGclUpHh1vljyEpN4I7nNuE60hpxwzwQ\nAYn/0701XPCHj3h6zT6+Mm80r992DhOzrD1rzRmayDNfKcIWJVz3yJp+9Q0PpI3ltdQ1t0fkF0SF\nn/gYG//9+TNxH2lFBBZEUBlnt7BN/G0dHn7zLwdX/uUT2jsNz36liP/+/HjiY4LjzrzR6Uk8c8ts\nOj2G6/62mvLqJqtDOiW7w4UtSjinIN3qUJTyifMnDmdeQTpFo9MiskotLBO/89ARLvnzx/zZXsYV\n07N559vzKBoTfFOpFQxL5qmbZ9PU1sm1f1vNgdpmq0M6KbvTzYzcIaQkxFgdilI+ISI8dtNMHv/y\nLKtDsURYJf5Oj+HhFWVc9OBKDte38PCy6fz6iskkxwdvwho/cjBP3jyLuqZ2rvvbalz1LVaH9G8O\n1jWz/WB9xNU5q/AXY4siNjqsUmCvhc2nLq9u4tq/reZ/3nKwcFwG//rOfD43YbjVYfXKpOxU/vfL\ns3AdaeW6R9ZQ2dBqdUhHfejsuttYx/eVCh8hn/iNMTy/bh9LH1jB9gP1/PbKyfx12fSQG7ebnjuE\nx26aSUVNEzc8sobapjarQwK6xvezUhMYOyxwZa9KKf8K6cTvPtLKV55Yz10vfcak7FTe/vY8rpie\n7febsfylaEwaf7txBrsqG1n26FrqW9otjae1o5OVpZUsHJcRsv+mSqkThWzif2frQc57YAUrdlby\nowvH8/Qts8kekmh1WAM2ryCDh66fhuNQPTc9tpaG1g7LYlm3u4amtk4d5lEqzIRc4q9vaefOFzbx\ntac2kJWawJu3ncPN54wOqxuLlpw5jAevncrmijq+/L/rTugmGCh2p4vY6CjOzg++iiilVP+FVOL/\nuLSSpfev4NVNB7h9SQEvf+NsCoYlWx2WXyydOILfXzWZdXuqufXJ9bS0Bz752x0uisakkRjbt4lo\nlFLBLSQSf0t7J/e8XsL1j6whPsbGS18/mzvPHUtMmPfXuHhKFr++fBIf7azkG09voK3DE7B976ls\nZFdlI4si8K5GpcJd0J
/Kbamo5TvPb6LM3chNZ+dx19JCEmKD4+7bQLhyRg5tnR7++5Wt3PbsBv50\n3bSA/ML70OkCtIxTqXAUtIm/3Tv/7YPFpWQmx/HUzbMjtmXA9bNzaW338LM3tnHnC5t54Oop2Px8\nTaPY6WZMehJ56Ul+3Y9SKvCCMvGXuhq484VNbKmo49KpWfz0CxMivl3Al88ZTWuHh/vecRAXHcWv\nL5/ktwvaTW0drN5VxQ2zc/2yfaWUtYIq8Xs8hsdX7eFXbztIjLXx/66fxgVnjbA6rKDx9YVn0NrR\nyQPv7yQ2OopfXjLRL/X1q8qqaOvw6KTqSoWpHhO/iMQDK4A47/ovGmN+IiKjgeeANOBTYJkxpk1E\n4oAngOlAFXC1MWZPT/vZX9vMf/5jM5+UVbG4MJNfXX4Wmcn+mQoxlN2xpIDWDg8PfVhGXHQUP75w\nvM+Tf7HDRWKsjZmjh/h0u0qp4NCbM/5WYLExpkFEYoCVIvI2cCdwvzHmORH5C3Az8JD3vzXGmHwR\nuQa4D7j6dDuobWpj6f0r8BjDry47i6tn5uidoqcgIvzXeeNobffw2Me7iYu2cdfScT779zLG8KHT\nzdz8dOKiI+ciulKRpMfyENOlwfs0xvtjgMXAi97ljwOXeB9f7H2O9/Ul0kNWKq9ppnBEMm/fMZ9r\nZo3SpN8DEeFHF57J9bNH8ZflZfzhg50+2/ZOVwP7a5t1mEepMNarMX4RsdE1nJMP/BkoA2qNMd39\nBCqALO/jLKAcwBjTISJ1dA0HVR63zVuBWwHSskbz3K1z/F6pEk5EhJ9fPJHWDs/RMf9vLMwf8HaL\nHV1lnAu1fl+psNWrxG+M6QSmiEgq8ApQONAdG2MeBh4GmDFjhtGk33dRUcJ9l0+ircPDr99xEhdt\n4+ZzRg9om3aHi8LhyYxISfBRlEqpYNOnqh5jTK2I2IE5QKqIRHvP+rOB/d7V9gM5QIWIRAMpdF3k\nVX5gixJ+f9Vk2js9/PyNbcRFR3FDUf/KMOtb2lm/t4avzh/j4yiVUsGkxzF+EcnwnukjIgnAucB2\nwA5c4V3ti8Cr3seveZ/jfb3YGGN8GbT6d9G2KP5wzVSWFGbyw39u5YX15f3azkc7Kun0GJ1tS6kw\n15t7/0cAdhHZAqwD3jPGvAHcBdwpIqV0jeE/6l3/USDNu/xO4G7fh62OFxsdxZ+vn8a8gnTuemkL\nr27a3/ObjmN3ukhJiGFqTqofIlRKBYseh3qMMVuAqSdZvgs4YaZiY0wLcKVPolN9Eh9j4+FlM7jp\n72u584XNxNqiOL+XN8B5PF1lnPPHZhAd5s3vlIp0+g0PMwmxNh67aSZTclK57dmNfLD9cK/et/VA\nHZUNrdqNU6kIoIk/DCXFRfP3L81k/MjBfP2pDazY4e7xPXaHGxFYMFYTv1LhThN/mBocH8MTX57F\nmIwkbn1yPavKTl9YZXe6mJydSlqITVKvlOo7TfxhLDUxlqdvmU3OkERufnwdn+6tPul6VQ2tbK6o\n1d77SkUITfxhLm1QHE/fMpthg+O56bF1bKmoPWGd5TvcGAOLCnWYR6lIoIk/AmQOjufpW2aTkhjD\nskfXsu1A/b+9bne6SR8Ux8SRKRZFqJQKJE38EWJkagLPfqWIxFgbNzy6hp2HjwDQ0elhudPFwnEZ\nfpvYRSkVXDTxR5CcoYk885UibFHCdY+sYZe7gY3ltdS3dOj4vlIRRBN/hBmdnsQzt8zG4zFc97c1\nPLV6L7YoYd7YyJzPWKlIpIk/AhUMS+bJm2fT3N7Jq5sOMCN3CIPjI3tOY6UiiSb+CDV+5GCevHkW\n6YPiuHx6ttXhKKUCKKgmW1eBNSk7lbU/WKIXdZWKMHrGH+E06SsVeTTxK6VUhNHEr5RSEUYTv1JK\nRRhN/EopFWE08SulVITRxK+UUhFGE79SSkUYMcZYHQMi0gyUWB3HAKQAdVYHMQAav3VG
AfusDmIA\nQvnfHkI//gJjTJ/7qQfLnbsNxpgZVgfRXyLysDHmVqvj6C+N3zoi4tZj3zrhEH9/3hcsQz0nTgsV\nWl63OoAB0vito8e+tSIy/mAZ6lkfymc9SvWXHvvKCsFyxt+vP1eUCgN67KuAC4ozfqWUUoETLGf8\nSimlAkQTfx+ISLaIvCoiO0Vkl4j8SUTiRORcEflURD7z/nex1bGezGninyUim7w/m0XkUqtjPZlT\nxX/M66NEpEFEvmdlnOFKj3/r+PrYD3jiP80/fpqI2L3B/ynQcfVERAR4GfinMaYAKAASgF8DlcBF\nxpizgC8CT1oW6Cn0EP9WYIYxZgqwFPiriARLqS/QY/zdfg+8bUF4vRKqxz7o8W8lfxz7AU38PXyA\nFuBHQLCerS0GWowxfwcwxnQC3wFuBHYaYw541ysBEo79bRwkThd/lDGmw7tePBCMF35OGb+IDBKR\nS4DdBOmNgCF+7IMe/1by+bEf6DP+0/3jizFmJV1fgmA0Afj02AXGmHpgD5B/zOLLgQ3GmNbAhdYr\np41fRGaLSAnwGfC1Y74IweJ08U8B7gLuCXxYvRbKxz7o8W8lnx/7gU78vT14QpKITADuA75qdSx9\nZYxZY4yZAMwEvi8i8VbH1Ac/Be43xjRYHchphPWxD3r8W+Sn9OPY14u7vbcNmH7sAhEZDAwHnCKS\nDbwC3GiMKbMgvp6cNv7uZcaY7UADMDGg0fXsdPGnAL8WkT3At4EfiMi3Ah5heNPj3zo+P/YDnfh7\n9Y8fpD4AEkXkRgARsQG/A/4ExAFvAncbYz62LsTTOl38w7svZolILlBI15loMDll/MaYmcaYPGNM\nHvAA8D/GmGC7SBrKxz7o8W8lnx/7gU78p/sAzQGOpU9M151ulwJXiMhOoArwGGN+CXyLrj/Xf3xM\nWVimheGeoIf4zwE2i8gmus7avmGMqbQu2hP1EH8oCNljH/T4t5Jfjn1jTEB/gBzgNWAnXQ2q/nrM\na3uAarr+1KoAxgc6vj58jrOBvcA0q2PR+K2Pp5cxh8WxH6r//uESvy9it7Rlg4icDTwLXGqM2WBZ\nIEoFmB77ykraq0cppSKMVvUopVSE0cSvlFIRxi+JX0RyvL1HtolIiYjc4V0+VETe8/YqeU9EhniX\nF4rIKhFpPb7JkIikisiLIuIQke0iMscfMSvlC7469kVk3DEVMptEpF5Evm3V51LhxS9j/CIyAhhh\njNkgIsl03bF4CXATUG2M+ZWI3A0MMcbc5S39yvWuU2OM+e0x23oc+MgY84iIxAKJxphQn65OhSlf\nHvvHbNMG7AdmG2P2BuqzqPDllzN+Y8zB7koFY8wRYDuQBVwMPO5d7XG6DnaMMS5jzDqg/djtiEgK\nMB941LtemyZ9Fcx8dewfZwlQpklf+Yrfx/hFJA+YCqwBhhljDnpfOgQM6+HtowE38HcR2Sgij4hI\nkr9iVcqXBnjsH+sauko/lfIJvyZ+ERkEvAR823Q1pDrKdI0x9TTOFA1MAx4yxkwFGoG7/RGrUr7k\ng2O/ezuxwBeAf/g8SBWx/Jb4RSSGrgP/aWPMy97Fh71joN1joa4eNlMBVBhj1nifv0jXLwKlgpaP\njv1u59PV5viw7yNVkcpfVT1C17j8dmPM74956TW6ZujB+99XT7cdY8whoFxExnkXLaGr2ZVSQclX\nx/4xrkWHeZSP+auq5xzgI7omNfB4F/+ArrHOF4BRdPWauMoYUy0iw4H1wGDv+g109SqpF5EpwCNA\nLLAL+JIxpsbnQSvlAz4+9pOAfcAYY0xdYD+JCmfaskEppSKM3rmrlFIRRhO/UkpFGE38SikVYTTx\nK6VUhNHEr5RSEUYTv1JKRRhN/EopFWH+PzmgvHC4jUWRAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, 
"output_type": "display_data" } ], "source": [ "quarterly_revenue.plot(kind=\"line\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can convert periods to timestamps by calling `to_timestamp`. By default this will give us the first day of each period, but by setting `how` and `freq`, we can get the last hour of each period:" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016-03-31 23:00:00 300\n", "2016-06-30 23:00:00 320\n", "2016-09-30 23:00:00 290\n", "2016-12-31 23:00:00 390\n", "2017-03-31 23:00:00 320\n", "2017-06-30 23:00:00 360\n", "2017-09-30 23:00:00 310\n", "2017-12-31 23:00:00 410\n", "Freq: Q-DEC, dtype: int64" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "last_hours = quarterly_revenue.to_timestamp(how=\"end\", freq=\"H\")\n", "last_hours" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And back to periods by calling `to_period`:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2016Q1 300\n", "2016Q2 320\n", "2016Q3 290\n", "2016Q4 390\n", "2017Q1 320\n", "2017Q2 360\n", "2017Q3 310\n", "2017Q4 410\n", "Freq: Q-DEC, dtype: int64" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "last_hours.to_period()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas also provides many other time-related functions that we recommend you check out in the [documentation](http://pandas.pydata.org/pandas-docs/stable/timeseries.html). 
To whet your appetite, here is one way to get the last business day of each month in 2016, at 9am:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "PeriodIndex(['2016-01-29 09:00', '2016-02-29 09:00', '2016-03-31 09:00',\n", " '2016-04-29 09:00', '2016-05-31 09:00', '2016-06-30 09:00',\n", " '2016-07-29 09:00', '2016-08-31 09:00', '2016-09-30 09:00',\n", " '2016-10-31 09:00', '2016-11-30 09:00', '2016-12-30 09:00'],\n", " dtype='period[H]', freq='H')" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "months_2016 = pd.period_range(\"2016\", periods=12, freq=\"M\")\n", "one_day_after_last_days = months_2016.asfreq(\"D\") + 1\n", "last_bdays = one_day_after_last_days.to_timestamp() - pd.tseries.offsets.BDay()\n", "last_bdays.to_period(\"H\") + 9" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# `DataFrame` objects\n", "A `DataFrame` object represents a spreadsheet, with cell values, column names and row index labels. You can define expressions to compute columns based on other columns, create pivot tables, group rows, draw graphs, etc. You can see `DataFrame`s as dictionaries of `Series`.\n", "\n", "## Creating a `DataFrame`\n", "You can create a `DataFrame` by passing a dictionary of `Series` objects:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3.0 Dancing 83\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people_dict = {\n", " \"weight\": pd.Series([68, 83, 112], index=[\"alice\", \"bob\", \"charles\"]),\n", " \"birthyear\": pd.Series([1984, 1985, 1992], index=[\"bob\", \"alice\", \"charles\"], name=\"year\"),\n", " \"children\": pd.Series([0, 3], index=[\"charles\", \"bob\"]),\n", " \"hobby\": pd.Series([\"Biking\", \"Dancing\"], index=[\"alice\", \"bob\"]),\n", "}\n", "people = pd.DataFrame(people_dict)\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A few things to note:\n", "* the `Series` were automatically aligned based on their index,\n", "* missing values are represented as `NaN`,\n", "* `Series` names are ignored (the name `\"year\"` was dropped),\n", "* `DataFrame`s are displayed nicely in Jupyter notebooks, woohoo!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can access columns pretty much as you would expect. They are returned as `Series` objects:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice 1985\n", "bob 1984\n", "charles 1992\n", "Name: birthyear, dtype: int64" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people[\"birthyear\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also get multiple columns at once:" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear hobby\n", "alice 1985 Biking\n", "bob 1984 Dancing\n", "charles 1992 NaN" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people[[\"birthyear\", \"hobby\"]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you pass a list of columns and/or index row labels to the `DataFrame` constructor, it will guarantee that these columns and/or rows will exist, in that order, and no other column/row will exist. For example:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear weight height\n", "bob 1984.0 83.0 NaN\n", "alice 1985.0 68.0 NaN\n", "eugene NaN NaN NaN" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d2 = pd.DataFrame(\n", " people_dict,\n", " columns=[\"birthyear\", \"weight\", \"height\"],\n", " index=[\"bob\", \"alice\", \"eugene\"]\n", " )\n", "d2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another convenient way to create a `DataFrame` is to pass all the values to the constructor as an `ndarray`, or a list of lists, and specify the column names and row index labels separately:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3.0 Dancing 83\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "values = [\n", " [1985, np.nan, \"Biking\", 68],\n", " [1984, 3, \"Dancing\", 83],\n", " [1992, 0, np.nan, 112]\n", " ]\n", "d3 = pd.DataFrame(\n", " values,\n", " columns=[\"birthyear\", \"children\", \"hobby\", \"weight\"],\n", " index=[\"alice\", \"bob\", \"charles\"]\n", " )\n", "d3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To specify missing values, you can either use `np.nan` or NumPy's masked arrays:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3 Dancing 83\n", "charles 1992 0 NaN 112" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "masked_array = np.ma.asarray(values, dtype=np.object)\n", "masked_array[(0, 2), (1, 2)] = np.ma.masked\n", "d3 = pd.DataFrame(\n", " masked_array,\n", " columns=[\"birthyear\", \"children\", \"hobby\", \"weight\"],\n", " index=[\"alice\", \"bob\", \"charles\"]\n", " )\n", "d3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of an `ndarray`, you can also pass a `DataFrame` object:" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby children\n", "alice Biking NaN\n", "bob Dancing 3" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4 = pd.DataFrame(\n", " d3,\n", " columns=[\"hobby\", \"children\"],\n", " index=[\"alice\", \"bob\"]\n", " )\n", "d4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to create a `DataFrame` with a dictionary (or list) of dictionaries (or list):" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3.0 Dancing 83\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people = pd.DataFrame({\n", " \"birthyear\": {\"alice\":1985, \"bob\": 1984, \"charles\": 1992},\n", " \"hobby\": {\"alice\":\"Biking\", \"bob\": \"Dancing\"},\n", " \"weight\": {\"alice\":68, \"bob\": 83, \"charles\": 112},\n", " \"children\": {\"bob\": 3, \"charles\": 0}\n", "})\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multi-indexing\n", "If all columns are tuples of the same size, then they are understood as a multi-index. The same goes for row index labels. For example:" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " private public \n", " children weight birthyear hobby\n", "London charles 0.0 112 1992 NaN\n", "Paris alice NaN 68 1985 Biking\n", " bob 3.0 83 1984 Dancing" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d5 = pd.DataFrame(\n", " {\n", " (\"public\", \"birthyear\"):\n", " {(\"Paris\",\"alice\"):1985, (\"Paris\",\"bob\"): 1984, (\"London\",\"charles\"): 1992},\n", " (\"public\", \"hobby\"):\n", " {(\"Paris\",\"alice\"):\"Biking\", (\"Paris\",\"bob\"): \"Dancing\"},\n", " (\"private\", \"weight\"):\n", " {(\"Paris\",\"alice\"):68, (\"Paris\",\"bob\"): 83, (\"London\",\"charles\"): 112},\n", " (\"private\", \"children\"):\n", " {(\"Paris\", \"alice\"):np.nan, (\"Paris\",\"bob\"): 3, (\"London\",\"charles\"): 0}\n", " }\n", ")\n", "d5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can now get a `DataFrame` containing all the `\"public\"` columns very simply:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear hobby\n", "London charles 1992 NaN\n", "Paris alice 1985 Biking\n", " bob 1984 Dancing" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d5[\"public\"]" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "London charles NaN\n", "Paris alice Biking\n", " bob Dancing\n", "Name: (public, hobby), dtype: object" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d5[\"public\", \"hobby\"] # Same result as d5[\"public\"][\"hobby\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dropping a level\n", "Let's look at `d5` again:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " private public \n", " children weight birthyear hobby\n", "London charles 0.0 112 1992 NaN\n", "Paris alice NaN 68 1985 Biking\n", " bob 3.0 83 1984 Dancing" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are two levels of columns, and two levels of indices. We can drop a column level by calling `droplevel()` (the same goes for indices):" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " children weight birthyear hobby\n", "London charles 0.0 112 1992 NaN\n", "Paris alice NaN 68 1985 Biking\n", " bob 3.0 83 1984 Dancing" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d5.columns = d5.columns.droplevel(level = 0)\n", "d5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transposing\n", "You can swap columns and indices using the `T` attribute:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " London Paris \n", " charles alice bob\n", "children 0 NaN 3\n", "weight 112 68 83\n", "birthyear 1992 1985 1984\n", "hobby NaN Biking Dancing" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = d5.T\n", "d6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Stacking and unstacking levels\n", "Calling the `stack()` method will push the lowest column level after the lowest index:" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " London Paris\n", "children bob NaN 3\n", " charles 0 NaN\n", "weight alice NaN 68\n", " bob NaN 83\n", " charles 112 NaN\n", "birthyear alice NaN 1985\n", " bob NaN 1984\n", " charles 1992 NaN\n", "hobby alice NaN Biking\n", " bob NaN Dancing" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d7 = d6.stack()\n", "d7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that many `NaN` values appeared. This makes sense because many new combinations did not exist before (eg. there was no `bob` in `London`).\n", "\n", "Calling `unstack()` will do the reverse, once again creating many `NaN` values." ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " London Paris \n", " alice bob charles alice bob charles\n", "children None NaN 0 None 3 NaN\n", "weight NaN NaN 112 68 83 NaN\n", "birthyear NaN NaN 1992 1985 1984 NaN\n", "hobby NaN NaN None Biking Dancing None" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d8 = d7.unstack()\n", "d8" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we call `unstack` again, we end up with a `Series` object:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "London alice children None\n", " weight NaN\n", " birthyear NaN\n", " hobby NaN\n", " bob children NaN\n", " weight NaN\n", " birthyear NaN\n", " hobby NaN\n", " charles children 0\n", " weight 112\n", " birthyear 1992\n", " hobby None\n", "Paris alice children None\n", " weight 68\n", " birthyear 1985\n", " hobby Biking\n", " bob children 3\n", " weight 83\n", " birthyear 1984\n", " hobby Dancing\n", " charles children NaN\n", " weight NaN\n", " birthyear NaN\n", " hobby None\n", "dtype: object" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d9 = d8.unstack()\n", "d9" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `stack()` and `unstack()` methods let you select the `level` to stack/unstack. You can even stack/unstack multiple levels at once:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " London Paris \n", " alice bob charles alice bob charles\n", "children None NaN 0 None 3 NaN\n", "weight NaN NaN 112 68 83 NaN\n", "birthyear NaN NaN 1992 1985 1984 NaN\n", "hobby NaN NaN None Biking Dancing None" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d10 = d9.unstack(level = (0,1))\n", "d10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Most methods return modified copies\n", "As you may have noticed, the `stack()` and `unstack()` methods do not modify the object they apply to. Instead, they work on a copy and return that copy. This is true of most methods in pandas." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Accessing rows\n", "Let's go back to the `people` `DataFrame`:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3.0 Dancing 83\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `loc` attribute lets you access rows instead of columns. The result is a `Series` object in which the `DataFrame`'s column names are mapped to row index labels:" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "birthyear 1992\n", "children 0\n", "hobby NaN\n", "weight 112\n", "Name: charles, dtype: object" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.loc[\"charles\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also access rows by integer location using the `iloc` attribute:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "birthyear 1992\n", "children 0\n", "hobby NaN\n", "weight 112\n", "Name: charles, dtype: object" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.iloc[2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also get a slice of rows, and this returns a `DataFrame` object:" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "bob 1984 3.0 Dancing 83\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.iloc[1:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, you can pass a boolean array to get the matching rows:" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people[np.array([True, False, True])]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is most useful when combined with boolean expressions:" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3.0 Dancing 83" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people[people[\"birthyear\"] < 1990]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding and removing columns\n", "You can generally treat `DataFrame` objects like dictionaries of `Series`, so the following work fine:" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " birthyear children hobby weight\n", "alice 1985 NaN Biking 68\n", "bob 1984 3.0 Dancing 83\n", "charles 1992 0.0 NaN 112" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby weight age over 30\n", "alice Biking 68 33 True\n", "bob Dancing 83 34 True\n", "charles NaN 112 26 False" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people[\"age\"] = 2018 - people[\"birthyear\"] # adds a new column \"age\"\n", "people[\"over 30\"] = people[\"age\"] > 30 # adds another column \"over 30\"\n", "birthyears = people.pop(\"birthyear\")\n", "del people[\"children\"]\n", "\n", "people" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice 1985\n", "bob 1984\n", "charles 1992\n", "Name: birthyear, dtype: int64" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "birthyears" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you add a new colum, it must have the same number of rows. Missing rows are filled with NaN, and extra rows are ignored:" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby weight age over 30 pets\n", "alice Biking 68 33 True NaN\n", "bob Dancing 83 34 True 0.0\n", "charles NaN 112 26 False 5.0" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people[\"pets\"] = pd.Series({\"bob\": 0, \"charles\": 5, \"eugene\":1}) # alice is missing, eugene is ignored\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When adding a new column, it is added at the end (on the right) by default. You can also insert a column anywhere else using the `insert()` method:" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets\n", "alice Biking 172 68 33 True NaN\n", "bob Dancing 181 83 34 True 0.0\n", "charles NaN 185 112 26 False 5.0" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.insert(1, \"height\", [172, 181, 185])\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Assigning new columns\n", "You can also create new columns by calling the `assign()` method. Note that this returns a new `DataFrame` object, the original is not modified:" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index \\\n", "alice Biking 172 68 33 True NaN 22.985398 \n", "bob Dancing 181 83 34 True 0.0 25.335002 \n", "charles NaN 185 112 26 False 5.0 32.724617 \n", "\n", " has_pets \n", "alice False \n", "bob False \n", "charles True " ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.assign(\n", " body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2,\n", " has_pets = people[\"pets\"] > 0\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you cannot access columns created within the same assignment:" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Key error: 'body_mass_index'\n" ] } ], "source": [ "try:\n", " people.assign(\n", " body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2,\n", " overweight = people[\"body_mass_index\"] > 25\n", " )\n", "except KeyError as e:\n", " print(\"Key error:\", e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The solution is to split this assignment in two consecutive assignments:" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index \\\n", "alice Biking 172 68 33 True NaN 22.985398 \n", "bob Dancing 181 83 34 True 0.0 25.335002 \n", "charles NaN 185 112 26 False 5.0 32.724617 \n", "\n", " overweight \n", "alice False \n", "bob True \n", "charles True " ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = people.assign(body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2)\n", "d6.assign(overweight = d6[\"body_mass_index\"] > 25)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Having to create a temporary variable `d6` is not very convenient. You may want to just chain the assigment calls, but it does not work because the `people` object is not actually modified by the first assignment:" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Key error: 'body_mass_index'\n" ] } ], "source": [ "try:\n", " (people\n", " .assign(body_mass_index = people[\"weight\"] / (people[\"height\"] / 100) ** 2)\n", " .assign(overweight = people[\"body_mass_index\"] > 25)\n", " )\n", "except KeyError as e:\n", " print(\"Key error:\", e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But fear not, there is a simple solution. You can pass a function to the `assign()` method (typically a `lambda` function), and this function will be called with the `DataFrame` as a parameter:" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index \\\n", "alice Biking 172 68 33 True NaN 22.985398 \n", "bob Dancing 181 83 34 True 0.0 25.335002 \n", "charles NaN 185 112 26 False 5.0 32.724617 \n", "\n", " overweight \n", "alice False \n", "bob True \n", "charles True " ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(people\n", " .assign(body_mass_index = lambda df: df[\"weight\"] / (df[\"height\"] / 100) ** 2)\n", " .assign(overweight = lambda df: df[\"body_mass_index\"] > 25)\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Problem solved!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluating an expression\n", "A great feature supported by pandas is expression evaluation. This relies on the `numexpr` library which must be installed." ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice False\n", "bob True\n", "charles True\n", "dtype: bool" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.eval(\"weight / (height/100) ** 2 > 25\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Assignment expressions are also supported. Let's set `inplace=True` to directly modify the `DataFrame` rather than getting a modified copy:" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index\n", "alice Biking 172 68 33 True NaN 22.985398\n", "bob Dancing 181 83 34 True 0.0 25.335002\n", "charles NaN 185 112 26 False 5.0 32.724617" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.eval(\"body_mass_index = weight / (height/100) ** 2\", inplace=True)\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use a local or global variable in an expression by prefixing it with `'@'`:" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index \\\n", "alice Biking 172 68 33 True NaN 22.985398 \n", "bob Dancing 181 83 34 True 0.0 25.335002 \n", "charles NaN 185 112 26 False 5.0 32.724617 \n", "\n", " overweight \n", "alice False \n", "bob False \n", "charles True " ] }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ "overweight_threshold = 30\n", "people.eval(\"overweight = body_mass_index > @overweight_threshold\", inplace=True)\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Querying a `DataFrame`\n", "The `query()` method lets you filter a `DataFrame` based on a query expression:" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index overweight\n", "bob Dancing 181 83 34 True 0.0 25.335002 False" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.query(\"age > 30 and pets == 0\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sorting a `DataFrame`\n", "You can sort a `DataFrame` by calling its `sort_index` method. By default it sorts the rows by their index label, in ascending order, but let's reverse the order:" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " hobby height weight age over 30 pets body_mass_index \\\n", "charles NaN 185 112 26 False 5.0 32.724617 \n", "bob Dancing 181 83 34 True 0.0 25.335002 \n", "alice Biking 172 68 33 True NaN 22.985398 \n", "\n", " overweight \n", "charles True \n", "bob False \n", "alice False " ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.sort_index(ascending=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `sort_index` returned a sorted *copy* of the `DataFrame`. To modify `people` directly, we can set the `inplace` argument to `True`. Also, we can sort the columns instead of the rows by setting `axis=1`:" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " age body_mass_index height hobby over 30 overweight pets \\\n", "alice 33 22.985398 172 Biking True False NaN \n", "bob 34 25.335002 181 Dancing True False 0.0 \n", "charles 26 32.724617 185 NaN False True 5.0 \n", "\n", " weight \n", "alice 68 \n", "bob 83 \n", "charles 112 " ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.sort_index(axis=1, inplace=True)\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To sort the `DataFrame` by the values instead of the labels, we can use `sort_values` and specify the column to sort by:" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " age body_mass_index height hobby over 30 overweight pets \\\n", "charles 26 32.724617 185 NaN False True 5.0 \n", "alice 33 22.985398 172 Biking True False NaN \n", "bob 34 25.335002 181 Dancing True False 0.0 \n", "\n", " weight \n", "charles 112 \n", "alice 68 \n", "bob 83 " ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "people.sort_values(by=\"age\", inplace=True)\n", "people" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting a `DataFrame`\n", "Just like for `Series`, pandas makes it easy to draw nice graphs based on a `DataFrame`.\n", "\n", "For example, it is trivial to create a line plot from a `DataFrame`'s data by calling its `plot` method:" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAELCAYAAADX3k30AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAIABJREFUeJzt3XuUVNWZ9/Hv03f6Agg0qDTYrXLV\nIGqLoiEajRN1HC/zBi/xnSRqZBJNMuPEd8Y471JnMheSmDhmuSa+GghxjcF7ovEyMUYNRke0QRSB\nVlBRGxBaFLk0dNPdz/vHOd11qvpW9K2qT/8+a51VVfvsqnqqxWefs/c5e5u7IyIi8ZWT6QBERGRg\nKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMScEr2ISMzlZToAgHHjxnllZWWm\nwxARGVJWrFjxkbuX91QvKxJ9ZWUlNTU1mQ5DRGRIMbP30qmnrhsRkZhTohcRiTklehGRmFOiFxGJ\nOSV6EZGYU6IXEYk5JXoRkZjLiuvoP97TxCOrNlFSkEdJYR4lhbmUFOZRWhi8Ls7PJSfHMh2miMiQ\nlBWJftOOvfzNvau6rVNckEj+0eclhXmUhK+DslyKCyL7CnPbG5DSwjyKw9e5ajhEZJjIikQ//eCR\nPPzdU9nT2MyexpbgsamZ3Y3NNDS2sLuxub2sbf/uxma27dpHw0fR/S1pf+eI/NykhqA0fF5cmEdp\nQaTRiDQgiTOO8HVhXnujooZDRLJVViT6/FzjiPLSPn9Oa6uzd3+iIdjT2BI2DonXDU3N7Q3D7vB1\n2/6Pdjex5+OGRIPT1Ix7et9dlJ8Tnm10dmYRNioFnTcSJYW54dlGWyOTS16uhk9EpH9kRaLvLzk5\n1n7EPb4fPs89aDh2R8802s82WmjopEFpa0D2NDazo6GJuk8aks5SWtNsOArzcjqccRR3OLOIdmcl\nGpBEo5Lb3oWVr4ZDZNiKVaLvb2ZGcUGQRCnr++e5O/v2t3bZFdXQlGgkovvb6n+6dz+bd+yN7G+h\nJc2WoyA3J6Vh6G6cI9ifNM4RNh5t4xyFebl9/4OIyKDoMdGb2WL
gXGCbux8dls0G7gCKgGbgand/\n2cwMuA04B2gAvubuKwcq+KHGzBhRkMuIglzKywr7/HnuTmNza/JYRmRsI9FlFTQKiTOOZhqaWti1\nr5kPP92XtL85zYYjP9eSzhzaG42CoCEoTWk0OmtU2q+qKsilMC+H4J+PiPS3dI7olwC3A3dHyn4I\n/JO7P2lm54SvTwPOBqaE24nAz8JHGQBmRlF+LkX5udD3IY72hmNP2BDsTmkY2huNSMPQVta2f+vO\nfUldWftb0ms48tq63bo9s+ikkWhrVCJnG6WFeWo4JPb+4+m30q7bY6J392VmVplaDIwMn48CNofP\nzwfudncHXjKz0WZ2iLtvSTsiyZhowzG2nz6zsbkl6WwjemVV52cbyQPm9bsak97X1NKa1vfm5lhS\no9HpAHlBV/s6XpZblK+GQ7LLe9sb0q7b2z76vwV+Z2a3ENxde3JYPhH4IFKvLixToh+mCvOC/vyD\nSgr65fOamlsjDUFL+9hGdBA82jCkjn1s392QNPbR1Jxew5FjJA2Cp15hlToInnQZbqRBaXvviPxc\nNRzSJ7dePJv/uCS9ur1N9N8ErnX3h8zsImAR8IUD+QAzWwAsAJg8eXIvw5DhpiAvh4K8AkYX90/D\nsb+lNTjjiFw1FR0AjzYSiUYlsX9TZHB8d2MzjWk2HNbecOQmNRCJcYtEA5JoVCLdWJHXxbp7XHrQ\n20T/VeBvwucPAD8Pn28CJkXqVYRlHbj7ncCdANXV1WledCjSv/JzcxhVnMOo4vx++bzmltakrqjU\nsYykfW0NSNjINDS2sHnHvqQzkr3707sJ0AyK83OTzhzSuSy3y8Hzgjw1HDHS20S/GTgVeA44HVgf\nlj8KfMvM7iUYhP1U/fMynOTl5jBqRA6jRvRPw9HS6h3GNtIdIN/T1MKHO/cljX00HMDd421XSyWN\ndRzQZbnJZyy6ezxz0rm8cinBFTXjzKwOuAm4CrjNzPKAfYRdMMATBJdWbiC4vPLyAYhZZNjIzTFG\nFuUzsqj/Go627qfu7hpPnookcXNg/e5GNm5vSGpc0hVMO9LxnoyS8I7w6JlFT41KSYHuHj8Q6Vx1\nc2kXu47vpK4D1/Q1KBEZGLk5RllRPmX91HC0tjoN+5PvEk8MgCc3GNEB87b923c38f72xAD5gUw7\nUpiX003DkHwVVdtVVdHJDdv3hY1MnO8e152xItJrOTlGaZg8+2Pakfb5qpqSu6r2pN41HjYKwc2B\niX2f9GHakYL2hqO7S2/TnLeqII+CvOxpOJToRSRrROerGohpR9rGNnq6a7x92pGGJjbvSK6fdsMR\nTjvSUyORejZSHN0XGUDvy7QjSvQiElsDOe1I6plF25VT7Y1KU8pUJE3NSdOO7D7A+ao6m3YkXUr0\nIiJpit49Pq60/xqOaKPR2TTrnd01vqcx/YFwJXoRkQxJmnakF/NV3XNVevWyZ7RAREQGhBK9iEjM\nKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnR\ni4jEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjM9ZjozWyxmW0zszdSyr9t\nZrVmtsbMfhgp/56ZbTCzN83siwMRtIiIpC8vjTpLgNuBu9sKzOzzwPnAMe7eaGbjw/KZwCXAUcCh\nwNNmNtXdW/o7cBERSU+PR/Tuvgz4OKX4m8BCd28M62wLy88H7nX3Rnd/F9gAzOnHeEVE5AD1to9+\nKjDPzJab2R/N7ISwfCLwQaReXVgmIiIZkk7XTVfvGwOcBJwA3G9mhx/IB5jZAmABwOTJk3sZhoiI\n9KS3R/R1wMMeeBloBcYBm4BJkXoVYVkH7n6nu1e7e3V5eXkvwxARkZ70NtH/Bvg8gJlNBQqAj4BH\ngUvMrNDMqoApwMv9EaiIiPR
Oj103ZrYUOA0YZ2Z1wE3AYmBxeMllE/BVd3dgjZndD6wFmoFrdMWN\niEhmWZCfM6u6utpramoyHYaIyJBiZivcvbqnerozVkQk5pToRURiToleRCTmlOhFRGJOiV5EJOaU\n6EVEYk6JXkQk5pToRURiToleRCTmlOhFRGJOiV5EJOaU6EVEYk6JXkQk5pToRURiToleRCTmlOhF\nRGJOiV5EJOaU6EVEYk6JXkQk5pToRURiToleRCTmlOhFRGJOiV5EJOaU6EVEYk6JXkQk5npM9Ga2\n2My2mdkbnez7rpm5mY0LX5uZ/dTMNpjZ62Z23EAELSIi6UvniH4JcFZqoZlNAv4MeD9SfDYwJdwW\nAD/re4giItIXPSZ6d18GfNzJrluBvwc8UnY+cLcHXgJGm9kh/RKpiIj0Sq/66M3sfGCTu7+Wsmsi\n8EHkdV1Y1tlnLDCzGjOrqa+v700YIiKShgNO9GZWDNwA3NiXL3b3O9292t2ry8vL+/JRIiLSjbxe\nvOcIoAp4zcwAKoCVZjYH2ARMitStCMtERCRDDviI3t1Xu/t4d69090qC7pnj3P1D4FHgK+HVNycB\nn7r7lv4NWUREDkQ6l1cuBf4HmGZmdWZ2ZTfVnwDeATYAdwFX90uUIiLSaz123bj7pT3sr4w8d+Ca\nvoclIiL9RXfGiojEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMSc\nEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9\niEjMKdGLiMScEr2ISMwp0YuIxJwSvYhIzPWY6M1ssZltM7M3ImU/MrNaM3vdzH5tZqMj+75nZhvM\n7E0z++JABS4iIulJ54h+CXBWStnvgaPdfRbwFvA9ADObCVwCHBW+5z/NLLffohURkQPWY6J392XA\nxyllT7l7c/jyJaAifH4+cK+7N7r7u8AGYE4/xisiIgeoP/rorwCeDJ9PBD6I7KsLy0REJEP6lOjN\n7B+BZuCeXrx3gZnVmFlNfX19X8IQEZFu9DrRm9nXgHOBy9zdw+JNwKRItYqwrAN3v9Pdq929ury8\nvLdhiIhID3qV6M3sLODvgfPcvSGy61HgEjMrNLMqYArwct/DFBGR3srrqYKZLQVOA8aZWR1wE8FV\nNoXA780M4CV3/4a7rzGz+4G1BF0617h7y0AFLyIiPbNEr0vmVFdXe01NTabDEBEZUsxshbtX91RP\nd8aKiMScEr2ISMwp0YuIxJwSvYhIzCnRi4jEnBK9iEjMKdGLiMRcdiR6b810BCIisdXjnbGDYstr\n8B+fgfIZMH564nHcNCgoznR0IiJDWnYk+rJDoOIE2FYL7zwLLU3hDoODKmH8DCifnngcNxXyizIZ\nsYjIkJElif5g+NLi4HlLM3z8DtSvCxJ//TrYtg7WPwWt4VonlgMHVQWJP9oIjD0S8goz9ztERLJQ\ndiT6qNw8KJ8abDPPT5Q3N8HHbwdJv7428fjmk9A2b5rlwtgjEol//IygG2jsEZCbn5nfIyKSYdmX\n6LuSV5BI3lHNjfDR+uTkv3UN1D6WGOTNyQ+O9qP9/+UzYMzhQcMiIhJjQz/L5RXCwUcHW9T+vUED\nsG1dohto86uw5jdAOGNnbkHQ318+PdIIzAjGBXK0prmIxMPQT/RdyR8Bh8wKtqimBvjozeT+/w9e\nhjceTNTJK4JxU2D8zORB4NGHQU52XJEqIpKu+Cb6rhQUw6HHBltU4y6ofyuR/OtrYeML8Pp9iTr5\nxcEZQLT/f/x0GDUJggVYRESyzvBL9F0pLIOK44Mtat+nUP9m8iDwO8/Ba0sTdQpKoXxax/sARk5U\nAyAiGadE35OiUTBpTrBF7f0kbADWJrqB1j8Fq/4rUadwZEr/f/hYdrAaABEZNEr0vTXiIJh8U
rBF\nNXycPAC8bR3UPg4r707UKRrd8Saw8TOgpFwNgIj0OyX6/lY8BipPCbao3fUpN4HVwppfw4pfJOqM\nGJOc+NvGAUrGDu5vEJFYUaIfLKXlwVb1uUSZO+ze2vEmsNUPQuOniXol5R2P/sunB42KiEgPlOgz\nySzory87GI74fKLcHXZtSe7/31YLq34FTbsT9UoP7tj/P356MK4gIhJSos9GZjDy0GA78guJcnf4\ntC5x9N82FrDyl7C/IVFv5MROzgCmBVcWiciwo0Q/lJjB6EnBNuXMRHlrK3z6fvLR/7a18MoL0Lwv\nUW/UpJT+/+lBA1BQMvi/RUQGjRJ9HOTkBNM2HFQJ085KlLe2wCcbk/v/t9XCu8ugpTGsZDB6csdB\n4HFTg7uLRWTI6zHRm9li4Fxgm7sfHZaNAe4DKoGNwEXu/omZGXAbcA7QAHzN3VcOTOjSo5xwNs+x\nR8D0P0+UtzTDJ+92HATe8Ado3R/UsbDxSO3/HztFawGIDDHpHNEvAW4HIheCcz3wB3dfaGbXh6//\nATgbmBJuJwI/Cx8lm+TmBXP5jJsCnJcob9kfrAUQ7f/fVgvrf5e8FsCYIzoOAo89MphhVESyTo+J\n3t2XmVllSvH5wGnh818CzxEk+vOBu93dgZfMbLSZHeLuW/orYBlAufnhVA7T4KgLEuXNTbB9Q8fF\nYGofj0wFnZdoAKKTwY05XGsBiGRYb/voJ0SS94fAhPD5ROCDSL26sEyJfijLK4AJM4Mtav8+2L4+\neRD4w9Ww9lHap4LOyQ/OHFJvAhtTpamgRQZJnwdj3d3NzA/0fWa2AFgAMHny5L6GIZmQXwQHfybY\nopoa4KO3kvv/N62ANQ8n6uQWhjOBTk++FFRrAYj0u94m+q1tXTJmdgiwLSzfBEyK1KsIyzpw9zuB\nOwGqq6sPuKGQLFZQDIfODraopj3BRHD1tYmbwd5/CVY/kKiTNyJcSjJlEHjUZK0FINJLvU30jwJf\nBRaGj49Eyr9lZvcSDMJ+qv55aVdQAhOPC7aoxl0pU0GvDS4Bff3eRJ38kqABSF0MZlSFJoIT6UE6\nl1cuJRh4HWdmdcBNBAn+fjO7EngPuCis/gTBpZUbCC6vvHwAYpa4KSyDiupgi9q7IzwDiAwCb/gD\nrLonUaegLBg8ji4FOX4GlB2iBkAkZMEFMplVXV3tNTU1mQ5DhoqGj1NuAgsf99Qn6hSO6tj/P34G\nlE5QAyCxYWYr3L26p3q6M1aGnuIxcNjJwRa1Z3vi0s+25L/ut8FcQG1GHNSx/798RjCzqEhMKdFL\nfJSMhZLPQuVnE2XuwZF+9Oh/2zp446Fgmcg2xWMjXT+RbiBNBS0xoEQv8WYGpeOD7fBTE+XusOvD\njovBvH4fNO5M1CsZ37H/v3w6jBg9+L9FpJeU6GV4MoORhwTbEacnyt1h56bk5F+/LhgAjq4FUHZI\n54vBFI0c/N8i0gMlepEos+CSzVEVMCWyFkBrK+ysS+7/37YOan4BzXsT9UZWpAwCt60FUDr4v0Uk\npEQvko6cnGA659GTYeoXE+WtrbDjvY6Lwbz7fGQqaIIbvsanDAKPmxbcXCYywJToRfoiJyeYt2dM\nFUw7O1HethZAdBbQ+lp451loaQorGRx0WMf+/3FTNRW09CslepGBEF0LYMa5ifKW5mAq6NRB4A2/\nT54K+qCqjovBjD0S8goz83tkSFOiFxlMuXnhXD5TYeb5ifLmJvj47Y43gb35JHhLUMfCxiN1EHjs\nkZoKWrqlRC+SDfIKEkfuUc2NwVoA0QZg6xqofSx5LYCxUzreBDbm8KBhkWFP/wpEslleIUw4Ktii\n9u/rOBX05ldhzW9oXwsgtyDRALRdATR+hqaCHoaU6EWGovwiOGRWsEU1NcBHbyb3/9e9EtwJ3Cav\nKFwMJuUu4NGHaSromFKiF4mTgmI49Nhgi2rcHZkJNDwDe
O9FWH1/ok5+cbgYTMpNYKMmqQEY4pTo\nRYaDwlKoOD7YovbtTG4Atq2Dd56D15Ym6hSUhmsJp9wHMHKiZgIdIpToRYazopEw6YRgi9r7ScfF\nYNY/Bav+K1GncGS4FsCM5Eag7GA1AFlGiV5EOhpxEEw+KdiiGj7ueBNY7ROw8u5EnaJRHfv/x8+A\nknI1ABmiRC8i6SseA5WnBFvU7vqON4GtfQT2LknUGTGmY///+BlQMm5Qf8JwpEQvIn1XWh5sVZ9L\nlLnD7m0dF4NZ/SA0RtYCKCnvfCZQrQXQb5ToRWRgmEHZhGA7/LREuTvs2tJxMZhVS6FpV6Je6YQw\n8c9MHgQuGjXYv2TIU6IXkcFlBiMPDbYjz0iUu8OndR3XA155N+zfk6hXdmhy4h8/M5wKumzwf8sQ\noUQvItnBDEZPCrYpZybKW1vh0/c7LgZTszh5LYBRk8IzgOhUENOhoGTwf0uWydpEv3//furq6ti3\nb1+mQ8kaRUVFVFRUkJ+vCaxkGMnJCaZtOKgSpp2VKG9tCdYC2BZe/llfGzx/d1nyWgCjD+vY/18+\nDfJHDPYvyZisTfR1dXWUlZVRWVmJ6ZIs3J3t27dTV1dHVVVVpsMRybyc3GDitjGHw/RzEuUtzcFa\nAG1H/22NwIY/QOv+sJIFDUdq///YKUNnLYCdW9KumrWJft++fUryEWbG2LFjqa+vz3QoItktNw/G\nHRlsM/4iUd6yP1gLIHUq6PW/S14LYMzhyesAlLetBVCQmd/TlV//ddpVszbRA0ryKfT3EOmD3Pxw\nKodpyeXNTcFU0Kn3AUTXAsjJgzFHdJwKeuwRmVsLYO41wG/TqtqnRG9m1wJfJ5gXdTVwOXAIcC8w\nFlgB/JW7N3X5IVls48aNnHvuubzxxhtp1b/jjjsoLi7mK1/5Spd1lixZQk1NDbfffnuHff/2b//G\nDTfc0Ot4RaQX8gpgwsxgi2puhI/WJ98J/OFqWPso7VNB5+SHM4Gm3AdwUNXArwUQXbu4B72OxMwm\nAt8BZrr7XjO7H7gEOAe41d3vNbM7gCuBn/X2e4aSb3zjG316vxK9SBbJK4SDjw62qP17g7UAov3/\nm1bAmocTdXILw5lApyc3AhlaC6CvTU4eMMLM9gPFwBbgdODL4f5fAjczhBN9S0sLV111FS+++CIT\nJ07kkUceYfPmzVxzzTXU19dTXFzMXXfdxfTp07n55pspLS3luuuu45VXXuHKK68kJyeHM888kyef\nfLL9zGDz5s2cddZZvP3221x44YX88Ic/5Prrr2fv3r3Mnj2bo446invuuSfDv1xEOpU/Ag45Jtii\nmvaEM4FG+v/fXw6rH0jUyStKmQo6HAweNXlAp4LudaJ3901mdgvwPrAXeIqgq2aHu4cjG9QBE/sa\n5D/9dg1rN+/s68ckmXnoSG76i6N6rLd+/XqWLl3KXXfdxUUXXcRDDz3EL37xC+644w6mTJnC8uXL\nufrqq3nmmWeS3nf55Zdz1113MXfuXK6//vqkfatWreLVV1+lsLCQadOm8e1vf5uFCxdy++23s2rV\nqn79nSIySApKYOJxwRbVuCtlJtB1sPFP8Pp9iTr5JeFawimTwY2q6JeJ4PrSdXMQcD5QBewAHgDO\n6vZNye9fACwAmDx5cm/DGHBVVVXMnj0bgOOPP56NGzfy4osvMn/+/PY6jY2NSe/ZsWMHu3btYu7c\nuQB8+ctf5rHHHmvff8YZZzBqVHAb98yZM3nvvfeYNGnSQP8UEcmEwjKoqA62qL07ImsBhIPAbz8D\nr/0qUaegLJwKOmUQeOShB9QA9KXr5gvAu+5eD2BmDwOnAKPNLC88qq8ANnX2Zne/E7gToLq62rv7\nonSOvAdKYWFh+/Pc3Fy2bt3K6NGj+3TknfqZzc3N3dQWkVgaMRomnxhsUQ0fd1wM5q3fwavRtQBG\nBUk/TX1J9O8DJ5lZM
UHXzRlADfAs8CWCK2++CjzSh+/IOiNHjqSqqooHHniA+fPn4+68/vrrHHNM\nor9u9OjRlJWVsXz5ck488UTuvffetD47Pz+f/fv3685XkeGseAwcNjfYovZsT14Kcltt2h/Z695/\nd18OPAisJLi0MofgCP0fgL8zsw0El1gu6u13ZKt77rmHRYsWccwxx3DUUUfxyCMd27JFixZx1VVX\nMXv2bPbs2dPeVdOdBQsWMGvWLC677LKBCFtEhrKSsVD5WZhzFfz5j+Hyx9N+q7l322syKKqrq72m\npiapbN26dcyYMSNDEfXd7t27KS0tBWDhwoVs2bKF2267rc+fO9T/LiLSf8xshbtX91Qvq++MHcoe\nf/xx/v3f/53m5mYOO+wwlixZkumQRGSYUqIfIBdffDEXX3xxpsMQEel9H72IiAwNSvQiIjGnRC8i\nEnNK9CIiMadEPwC+/vWvs3bt2m7rfO1rX+PBBx/sUL5x40Z+9atfdfIOEZHeUaIfAD//+c+ZOXNm\nzxU7oUQvIv1Nib4bP/rRj/jpT38KwLXXXsvpp58OwDPPPMNll13GU089xdy5cznuuOOYP38+u3fv\nBuC0006j7QawRYsWMXXqVObMmcNVV13Ft771rfbPX7ZsGSeffDKHH354+9H99ddfz/PPP8/s2bO5\n9dZbB/PnikhMDY3r6J+8PljZpT8d/Bk4e2G3VebNm8ePf/xjvvOd71BTU0NjYyP79+/n+eefZ9as\nWfzLv/wLTz/9NCUlJfzgBz/gJz/5CTfeeGP7+zdv3sz3v/99Vq5cSVlZGaeffnrSnDhbtmzhT3/6\nE7W1tZx33nl86UtfYuHChdxyyy1Js12KiPTF0Ej0GXL88cezYsUKdu7cSWFhIccddxw1NTU8//zz\nnHfeeaxdu5ZTTjkFgKampvZpidu8/PLLnHrqqYwZMwaA+fPn89Zbb7Xvv+CCC8jJyWHmzJls3bp1\n8H6YiAwrQyPR93DkPVDy8/OpqqpiyZIlnHzyycyaNYtnn32WDRs2UFVVxZlnnsnSpUt7/fnR6Yqz\nYc4hEYkn9dH3YN68edxyyy187nOfY968edxxxx0ce+yxnHTSSbzwwgts2LABgD179iQdrQOccMIJ\n/PGPf+STTz6hubmZhx56qMfvKysrY9euXQPyW0RkeFKi78G8efPYsmULc+fOZcKECRQVFTFv3jzK\ny8tZsmQJl156KbNmzWLu3LnU1ibPDz1x4kRuuOEG5syZwymnnEJlZWWP0xXPmjWL3NxcjjnmGA3G\niki/0DTFA6xtuuLm5mYuvPBCrrjiCi688MJef15c/i4i0nfpTlOsI/oBdvPNNzN79myOPvpoqqqq\nuOCCCzIdkogMM0NjMHYIu+WWWzIdgogMczqiFxGJuaxO9NkwfpBN9PcQkd7I2kRfVFTE9u3bldxC\n7s727dspKirKdCgiMsRkbR99RUUFdXV11NfXZzqUrFFUVERFRUWmwxCRISZrE33bXakiItI3Wdt1\nIyIi/UOJXkQk5pToRURiLiumQDCzeuC9DIcxDvgowzEcqKEYMwzNuBXz4BmKcWcq5sPcvbynSlmR\n6LOBmdWkM2dENhmKMcPQjFsxD56hGHe2x6yuGxGRmFOiFxGJOSX6hDszHUAvDMWYYWjGrZgHz1CM\nO6tjVh+9iEjM6YheRCTmhl2iN7MiM3vZzF4zszVm9k9h+T1m9qaZvWFmi80sP9OxRnUT97fMbIOZ\nuZmNy3ScUd3EXGVmy8O47zOzgkzH2sbMJpnZs2a2Noz5b8LyY8zsf8xstZn91sxGZjrWqG7inm1m\nL5nZKjOrMbM5mY61TTcx3xfGu8rMNprZqkzH2qarmMN93zaz2rD8h5mMswN3H1YbYEBp+DwfWA6c\nBJwT7jNgKfDNTMeaZtzHApXARmBcpuNMM+b7gUvC8juy6W8NHAIcFz4vA94CZgKvAKe
G5VcA3890\nrGnG/RRwdlh+DvBcpmPtKeaUOj8Gbsx0rGn8nT8PPA0UhvvGZzrW6Dbsjug9sDt8mR9u7u5PhPsc\neBnIqmkiu4n7VXffmLnIutZVzMDpwINh+S+BrFlf0d23uPvK8PkuYB0wEZgKLAur/R74X5mJsHPd\nxO1A29nHKGBzZiLsqJuYATAzAy4iOPDKCt3E/E1gobs3hvu2ZS7KjoZdogcws9zwdHAb8Ht3Xx7Z\nlw/8FfDfmYqvK93Fna1SYwbcHM6xAAAGHklEQVTeBna4e3NYpY7I/9zZxMwqCc6YlgNrgPPDXfOB\nSZmJqmcpcf8t8CMz+wC4Bfhe5iLrWkrMbeYBW919fSZi6klKzFOBeWGX5B/N7IRMxpZqWCZ6d29x\n99kER+1zzOzoyO7/BJa5+/OZia5rPcSdlVJjBqZnOKS0mFkp8BDwt+6+k6C75mozW0Fwyt6Uyfi6\n0knc3wSudfdJwLXAokzG15lOYm5zKVl0NB/VScx5wBiCrsn/A9wfnpFkhWGZ6Nu4+w7gWeAsADO7\nCSgH/i6TcfUkNe6hIBLzXGC0mbWthVABbMpYYJ0Iz+oeAu5x94cB3L3W3f/M3Y8nSD5vZzLGznQW\nN/BVoO35AwSNbdboImbCfx9/CdyXqdi60kXMdcDDYXfly0Arwfw3WWHYJXozKzez0eHzEcCZQK2Z\nfR34InCpu7dmMsbOdBV3ZqPqXhcxryNI+F8Kq30VeCQzEXYUHoUtAta5+08i5ePDxxzg/xIMImeN\nruIm6JM/NXx+OpA13SDdxAzwBaDW3esGP7KudRPzbwgGZDGzqUABWTQx27C7YcrMZhEMAOYSNHT3\nu/s/m1kzwQyau8KqD7v7P2cozA66ifs7wN8DBxP0gz/h7l/PXKQJ3cR8OHAvwanuq8D/bhvEyjQz\n+yzwPLCa4KgM4AZgCnBN+Pph4HueRf/zdBP3TuA2gq6FfcDV7r4iI0Gm6Cpmd3/CzJYAL7l7tjWo\nXf2dnwYWA7MJuvWuc/dnMhJkJ4ZdohcRGW6GXdeNiMhwo0QvIhJzSvQiIjGnRC8iEnNK9CIiMadE\nLyISc0r0MujMrNLM3ujle08zs8f6O6aBZGbVZvbTA3zPzWZ23UDFJMNLXs9VRKQv3L0GqMl0HDJ8\n6YheMiXPgsVe1pnZg2ZWbGZnmNmr4eIei82sEMDMzgoXdFhJMP8JZpZjZuvNrDzyekPb61RmtsTM\nfhYuwvFOeGawOPz+JZF6PwsX6GhfKCUsXxguNvG6md0Sls23YKGa18xsWSdf2/be9rOQ8Eh9sZk9\nF8bxnUi9fzSzt8zsT8C0SPkRZvbfZrbCzJ43s+lh+SNm9pXw+V+b2T0H/F9BhodMT4ivbfhtBAul\nOHBK+HoxwfwxHwBTw7K7CabYLQrLpxAsZHI/8FhY5yaC2QMB/gx4qJvvXEIw7YIRTDe8E/gMwcHO\nCmB2WG9M+JgLPAfMAsYCb5K4k3x0+LgamBgt6+K7T4vEfDPwIlBIMOnVdoJ5+o8PP6+YYP74DQS3\n0QP8AZgSPj8ReCZ8PiGsN49gAYwxmf5vqy07Nx3RS6Z84O4vhM//CzgDeNfd3wrLfgl8jmBa43fd\nfb27e1i3zWLgK+HzK4Bf9PCdvw0/YzXBPOerPZjAbg1B4wNwUXjm8CpwFMHqQZ8SzBOzyMz+EmgI\n674ALDGzqwgahnQ97u6N7v4RwfxEEwiS9a/dvcGDaW8fhfbpcE8GHgjn9f9/BKsc4e5bgRsJJon7\nrrt/fAAxyDCiRC+ZkjrJ0o4D/gD3D4CtZnY6wfS7T/bwlraJ01ojz9te55lZFXAdcIa7zwIeB4o8\nWCRlDsGqWOcSLkrj7t8gOBOZBKwws7Fphh797ha6HyvLIVioZXZkmxHZ/xmCs4JD0/xuGYaU6CVT\nJpvZ3PD5lwkGKyvN7Miw7K+APxJMxVxpZkeE5Ze
mfM7PCY7yH3D3lj7GNBLYA3xqZhOAs6H9qHqU\nuz9BsHjHMWH5Ee6+3N1vBOrp26pTy4ALzGyEmZUBfwEQHt2/a2bzw+80M2v7/jlhjMcC14UNlUgH\nSvSSKW8C15jZOuAg4FbgcoIuirYpYO9w933AAuDxsEsldS3OR4FSeu626ZG7v0bQZVML/IqgawaC\nFaUeM7PXgT+RWJjmR+HA8RsE/e6v9eG7VxIssvEawZnJK5HdlwFXmtlrhEsahgPVdwFXuPtm4LvA\n4nC+dJEkmqZYhjQzqwZudfd5mY5FJFvpOnoZsszseoI1US/LdCwi2UxH9BIrZvaPwPyU4gfc/V8H\n4bu/CPwgpfhdd79woL9bpDtK9CIiMafBWBGRmFOiFxGJOSV6EZGYU6IXEYk5JXoRkZj7/4PeaFpA\nYojcAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "people.plot(kind = \"line\", x = \"body_mass_index\", y = [\"height\", \"weight\"])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can pass extra arguments supported by matplotlib's functions. For example, we can create scatterplot and pass it a list of sizes using the `s` argument of matplotlib's `scatter()` function:" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYgAAAEKCAYAAAAIO8L1AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAAFYVJREFUeJzt3XuYXHWd5/H3N+mmcyFyTcIlQMLV\nXRBZaJDh4iAMIo4zMIyDMCqXzYqOOqu4zgPuM7P47LPjo4zuuM5tB0duOw6iDA6suugsyuKqIB2F\nEBBMnBAIBBJuCdeQpL/7R50sReeX7qrQ1acq/X49Tz1V9atTfT7dpPvD+Z1T50RmIknSSFPqDiBJ\n6k4WhCSpyIKQJBVZEJKkIgtCklRkQUiSiiwISVKRBSFJKrIgJElFfXUHeD123333nD9/ft0xJKmn\nLFq06MnMnD3Wcj1dEPPnz2doaKjuGJLUUyJiRSvLOcUkSSqyICRJRRaEJKnIgpAkFfX0TmpJmmx+\n+cRzPPj4c+y76wwOn7cTEdGxdVkQktQDXli/kQ9cO8TPHn6GvinBcMJ+u83g2n/7FmbPGujIOp1i\nkqQe8Cc3LWHRimd4ecMwz6/fxIuvbGLpE8/z4a8u6tg6LQhJ6nIvb9jEtxavYv3G4deMbxxOFq9c\ny8pnXuzIei0ISepyz6/fuNXX+qdO4annX+nIei0ISepyu87YgZ2m9Rdf2zg8zIFzduzIei0ISepy\nU6YEn3rnG5ne/9o/2dP7p/IHv34AMwc6c7yRRzFJUg8468h5zBzo48+++yArnnqBObOm8dGTD+Sc\no/fp2DotCEnqEacdugenHbrHhK3PKSZJUlHHCiIiroyI1RGxpGns9yLivogYjojBEct/KiKWRcSD\nEXFap3JJklrTyS2Iq4F3jBhbApwF3N48GBH/GjgHOLR6z19HxNQOZpMkjaFjBZGZtwNPjxj7RWY+\nWFj8DOBrmbk+M5cDy4BjOpVNkjS2btkHsTfwSNPzldXYFiLioogYioihNWvWTEg4SZqMuqUgWpaZ\nV2TmYGYOzp495iVVJUnbqFsK4lGg+WDeedWYJKkm3VIQNwPnRMRARCwADgJ+WnMmSZrUOvZBuYi4\nDjgJ2D0iVgKX0dhp/RfAbODbEXF3Zp6WmfdFxNeB+4GNwEcyc1OnskmSxtaxgsjMc7fy0je3svyf\nAn/aqTySpPZ0yxSTJKnLWBCSpCILQpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnIgpAkFVkQkqQi\nC0KSVGRBSJKKLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkIgtCklRkQUiSiiwISVKRBSFJKrIg\nJElFFoQkqciCkCQVWRCSpCILQpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnIgpAkFVkQkqQiC0KS\nVGRBSJKKOlYQEXFlRKyOiCVNY7tGxD9HxNLqfpdqPCLiSxGxLCIWR8SRncolSWpNJ7cgrgbeMWLs\nUuDWzDwIuLV6DnA6cFB1uwj4mw7mkiS1oGMFkZm3A0+PGD4DuKZ6fA1wZtP4tdlwB7BzROzZqWyS\npLFN9D6IuZm5qnr8ODC3erw38EjTciursS1ExEURMRQRQ2vWrOlcUkma5GrbSZ2ZCeQ2vO+KzBzM\nzMHZs2d3IJkkCSa+IJ7YPHVU3a+uxh8F9mlabl41JkmqyUQXxM3A+dXj84GbmsbPq45mOhZY2zQV\nJUmqQV+nvnBEXAecBOweESuBy4DPAl+PiIXACuDsavHvAO8ElgEvAhd2KpckqTUdK4jMPHcrL51S\nWDaBj3QqiySpfX6SWpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnIgpAkFVkQkqQiC0KSVGRBSJKK\nLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkIgtCklRkQUiSi
iwISVKRBSFJKrIgJElFFoQkqciC\nkCQVWRCSpKKWCiIiPtbKmCRp+9HqFsT5hbELxjGHJKnL9I32YkScC/w+sCAibm56aRbwdCeDSZLq\nNWpBAD8GVgG7A19oGn8OWNypUJKk+o1aEJm5AlgB/NrExJEkdYtWd1KfFRFLI2JtRKyLiOciYl2n\nw0mS6jPWFNNmlwO/lZm/6GQYSVL3aPUopicsB0maXMY6iums6uFQRFwP/BOwfvPrmXljB7NJkmo0\n1hTTbzU9fhF4e9PzBCwISdpOjXUU04UTFUSS1F1a2kkdEV8qDK8FhjLzpnZXWp2m4wNAAF/OzC9G\nxK7A9cB84CHg7Mx8pt2vLUkaH63upJ4GHAEsrW6HA/OAhRHxxXZWGBGH0SiHY4A3A++KiAOBS4Fb\nM/Mg4NbquSSpJq0e5no4cHxmbgKIiL8BfgicANzb5jr/FXBnZr5Yfa3/A5wFnAGcVC1zDXAbcEmb\nX1uSNE5a3YLYBdix6flMYNeqMNaX37JVS4ATI2K3iJgBvBPYB5ibmauqZR4H5rb5dSVJ46idD8rd\nHRG30dhv8FbgMxExE/jf7awwM38REZ8Dvge8ANwNbBqxTEZElt4fERcBFwHsu+++7axaktSGyCz+\nHd5ywYg9aew3ALgrMx8blwARnwFWAh8DTsrMVdW6bsvMQ0Z77+DgYA4NDY1HDEmaNCJiUWYOjrXc\nqFNMEfHG6v5IYE/gkeq2RzW2reHmVPf70tj/8A/Azbx63YnzgbaPjpIkjZ+xppg+QWM65wuF1xI4\neRvX+48RsRuwAfhIZj4bEZ8Fvh4RC2mcQfbsbfzakqRxMNYH5S6q7t82nivNzBMLY08Bp4zneiRJ\n267V033PiIg/jogrqucHRcS7OhtNklSnVg9zvQp4BTiuev4o8F86kkiS1BVaLYgDMvNyGvsMqD7k\nFh1LJUmqXasF8UpETKexY5qIOID2PyAnSeohrX5Q7jLgFmCfiPgqcDxwQadCSZLq12pBnA98G7gB\n+BfgY5n5ZMdSSZJq12pBfAU4ETgVOAD4eUTcnpn/rWPJJEm1aqkgMvMHEXE7cDTwNuBDwKGABSFJ\n26lWLxh0K40zuP6Exmm+j87M1Z0MJkmqV6tHMS2m8TmIw2hcG+Kw6qgmSdJ2qtUpposBImIWjaOX\nrgL2AAY6lkySVKtWp5g+SmMn9VE0rhd9JY2pJknSdqrVo5imAf8VWJSZGzuYR5LUJVqdYvp8p4NI\nkrpLqzupJUmTjAUhSSqyICRJRRaEJKmo1aOYJOk1MpPn1m9keDiZNa2fqVO8RMz2xoKQ1JblT77A\nVT9azjeGVrJxeJggGM7k7YfO5QMn7s8R++xMhGWxPbAgJLVkeDj5z9+6j+t++gjDw8mG4axeadzf\nsuRxfvDAGo7abxf+9v1HMXPAPy+9zn0QksaUmfzRDfdw/V0rWb9xuKkcXjWc8NKGTdz10NO854qf\n8PKGTTUk1XiyICSN6VuLV/Gdex/npRb+6K/fOMzSJ57n8lsemIBk6iQLQtKY/vL7y1oqh83Wbxzm\na3c94lZEj7MgJI3q/sfWseLpF7bpvf/znsfGOY0mkgUhaVSLVjxNbrnLYUwvvrKJ/7vMS9f3MgtC\n0qheeGUTmwo7pVux7qUN45xGE8mCkDSqmQN99E3dts817DS9f5zTaCJZEJJG9ZYFu27T+2buMJWT\nDpkzzmk0kSwISaM6eO4sDpy9Y/tvjOD0N+0x/oE0YSwISWP66MkHMb1/asvLT+ufwnm/th8Dfa2/\nR93HgpA0pncctgdnD85rqSSm9U/hTXvvxCdOPXgCkqmTLAhJLfn0bx/KwhMWMNA3hYG+Lf90TJ3S\nKIeTDp7D/1j4Fvqn+uel13k2LUktiQg+edohvO/Y/fj7O1bw93euYO1LGwhgoG8qZxyxFwtPWMBB\nc2fVHVXjJHJbPgHTJQYHB
3NoaKjuGNKktXHTMMMJOxS2KNS9ImJRZg6OtZxbEJK2WZ/TSNu1Wv7r\nRsTFEXFfRCyJiOsiYlpELIiIOyNiWURcHxE71JFNktQw4QUREXsD/x4YzMzDgKnAOcDngD/PzAOB\nZ4CFE51NkvSqurYP+4DpEdEHzABWAScDN1SvXwOcWVM2SRI1FERmPgp8HniYRjGsBRYBz2bmxmqx\nlcDeE51NkvSqOqaYdgHOABYAewEzgXe08f6LImIoIobWrFnToZSSpDqmmH4DWJ6ZazJzA3AjcDyw\nczXlBDAPeLT05sy8IjMHM3Nw9uzZE5NYkiahOgriYeDYiJgREQGcAtwP/AB4d7XM+cBNNWSTJFXq\n2AdxJ42d0T8D7q0yXAFcAnwiIpYBuwFfmehskqRX1fJBucy8DLhsxPC/AMfUEEeSVODHICVJRRaE\nJKnIgpAkFVkQkqQiC0KSVGRBSJKKLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkIgtCklRkQUiS\niiwISVKRBSFJKrIgJElFFoQkqciCkCQVWRCSpCILQpJUZEFIkoosCElSkQUhSSqyICRJRRaEJKnI\ngpAkFVkQkqQiC0KSVGRBSJKKLAhJUpEFIUkqsiAkSUUWhCSpyIKQJBVZEJKkogkviIg4JCLubrqt\ni4iPR8SuEfHPEbG0ut9lorNJkl414QWRmQ9m5hGZeQRwFPAi8E3gUuDWzDwIuLV6LkmqSd1TTKcA\nv8rMFcAZwDXV+DXAmbWlkiTVXhDnANdVj+dm5qrq8ePA3HoiSZKgxoKIiB2A3wa+MfK1zEwgt/K+\niyJiKCKG1qxZ0+GUkjR51bkFcTrws8x8onr+RETsCVDdry69KTOvyMzBzBycPXv2BEWVpMmnzoI4\nl1enlwBuBs6vHp8P3NSpFT/38gaWrX6eh596kU3DxQ0VSZr0+upYaUTMBE4FPtg0/Fng6xGxEFgB\nnD3e613y6Fq+9P2l3PbAGvqnBsMJ0/qncOHxC7jg+Pm8YVr/eK9SknpWNKb7e9Pg4GAODQ21tOxN\nP3+US25czPqNw4z8lgf6pjB71gA3fvg45sya1oGkktQ9ImJRZg6OtVzdRzFNiHtXruWSGxfz8oYt\nywFg/cZhHl/7Mud95af0cmFK0niaFAXxF99fyvqNw6Mus3E4efjpF7nroWcmKJUkdbftviDWvbyB\n2x5cU9xyGOmlVzZx1Y+Wdz6UJPWA7b4gVq97mf6+aGnZBJY/+UJnA0lSj9juC6J/6hSGR59d2mJ5\nSdIkKIh5u8xgoL+1b3OgbwonHeKH7yQJJkFBTJ0SXHjcfAb6xv5WE3j/sft1PpQk9YDtviAALjxh\nAbvvOMBos0fT+6fyobfuz5w3+DkISYJJUhBvmNbPNz9yHAfOnsWMHabSvMt6oG8KA31T+HcnLuDi\nUw+uLaMkdZtaTrVRhzmzpnHLx0/kzuVPc/WPH2L5mhfYoS846ZA5vO/Y/ZjrloMkvcakKQiAiODY\n/Xfj2P13qzuKJHW9STHFJElqnwUhSSqyICRJRT19uu+IWEPj2hHbanfgyXGKM9F6NXuv5gaz16FX\nc0N3Z98vM8f8VHBPF8TrFRFDrZwTvRv1avZezQ1mr0Ov5obezr6ZU0ySpCILQpJUNNkL4oq6A7wO\nvZq9V3OD2evQq7mht7MDk3wfhCRp6yb7FoQkaSu224KIiCsjYnVELGkauz4i7q5uD0XE3dX4qRGx\nKCLure5Pri95e9mbXt83Ip6PiE9OfOLX5Ggre0QcHhE/iYj7qp9/bSfFavPfTH9EXFNl/kVEfKrL\nch8REXdUuYci4phqPCLiSxGxLCIWR8SRdeWu8rST/b1V5nsj4scR8eb6kreXven1oyNiY0S8e+IT\nb4PM3C5vwFuBI4ElW3n9C8B/qh7/G2Cv6vFhwKO9kr1p7AbgG8AneyU7jXOBLQbeXD3fDZj
aI9l/\nH/ha9XgG8BAwv1tyA98DTq8evxO4renx/wICOBa4s9v+vYyS/Thgl+rx6b2UvXo+Ffg+8B3g3XVm\nb/W23W5BZObtwNOl1yIigLOB66plf56Zj1Uv3wdMj4iBCQla0E72auxMYDmN7LVqM/vbgcWZeU/1\n3qcyc9OEBC1oM3sCMyOiD5gOvAKsm4icI20ldwJvqB7vBGz+930GcG023AHsHBF7TkzSLbWTPTN/\nnJnPVON3APMmJORWtPlzB/hD4B+B1Z1PNz4m1dlcm5wIPJGZSwuv/S7ws8xcP8GZWvWa7BGxI3AJ\ncCpQ6/RSC0b+3A8GMiK+C8ym8X/kl9eWbnQjs99A44/tKhpbEBdnZrFcavJx4LsR8XkaU8nHVeN7\nA480LbeyGls1sfFGtbXszRbS2BLqNsXsEbE38DvA24Cj64vXnu12C2IM59L0f+CbRcShwOeAD054\notaNzP5p4M8z8/l64rRlZPY+4ATgvdX970TEKXUEa8HI7McAm4C9gAXAf4iI/esIthV/QKO09gEu\nBr5Sc552jJo9It5GoyAuqSHbWLaW/YvAJZk5XFuybVH3HFcnb8B8Rswn0/ij9AQwb8T4POCXwPF1\n524nO/BDGvPfDwHP0tjk/WiPZD8HuKbp+Z8Af9Qj2f8KeH/T8yuBs7slN7CWVw9jD2Bd9fhvgXOb\nlnsQ2LObfuZby149Pxz4FXBwnZm34ee+vOn39Hka00xn1p1/rNtk3IL4DeCBzFy5eSAidga+DVya\nmT+qLdnYtsiemSdm5vzMnE/j/1I+k5l/WVfAUWyRHfgu8KaImFHN5f86cH8t6UZXyv4wcDJARMyk\nscP3gRqybc1jNH6e0Mi5eWrsZuC86mimY4G1mdlN00uwlewRsS9wI41i/mVN2cZSzJ6ZC5p+T28A\nPpyZ/1RPxDbU3VAdbPbraMyrbqAxz7qwGr8a+NCIZf8YeAG4u+k2pxeyj3jfp6n/KKa2sgPvo7Fz\nfQlwea9kB3akcdTYfTRKrbYtn1JuGlN2i4B7gDuBo6plg8bWz6+Ae4HBbvuZj5L974Bnmn5Hh3ol\n+4j3XU2PHMXkJ6klSUWTcYpJktQCC0KSVGRBSJKKLAhJUpEFIUkqsiCkrYiI+c1n6mxh+Q9FxHlj\nLHNBRBQ/pxIR/7HdjFInWRDSOMnM/56Z176OL2FBqKtYENLopkbEl6vrVXwvIqZHxAERcUt17ZAf\nRsQbASLi05uvx1Gd939xdV2APxuxJbJX9f6lEXF5tfxnaZxF+O6I+OrEf5vSliwIaXQHAX+VmYfS\nONfV79K41vAfZuZRNM6g+9eF910FfDAzj6BxUr9mRwDvAd4EvCci9snMS4GXMvOIzHxvh74XqS2T\n9XTfUquWZ+bmK+AtonFytuOAbzQuEQHAa64dUp3ba1Zm/qQa+gfgXU2L3JqZa6tl7wf247Wn4Ja6\nggUhja75uiCbgLnAs9WWwXh9TX8P1ZWcYpLasw5YHhG/B///Gs+vuTZyZj4LPBcRb6mGzmnxa2+I\niP7xiyq9PhaE1L73Agsj4h4aZ3M9o7DMQuDLEXE3MJPGdQLGcgWw2J3U6haezVXqgIjYMaur/EXE\npTQuyvOxmmNJbXHuU+qM34yIT9H4HVsBXFBvHKl9bkFIkorcByFJKrIgJElFFoQkqciCkCQVWRCS\npCILQpJU9P8AYzsGxmdb6ocAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "people.plot(kind = \"scatter\", x = \"height\", y = \"weight\", s=[40, 120, 200])\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": 
{}, "source": [ "Again, there are way too many options to list here: the best option is to scroll through the [Visualization](http://pandas.pydata.org/pandas-docs/stable/visualization.html) page in pandas' documentation, find the plot you are interested in and look at the example code." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Operations on `DataFrame`s\n", "Although `DataFrame`s do not try to mimic NumPy arrays, there are a few similarities. Let's create a `DataFrame` to demonstrate this:" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>8</td><td>8</td><td>9</td></tr>\n", "<tr><th>bob</th><td>10</td><td>9</td><td>9</td></tr>\n", "<tr><th>charles</th><td>4</td><td>8</td><td>2</td></tr>\n", "<tr><th>darwin</th><td>9</td><td>10</td><td>10</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 8 8 9\n", "bob 10 9 9\n", "charles 4 8 2\n", "darwin 9 10 10" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades_array = np.array([[8,8,9],[10,9,9],[4, 8, 2], [9, 10, 10]])\n", "grades = pd.DataFrame(grades_array, columns=[\"sep\", \"oct\", \"nov\"], index=[\"alice\",\"bob\",\"charles\",\"darwin\"])\n", "grades" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can apply NumPy mathematical functions on a `DataFrame`: the function is applied to all values:" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>2.828427</td><td>2.828427</td><td>3.000000</td></tr>\n", "<tr><th>bob</th><td>3.162278</td><td>3.000000</td><td>3.000000</td></tr>\n", "<tr><th>charles</th><td>2.000000</td><td>2.828427</td><td>1.414214</td></tr>\n", "<tr><th>darwin</th><td>3.000000</td><td>3.162278</td><td>3.162278</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 2.828427 2.828427 3.000000\n", "bob 3.162278 3.000000 3.000000\n", "charles 2.000000 2.828427 1.414214\n", "darwin 3.000000 3.162278 3.162278" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.sqrt(grades)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, adding a single value to a `DataFrame` will add that value to all elements in the `DataFrame`. This is called *broadcasting*:" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>9</td><td>9</td><td>10</td></tr>\n", "<tr><th>bob</th><td>11</td><td>10</td><td>10</td></tr>\n", "<tr><th>charles</th><td>5</td><td>9</td><td>3</td></tr>\n", "<tr><th>darwin</th><td>10</td><td>11</td><td>11</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 9 9 10\n", "bob 11 10 10\n", "charles 5 9 3\n", "darwin 10 11 11" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades + 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course, the same is true for all other binary operations, including arithmetic (`*`,`/`,`**`...) and conditional (`>`, `==`...) operations:" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>True</td><td>True</td><td>True</td></tr>\n", "<tr><th>bob</th><td>True</td><td>True</td><td>True</td></tr>\n", "<tr><th>charles</th><td>False</td><td>True</td><td>False</td></tr>\n", "<tr><th>darwin</th><td>True</td><td>True</td><td>True</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice True True True\n", "bob True True True\n", "charles False True False\n", "darwin True True True" ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades >= 5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Aggregation operations, such as computing the `max`, the `sum` or the `mean` of a `DataFrame`, apply to each column, and you get back a `Series` object:" ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "sep 7.75\n", "oct 8.75\n", "nov 7.50\n", "dtype: float64" ] }, "execution_count": 97, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `all` method is also an aggregation operation: it checks whether all values are `True` or not. Let's see during which months all students got a grade greater than `5`:" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "sep False\n", "oct True\n", "nov False\n", "dtype: bool" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(grades > 5).all()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most of these functions take an optional `axis` parameter which lets you specify along which axis of the `DataFrame` you want the operation executed. The default is `axis=0`, meaning that the operation is executed vertically (on each column). You can set `axis=1` to execute the operation horizontally (on each row). 
For example, let's find out which students had all grades greater than `5`:" ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice True\n", "bob True\n", "charles False\n", "darwin True\n", "dtype: bool" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(grades > 5).all(axis = 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `any` method returns `True` if any value is `True`. Let's see who got at least one grade of 10:" ] }, { "cell_type": "code", "execution_count": 100, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "alice False\n", "bob True\n", "charles False\n", "darwin True\n", "dtype: bool" ] }, "execution_count": 100, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(grades == 10).any(axis = 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you add a `Series` object to a `DataFrame` (or execute any other binary operation), pandas attempts to broadcast the operation to all *rows* in the `DataFrame`. This only works if the `Series` has the same size as the `DataFrame`'s rows. For example, let's subtract the `mean` of the `DataFrame` (a `Series` object) from the `DataFrame`:" ] }, { "cell_type": "code", "execution_count": 101, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>0.25</td><td>-0.75</td><td>1.5</td></tr>\n", "<tr><th>bob</th><td>2.25</td><td>0.25</td><td>1.5</td></tr>\n", "<tr><th>charles</th><td>-3.75</td><td>-0.75</td><td>-5.5</td></tr>\n", "<tr><th>darwin</th><td>1.25</td><td>1.25</td><td>2.5</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 0.25 -0.75 1.5\n", "bob 2.25 0.25 1.5\n", "charles -3.75 -0.75 -5.5\n", "darwin 1.25 1.25 2.5" ] }, "execution_count": 101, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades - grades.mean() # equivalent to: grades - [7.75, 8.75, 7.50]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We subtracted `7.75` from all September grades, `8.75` from October grades and `7.50` from November grades. It is equivalent to subtracting this `DataFrame`:" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>7.75</td><td>8.75</td><td>7.5</td></tr>\n", "<tr><th>bob</th><td>7.75</td><td>8.75</td><td>7.5</td></tr>\n", "<tr><th>charles</th><td>7.75</td><td>8.75</td><td>7.5</td></tr>\n", "<tr><th>darwin</th><td>7.75</td><td>8.75</td><td>7.5</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 7.75 8.75 7.5\n", "bob 7.75 8.75 7.5\n", "charles 7.75 8.75 7.5\n", "darwin 7.75 8.75 7.5" ] }, "execution_count": 102, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame([[7.75, 8.75, 7.50]]*4, index=grades.index, columns=grades.columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to subtract the global mean from every grade, here is one way to do it:" ] }, { "cell_type": "code", "execution_count": 103, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n", "<tr><th>bob</th><td>2.0</td><td>1.0</td><td>1.0</td></tr>\n", "<tr><th>charles</th><td>-4.0</td><td>0.0</td><td>-6.0</td></tr>\n", "<tr><th>darwin</th><td>1.0</td><td>2.0</td><td>2.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 0.0 0.0 1.0\n", "bob 2.0 1.0 1.0\n", "charles -4.0 0.0 -6.0\n", "darwin 1.0 2.0 2.0" ] }, "execution_count": 103, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades - grades.values.mean() # subtracts the global mean (8.00) from all grades" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Automatic alignment\n", "Similar to `Series`, when operating on multiple `DataFrame`s, pandas automatically aligns them by row index label, but also by column names. Let's create a `DataFrame` with bonus points for each person from October to December:" ] }, { "cell_type": "code", "execution_count": 104, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>oct</th><th>nov</th><th>dec</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>bob</th><td>0.0</td><td>NaN</td><td>2.0</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>darwin</th><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>charles</th><td>3.0</td><td>3.0</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " oct nov dec\n", "bob 0.0 NaN 2.0\n", "colin NaN 1.0 0.0\n", "darwin 0.0 1.0 0.0\n", "charles 3.0 3.0 0.0" ] }, "execution_count": 104, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bonus_array = np.array([[0,np.nan,2],[np.nan,1,0],[0, 1, 0], [3, 3, 0]])\n", "bonus_points = pd.DataFrame(bonus_array, columns=[\"oct\", \"nov\", \"dec\"], index=[\"bob\",\"colin\", \"darwin\", \"charles\"])\n", "bonus_points" ] }, { "cell_type": "code", "execution_count": 105, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>dec</th><th>nov</th><th>oct</th><th>sep</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n", "<tr><th>bob</th><td>NaN</td><td>NaN</td><td>9.0</td><td>NaN</td></tr>\n", "<tr><th>charles</th><td>NaN</td><td>5.0</td><td>11.0</td><td>NaN</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n", "<tr><th>darwin</th><td>NaN</td><td>11.0</td><td>10.0</td><td>NaN</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " dec nov oct sep\n", "alice NaN NaN NaN NaN\n", "bob NaN NaN 9.0 NaN\n", "charles NaN 5.0 11.0 NaN\n", "colin NaN NaN NaN NaN\n", "darwin NaN 11.0 10.0 NaN" ] }, "execution_count": 105, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades + bonus_points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looks like the addition worked in some cases but way too many elements are now empty. That's because when aligning the `DataFrame`s, some columns and rows were only present on one side, and thus they were considered missing on the other side (`NaN`). Then adding `NaN` to a number results in `NaN`, hence the result.\n", "\n", "## Handling missing data\n", "Dealing with missing data is a frequent task when working with real-life data. Pandas offers a few tools to handle missing data.\n", " \n", "Let's try to fix the problem above. For example, we can decide that missing data should result in a zero, instead of `NaN`. We can replace all `NaN` values with any value using the `fillna()` method:" ] }, { "cell_type": "code", "execution_count": 106, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>dec</th><th>nov</th><th>oct</th><th>sep</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n", "<tr><th>bob</th><td>0.0</td><td>0.0</td><td>9.0</td><td>0.0</td></tr>\n", "<tr><th>charles</th><td>0.0</td><td>5.0</td><td>11.0</td><td>0.0</td></tr>\n", "<tr><th>colin</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n", "<tr><th>darwin</th><td>0.0</td><td>11.0</td><td>10.0</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " dec nov oct sep\n", "alice 0.0 0.0 0.0 0.0\n", "bob 0.0 0.0 9.0 0.0\n", "charles 0.0 5.0 11.0 0.0\n", "colin 0.0 0.0 0.0 0.0\n", "darwin 0.0 11.0 10.0 0.0" ] }, "execution_count": 106, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(grades + bonus_points).fillna(0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's a bit unfair that we're setting grades to zero in September, though. Perhaps we should decide that missing grades are missing grades, but missing bonus points should be replaced by zeros:" ] }, { "cell_type": "code", "execution_count": 107, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>dec</th><th>nov</th><th>oct</th><th>sep</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>NaN</td><td>9.0</td><td>8.0</td><td>8.0</td></tr>\n", "<tr><th>bob</th><td>NaN</td><td>9.0</td><td>9.0</td><td>10.0</td></tr>\n", "<tr><th>charles</th><td>NaN</td><td>5.0</td><td>11.0</td><td>4.0</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n", "<tr><th>darwin</th><td>NaN</td><td>11.0</td><td>10.0</td><td>9.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " dec nov oct sep\n", "alice NaN 9.0 8.0 8.0\n", "bob NaN 9.0 9.0 10.0\n", "charles NaN 5.0 11.0 4.0\n", "colin NaN NaN NaN NaN\n", "darwin NaN 11.0 10.0 9.0" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fixed_bonus_points = bonus_points.fillna(0)\n", "fixed_bonus_points.insert(0, \"sep\", 0)\n", "fixed_bonus_points.loc[\"alice\"] = 0\n", "grades + fixed_bonus_points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's much better: although we made up some data, we have not been too unfair.\n", "\n", "Another way to handle missing data is to interpolate. Let's look at the `bonus_points` `DataFrame` again:" ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>oct</th><th>nov</th><th>dec</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>bob</th><td>0.0</td><td>NaN</td><td>2.0</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>darwin</th><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>charles</th><td>3.0</td><td>3.0</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " oct nov dec\n", "bob 0.0 NaN 2.0\n", "colin NaN 1.0 0.0\n", "darwin 0.0 1.0 0.0\n", "charles 3.0 3.0 0.0" ] }, "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bonus_points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's call the `interpolate` method. By default, it interpolates vertically (`axis=0`), so let's tell it to interpolate horizontally (`axis=1`)." ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>oct</th><th>nov</th><th>dec</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>bob</th><td>0.0</td><td>1.0</td><td>2.0</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>darwin</th><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>charles</th><td>3.0</td><td>3.0</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " oct nov dec\n", "bob 0.0 1.0 2.0\n", "colin NaN 1.0 0.0\n", "darwin 0.0 1.0 0.0\n", "charles 3.0 3.0 0.0" ] }, "execution_count": 109, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bonus_points.interpolate(axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bob had 0 bonus points in October, and 2 in December. When we interpolate for November, we get the mean: 1 bonus point. Colin had 1 bonus point in November, but we do not know how many bonus points he had in September, so we cannot interpolate; this is why there is still a missing value in October after interpolation. To fix this, we can set the September bonus points to 0 before interpolation." ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th><th>dec</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>bob</th><td>0.0</td><td>0.0</td><td>1.0</td><td>2.0</td></tr>\n", "<tr><th>colin</th><td>0.0</td><td>0.5</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>darwin</th><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n", "<tr><th>charles</th><td>0.0</td><td>3.0</td><td>3.0</td><td>0.0</td></tr>\n", "<tr><th>alice</th><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov dec\n", "bob 0.0 0.0 1.0 2.0\n", "colin 0.0 0.5 1.0 0.0\n", "darwin 0.0 0.0 1.0 0.0\n", "charles 0.0 3.0 3.0 0.0\n", "alice 0.0 0.0 0.0 0.0" ] }, "execution_count": 110, "metadata": {}, "output_type": "execute_result" } ], "source": [ "better_bonus_points = bonus_points.copy()\n", "better_bonus_points.insert(0, \"sep\", 0)\n", "better_bonus_points.loc[\"alice\"] = 0\n", "better_bonus_points = better_bonus_points.interpolate(axis=1)\n", "better_bonus_points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great, now we have reasonable bonus points everywhere. Let's find out the final grades:" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>dec</th><th>nov</th><th>oct</th><th>sep</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>NaN</td><td>9.0</td><td>8.0</td><td>8.0</td></tr>\n", "<tr><th>bob</th><td>NaN</td><td>10.0</td><td>9.0</td><td>10.0</td></tr>\n", "<tr><th>charles</th><td>NaN</td><td>5.0</td><td>11.0</td><td>4.0</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n", "<tr><th>darwin</th><td>NaN</td><td>11.0</td><td>10.0</td><td>9.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " dec nov oct sep\n", "alice NaN 9.0 8.0 8.0\n", "bob NaN 10.0 9.0 10.0\n", "charles NaN 5.0 11.0 4.0\n", "colin NaN NaN NaN NaN\n", "darwin NaN 11.0 10.0 9.0" ] }, "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades + better_bonus_points" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is slightly annoying that the September column ends up on the right. This is because the `DataFrame`s we are adding do not have the exact same columns (the `grades` `DataFrame` is missing the `\"dec\"` column), so to make things predictable, pandas orders the final columns alphabetically. To fix this, we can simply add the missing column before adding:" ] }, { "cell_type": "code", "execution_count": 112, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th><th>dec</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>8.0</td><td>8.0</td><td>9.0</td><td>NaN</td></tr>\n", "<tr><th>bob</th><td>10.0</td><td>9.0</td><td>10.0</td><td>NaN</td></tr>\n", "<tr><th>charles</th><td>4.0</td><td>11.0</td><td>5.0</td><td>NaN</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n", "<tr><th>darwin</th><td>9.0</td><td>10.0</td><td>11.0</td><td>NaN</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov dec\n", "alice 8.0 8.0 9.0 NaN\n", "bob 10.0 9.0 10.0 NaN\n", "charles 4.0 11.0 5.0 NaN\n", "colin NaN NaN NaN NaN\n", "darwin 9.0 10.0 11.0 NaN" ] }, "execution_count": 112, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grades[\"dec\"] = np.nan\n", "final_grades = grades + better_bonus_points\n", "final_grades" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There's not much we can do about December and Colin: it's bad enough that we are making up bonus points, but we can't reasonably make up grades (well I guess some teachers probably do). So let's call the `dropna()` method to get rid of rows that are full of `NaN`s:" ] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th><th>dec</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>8.0</td><td>8.0</td><td>9.0</td><td>NaN</td></tr>\n", "<tr><th>bob</th><td>10.0</td><td>9.0</td><td>10.0</td><td>NaN</td></tr>\n", "<tr><th>charles</th><td>4.0</td><td>11.0</td><td>5.0</td><td>NaN</td></tr>\n", "<tr><th>darwin</th><td>9.0</td><td>10.0</td><td>11.0</td><td>NaN</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov dec\n", "alice 8.0 8.0 9.0 NaN\n", "bob 10.0 9.0 10.0 NaN\n", "charles 4.0 11.0 5.0 NaN\n", "darwin 9.0 10.0 11.0 NaN" ] }, "execution_count": 113, "metadata": {}, "output_type": "execute_result" } ], "source": [ "final_grades_clean = final_grades.dropna(how=\"all\")\n", "final_grades_clean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's remove columns that are full of `NaN`s by setting the `axis` argument to `1`:" ] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>8.0</td><td>8.0</td><td>9.0</td></tr>\n", "<tr><th>bob</th><td>10.0</td><td>9.0</td><td>10.0</td></tr>\n", "<tr><th>charles</th><td>4.0</td><td>11.0</td><td>5.0</td></tr>\n", "<tr><th>darwin</th><td>9.0</td><td>10.0</td><td>11.0</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov\n", "alice 8.0 8.0 9.0\n", "bob 10.0 9.0 10.0\n", "charles 4.0 11.0 5.0\n", "darwin 9.0 10.0 11.0" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "final_grades_clean = final_grades_clean.dropna(axis=1, how=\"all\")\n", "final_grades_clean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Aggregating with `groupby`\n", "Similar to the SQL language, pandas allows grouping your data into groups to run calculations over each group.\n", "\n", "First, let's add some extra data about each person so we can group them, and let's go back to the `final_grades` `DataFrame` so we can see how `NaN` values are handled:" ] }, { "cell_type": "code", "execution_count": 115, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n", "<tr style=\"text-align: right;\"><th></th><th>sep</th><th>oct</th><th>nov</th><th>dec</th><th>hobby</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>alice</th><td>8.0</td><td>8.0</td><td>9.0</td><td>NaN</td><td>Biking</td></tr>\n", "<tr><th>bob</th><td>10.0</td><td>9.0</td><td>10.0</td><td>NaN</td><td>Dancing</td></tr>\n", "<tr><th>charles</th><td>4.0</td><td>11.0</td><td>5.0</td><td>NaN</td><td>NaN</td></tr>\n", "<tr><th>colin</th><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>Dancing</td></tr>\n", "<tr><th>darwin</th><td>9.0</td><td>10.0</td><td>11.0</td><td>NaN</td><td>Biking</td></tr>\n", "</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " sep oct nov dec hobby\n", "alice 8.0 8.0 9.0 NaN Biking\n", "bob 10.0 9.0 10.0 NaN Dancing\n", "charles 4.0 11.0 5.0 NaN NaN\n", "colin NaN NaN NaN NaN Dancing\n", "darwin 9.0 10.0 11.0 NaN Biking" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "final_grades[\"hobby\"] = [\"Biking\", \"Dancing\", np.nan, \"Dancing\", \"Biking\"]\n", "final_grades" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's group data in this `DataFrame` by hobby:" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grouped_grades = final_grades.groupby(\"hobby\")\n", "grouped_grades" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are ready to compute the average grade per hobby:" ] }, { "cell_type": "code", "execution_count": 117, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sepoctnovdec
hobby
Biking8.59.010.0NaN
Dancing10.09.010.0NaN
\n", "
" ], "text/plain": [ " sep oct nov dec\n", "hobby \n", "Biking 8.5 9.0 10.0 NaN\n", "Dancing 10.0 9.0 10.0 NaN" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grouped_grades.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That was easy! Note that the `NaN` grades were simply skipped when computing the means, and that rows whose `hobby` is `NaN` (charles) do not belong to any group, so they are left out of the result." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pivot tables\n", "Pandas supports spreadsheet-like [pivot tables](https://en.wikipedia.org/wiki/Pivot_table) that allow quick data summarization. To illustrate this, let's take another look at `bonus_points`, then build a simple `DataFrame` with one row per student and month:" ] }, { "cell_type": "code", "execution_count": 118, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
octnovdec
bob0.0NaN2.0
colinNaN1.00.0
darwin0.01.00.0
charles3.03.00.0
\n", "
" ], "text/plain": [ " oct nov dec\n", "bob 0.0 NaN 2.0\n", "colin NaN 1.0 0.0\n", "darwin 0.0 1.0 0.0\n", "charles 3.0 3.0 0.0" ] }, "execution_count": 118, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bonus_points" ] }, { "cell_type": "code", "execution_count": 119, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
namemonthgradebonus
0alicesep8.0NaN
1aliceoct8.0NaN
2alicenov9.0NaN
3bobsep10.00.0
4boboct9.0NaN
5bobnov10.02.0
6charlessep4.03.0
7charlesoct11.03.0
8charlesnov5.00.0
9darwinsep9.00.0
10darwinoct10.01.0
11darwinnov11.00.0
\n", "
" ], "text/plain": [ " name month grade bonus\n", "0 alice sep 8.0 NaN\n", "1 alice oct 8.0 NaN\n", "2 alice nov 9.0 NaN\n", "3 bob sep 10.0 0.0\n", "4 bob oct 9.0 NaN\n", "5 bob nov 10.0 2.0\n", "6 charles sep 4.0 3.0\n", "7 charles oct 11.0 3.0\n", "8 charles nov 5.0 0.0\n", "9 darwin sep 9.0 0.0\n", "10 darwin oct 10.0 1.0\n", "11 darwin nov 11.0 0.0" ] }, "execution_count": 119, "metadata": {}, "output_type": "execute_result" } ], "source": [ "more_grades = final_grades_clean.stack().reset_index()\n", "more_grades.columns = [\"name\", \"month\", \"grade\"]\n", "more_grades[\"bonus\"] = [np.nan, np.nan, np.nan, 0, np.nan, 2, 3, 3, 0, 0, 1, 0]\n", "more_grades" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can call the `pd.pivot_table()` function for this `DataFrame`, asking to group by the `name` column. By default, `pivot_table()` computes the mean of each numeric column:" ] }, { "cell_type": "code", "execution_count": 120, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
bonusgrade
name
aliceNaN8.333333
bob1.0000009.666667
charles2.0000006.666667
darwin0.33333310.000000
\n", "
" ], "text/plain": [ " bonus grade\n", "name \n", "alice NaN 8.333333\n", "bob 1.000000 9.666667\n", "charles 2.000000 6.666667\n", "darwin 0.333333 10.000000" ] }, "execution_count": 120, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.pivot_table(more_grades, index=\"name\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can change the aggregation function by setting the `aggfunc` argument, and we can also specify the list of columns whose values will be aggregated:" ] }, { "cell_type": "code", "execution_count": 121, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
bonusgrade
name
aliceNaN9.0
bob2.010.0
charles3.011.0
darwin1.011.0
\n", "
" ], "text/plain": [ " bonus grade\n", "name \n", "alice NaN 9.0\n", "bob 2.0 10.0\n", "charles 3.0 11.0\n", "darwin 1.0 11.0" ] }, "execution_count": 121, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.pivot_table(more_grades, index=\"name\", values=[\"grade\",\"bonus\"], aggfunc=\"max\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also specify the `columns` to spread the values over horizontally, and add an `All` row and column holding the aggregate (here, the mean) over each row and column by setting `margins=True`:" ] }, { "cell_type": "code", "execution_count": 122, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
monthnovoctsepAll
name
alice9.008.08.008.333333
bob10.009.010.009.666667
charles5.0011.04.006.666667
darwin11.0010.09.0010.000000
All8.759.57.758.666667
\n", "
" ], "text/plain": [ "month nov oct sep All\n", "name \n", "alice 9.00 8.0 8.00 8.333333\n", "bob 10.00 9.0 10.00 9.666667\n", "charles 5.00 11.0 4.00 6.666667\n", "darwin 11.00 10.0 9.00 10.000000\n", "All 8.75 9.5 7.75 8.666667" ] }, "execution_count": 122, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.pivot_table(more_grades, index=\"name\", values=\"grade\", columns=\"month\", margins=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can specify multiple index or column names, and pandas will create multi-level indices:" ] }, { "cell_type": "code", "execution_count": 123, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
bonusgrade
namemonth
alicenovNaN9.00
octNaN8.00
sepNaN8.00
bobnov2.00010.00
octNaN9.00
sep0.00010.00
charlesnov0.0005.00
oct3.00011.00
sep3.0004.00
darwinnov0.00011.00
oct1.00010.00
sep0.0009.00
All1.1258.75
\n", "
" ], "text/plain": [ " bonus grade\n", "name month \n", "alice nov NaN 9.00\n", " oct NaN 8.00\n", " sep NaN 8.00\n", "bob nov 2.000 10.00\n", " oct NaN 9.00\n", " sep 0.000 10.00\n", "charles nov 0.000 5.00\n", " oct 3.000 11.00\n", " sep 3.000 4.00\n", "darwin nov 0.000 11.00\n", " oct 1.000 10.00\n", " sep 0.000 9.00\n", "All 1.125 8.75" ] }, "execution_count": 123, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.pivot_table(more_grades, index=(\"name\", \"month\"), margins=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview functions\n", "When dealing with a large `DataFrame`, it is useful to get a quick overview of its content. Pandas offers a few functions for this. First, let's create a large `DataFrame` with a mix of numeric values, missing values and text values. Notice how Jupyter displays only the corners of the `DataFrame`:" ] }, { "cell_type": "code", "execution_count": 124, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ABCsome_textDEFGHI...QRSTUVWXYZ
0NaN11.044.0Blabla99.0NaN88.022.0165.0143.0...11.0NaN11.044.099.0NaN88.022.0165.0143.0
111.022.055.0Blabla110.0NaN99.033.0NaN154.0...22.011.022.055.0110.0NaN99.033.0NaN154.0
222.033.066.0Blabla121.011.0110.044.0NaN165.0...33.022.033.066.0121.011.0110.044.0NaN165.0
333.044.077.0Blabla132.022.0121.055.011.0NaN...44.033.044.077.0132.022.0121.055.011.0NaN
444.055.088.0Blabla143.033.0132.066.022.0NaN...55.044.055.088.0143.033.0132.066.022.0NaN
555.066.099.0Blabla154.044.0143.077.033.011.0...66.055.066.099.0154.044.0143.077.033.011.0
666.077.0110.0Blabla165.055.0154.088.044.022.0...77.066.077.0110.0165.055.0154.088.044.022.0
777.088.0121.0BlablaNaN66.0165.099.055.033.0...88.077.088.0121.0NaN66.0165.099.055.033.0
888.099.0132.0BlablaNaN77.0NaN110.066.044.0...99.088.099.0132.0NaN77.0NaN110.066.044.0
999.0110.0143.0Blabla11.088.0NaN121.077.055.0...110.099.0110.0143.011.088.0NaN121.077.055.0
10110.0121.0154.0Blabla22.099.011.0132.088.066.0...121.0110.0121.0154.022.099.011.0132.088.066.0
11121.0132.0165.0Blabla33.0110.022.0143.099.077.0...132.0121.0132.0165.033.0110.022.0143.099.077.0
12132.0143.0NaNBlabla44.0121.033.0154.0110.088.0...143.0132.0143.0NaN44.0121.033.0154.0110.088.0
13143.0154.0NaNBlabla55.0132.044.0165.0121.099.0...154.0143.0154.0NaN55.0132.044.0165.0121.099.0
14154.0165.011.0Blabla66.0143.055.0NaN132.0110.0...165.0154.0165.011.066.0143.055.0NaN132.0110.0
15165.0NaN22.0Blabla77.0154.066.0NaN143.0121.0...NaN165.0NaN22.077.0154.066.0NaN143.0121.0
16NaNNaN33.0Blabla88.0165.077.011.0154.0132.0...NaNNaNNaN33.088.0165.077.011.0154.0132.0
17NaN11.044.0Blabla99.0NaN88.022.0165.0143.0...11.0NaN11.044.099.0NaN88.022.0165.0143.0
1811.022.055.0Blabla110.0NaN99.033.0NaN154.0...22.011.022.055.0110.0NaN99.033.0NaN154.0
1922.033.066.0Blabla121.011.0110.044.0NaN165.0...33.022.033.066.0121.011.0110.044.0NaN165.0
2033.044.077.0Blabla132.022.0121.055.011.0NaN...44.033.044.077.0132.022.0121.055.011.0NaN
2144.055.088.0Blabla143.033.0132.066.022.0NaN...55.044.055.088.0143.033.0132.066.022.0NaN
2255.066.099.0Blabla154.044.0143.077.033.011.0...66.055.066.099.0154.044.0143.077.033.011.0
2366.077.0110.0Blabla165.055.0154.088.044.022.0...77.066.077.0110.0165.055.0154.088.044.022.0
2477.088.0121.0BlablaNaN66.0165.099.055.033.0...88.077.088.0121.0NaN66.0165.099.055.033.0
2588.099.0132.0BlablaNaN77.0NaN110.066.044.0...99.088.099.0132.0NaN77.0NaN110.066.044.0
2699.0110.0143.0Blabla11.088.0NaN121.077.055.0...110.099.0110.0143.011.088.0NaN121.077.055.0
27110.0121.0154.0Blabla22.099.011.0132.088.066.0...121.0110.0121.0154.022.099.011.0132.088.066.0
28121.0132.0165.0Blabla33.0110.022.0143.099.077.0...132.0121.0132.0165.033.0110.022.0143.099.077.0
29132.0143.0NaNBlabla44.0121.033.0154.0110.088.0...143.0132.0143.0NaN44.0121.033.0154.0110.088.0
..................................................................
997088.099.0132.0BlablaNaN77.0NaN110.066.044.0...99.088.099.0132.0NaN77.0NaN110.066.044.0
997199.0110.0143.0Blabla11.088.0NaN121.077.055.0...110.099.0110.0143.011.088.0NaN121.077.055.0
9972110.0121.0154.0Blabla22.099.011.0132.088.066.0...121.0110.0121.0154.022.099.011.0132.088.066.0
9973121.0132.0165.0Blabla33.0110.022.0143.099.077.0...132.0121.0132.0165.033.0110.022.0143.099.077.0
9974132.0143.0NaNBlabla44.0121.033.0154.0110.088.0...143.0132.0143.0NaN44.0121.033.0154.0110.088.0
9975143.0154.0NaNBlabla55.0132.044.0165.0121.099.0...154.0143.0154.0NaN55.0132.044.0165.0121.099.0
9976154.0165.011.0Blabla66.0143.055.0NaN132.0110.0...165.0154.0165.011.066.0143.055.0NaN132.0110.0
9977165.0NaN22.0Blabla77.0154.066.0NaN143.0121.0...NaN165.0NaN22.077.0154.066.0NaN143.0121.0
9978NaNNaN33.0Blabla88.0165.077.011.0154.0132.0...NaNNaNNaN33.088.0165.077.011.0154.0132.0
9979NaN11.044.0Blabla99.0NaN88.022.0165.0143.0...11.0NaN11.044.099.0NaN88.022.0165.0143.0
998011.022.055.0Blabla110.0NaN99.033.0NaN154.0...22.011.022.055.0110.0NaN99.033.0NaN154.0
998122.033.066.0Blabla121.011.0110.044.0NaN165.0...33.022.033.066.0121.011.0110.044.0NaN165.0
998233.044.077.0Blabla132.022.0121.055.011.0NaN...44.033.044.077.0132.022.0121.055.011.0NaN
998344.055.088.0Blabla143.033.0132.066.022.0NaN...55.044.055.088.0143.033.0132.066.022.0NaN
998455.066.099.0Blabla154.044.0143.077.033.011.0...66.055.066.099.0154.044.0143.077.033.011.0
998566.077.0110.0Blabla165.055.0154.088.044.022.0...77.066.077.0110.0165.055.0154.088.044.022.0
998677.088.0121.0BlablaNaN66.0165.099.055.033.0...88.077.088.0121.0NaN66.0165.099.055.033.0
998788.099.0132.0BlablaNaN77.0NaN110.066.044.0...99.088.099.0132.0NaN77.0NaN110.066.044.0
998899.0110.0143.0Blabla11.088.0NaN121.077.055.0...110.099.0110.0143.011.088.0NaN121.077.055.0
9989110.0121.0154.0Blabla22.099.011.0132.088.066.0...121.0110.0121.0154.022.099.011.0132.088.066.0
9990121.0132.0165.0Blabla33.0110.022.0143.099.077.0...132.0121.0132.0165.033.0110.022.0143.099.077.0
9991132.0143.0NaNBlabla44.0121.033.0154.0110.088.0...143.0132.0143.0NaN44.0121.033.0154.0110.088.0
9992143.0154.0NaNBlabla55.0132.044.0165.0121.099.0...154.0143.0154.0NaN55.0132.044.0165.0121.099.0
9993154.0165.011.0Blabla66.0143.055.0NaN132.0110.0...165.0154.0165.011.066.0143.055.0NaN132.0110.0
9994165.0NaN22.0Blabla77.0154.066.0NaN143.0121.0...NaN165.0NaN22.077.0154.066.0NaN143.0121.0
9995NaNNaN33.0Blabla88.0165.077.011.0154.0132.0...NaNNaNNaN33.088.0165.077.011.0154.0132.0
9996NaN11.044.0Blabla99.0NaN88.022.0165.0143.0...11.0NaN11.044.099.0NaN88.022.0165.0143.0
999711.022.055.0Blabla110.0NaN99.033.0NaN154.0...22.011.022.055.0110.0NaN99.033.0NaN154.0
999822.033.066.0Blabla121.011.0110.044.0NaN165.0...33.022.033.066.0121.011.0110.044.0NaN165.0
999933.044.077.0Blabla132.022.0121.055.011.0NaN...44.033.044.077.0132.022.0121.055.011.0NaN
\n", "

10000 rows × 27 columns

\n", "
" ], "text/plain": [ " A B C some_text D E F G H I \\\n", "0 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n", "1 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n", "2 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n", "3 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n", "4 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN \n", "5 55.0 66.0 99.0 Blabla 154.0 44.0 143.0 77.0 33.0 11.0 \n", "6 66.0 77.0 110.0 Blabla 165.0 55.0 154.0 88.0 44.0 22.0 \n", "7 77.0 88.0 121.0 Blabla NaN 66.0 165.0 99.0 55.0 33.0 \n", "8 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n", "9 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n", "10 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n", "11 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n", "12 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n", "13 143.0 154.0 NaN Blabla 55.0 132.0 44.0 165.0 121.0 99.0 \n", "14 154.0 165.0 11.0 Blabla 66.0 143.0 55.0 NaN 132.0 110.0 \n", "15 165.0 NaN 22.0 Blabla 77.0 154.0 66.0 NaN 143.0 121.0 \n", "16 NaN NaN 33.0 Blabla 88.0 165.0 77.0 11.0 154.0 132.0 \n", "17 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n", "18 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n", "19 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n", "20 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n", "21 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN \n", "22 55.0 66.0 99.0 Blabla 154.0 44.0 143.0 77.0 33.0 11.0 \n", "23 66.0 77.0 110.0 Blabla 165.0 55.0 154.0 88.0 44.0 22.0 \n", "24 77.0 88.0 121.0 Blabla NaN 66.0 165.0 99.0 55.0 33.0 \n", "25 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n", "26 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n", "27 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n", "28 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n", "29 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n", "... ... ... ... ... ... ... ... ... ... ... 
\n", "9970 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n", "9971 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n", "9972 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n", "9973 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n", "9974 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n", "9975 143.0 154.0 NaN Blabla 55.0 132.0 44.0 165.0 121.0 99.0 \n", "9976 154.0 165.0 11.0 Blabla 66.0 143.0 55.0 NaN 132.0 110.0 \n", "9977 165.0 NaN 22.0 Blabla 77.0 154.0 66.0 NaN 143.0 121.0 \n", "9978 NaN NaN 33.0 Blabla 88.0 165.0 77.0 11.0 154.0 132.0 \n", "9979 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n", "9980 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n", "9981 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n", "9982 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n", "9983 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN \n", "9984 55.0 66.0 99.0 Blabla 154.0 44.0 143.0 77.0 33.0 11.0 \n", "9985 66.0 77.0 110.0 Blabla 165.0 55.0 154.0 88.0 44.0 22.0 \n", "9986 77.0 88.0 121.0 Blabla NaN 66.0 165.0 99.0 55.0 33.0 \n", "9987 88.0 99.0 132.0 Blabla NaN 77.0 NaN 110.0 66.0 44.0 \n", "9988 99.0 110.0 143.0 Blabla 11.0 88.0 NaN 121.0 77.0 55.0 \n", "9989 110.0 121.0 154.0 Blabla 22.0 99.0 11.0 132.0 88.0 66.0 \n", "9990 121.0 132.0 165.0 Blabla 33.0 110.0 22.0 143.0 99.0 77.0 \n", "9991 132.0 143.0 NaN Blabla 44.0 121.0 33.0 154.0 110.0 88.0 \n", "9992 143.0 154.0 NaN Blabla 55.0 132.0 44.0 165.0 121.0 99.0 \n", "9993 154.0 165.0 11.0 Blabla 66.0 143.0 55.0 NaN 132.0 110.0 \n", "9994 165.0 NaN 22.0 Blabla 77.0 154.0 66.0 NaN 143.0 121.0 \n", "9995 NaN NaN 33.0 Blabla 88.0 165.0 77.0 11.0 154.0 132.0 \n", "9996 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 \n", "9997 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 \n", "9998 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n", "9999 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n", "\n", " ... 
Q R S T U V W X Y \\\n", "0 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n", "1 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n", "2 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n", "3 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n", "4 ... 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 \n", "5 ... 66.0 55.0 66.0 99.0 154.0 44.0 143.0 77.0 33.0 \n", "6 ... 77.0 66.0 77.0 110.0 165.0 55.0 154.0 88.0 44.0 \n", "7 ... 88.0 77.0 88.0 121.0 NaN 66.0 165.0 99.0 55.0 \n", "8 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n", "9 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n", "10 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n", "11 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n", "12 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n", "13 ... 154.0 143.0 154.0 NaN 55.0 132.0 44.0 165.0 121.0 \n", "14 ... 165.0 154.0 165.0 11.0 66.0 143.0 55.0 NaN 132.0 \n", "15 ... NaN 165.0 NaN 22.0 77.0 154.0 66.0 NaN 143.0 \n", "16 ... NaN NaN NaN 33.0 88.0 165.0 77.0 11.0 154.0 \n", "17 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n", "18 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n", "19 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n", "20 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n", "21 ... 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 \n", "22 ... 66.0 55.0 66.0 99.0 154.0 44.0 143.0 77.0 33.0 \n", "23 ... 77.0 66.0 77.0 110.0 165.0 55.0 154.0 88.0 44.0 \n", "24 ... 88.0 77.0 88.0 121.0 NaN 66.0 165.0 99.0 55.0 \n", "25 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n", "26 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n", "27 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n", "28 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n", "29 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n", "... ... ... ... ... ... ... ... ... ... ... \n", "9970 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n", "9971 ... 
110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n", "9972 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n", "9973 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n", "9974 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n", "9975 ... 154.0 143.0 154.0 NaN 55.0 132.0 44.0 165.0 121.0 \n", "9976 ... 165.0 154.0 165.0 11.0 66.0 143.0 55.0 NaN 132.0 \n", "9977 ... NaN 165.0 NaN 22.0 77.0 154.0 66.0 NaN 143.0 \n", "9978 ... NaN NaN NaN 33.0 88.0 165.0 77.0 11.0 154.0 \n", "9979 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n", "9980 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n", "9981 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n", "9982 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n", "9983 ... 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 \n", "9984 ... 66.0 55.0 66.0 99.0 154.0 44.0 143.0 77.0 33.0 \n", "9985 ... 77.0 66.0 77.0 110.0 165.0 55.0 154.0 88.0 44.0 \n", "9986 ... 88.0 77.0 88.0 121.0 NaN 66.0 165.0 99.0 55.0 \n", "9987 ... 99.0 88.0 99.0 132.0 NaN 77.0 NaN 110.0 66.0 \n", "9988 ... 110.0 99.0 110.0 143.0 11.0 88.0 NaN 121.0 77.0 \n", "9989 ... 121.0 110.0 121.0 154.0 22.0 99.0 11.0 132.0 88.0 \n", "9990 ... 132.0 121.0 132.0 165.0 33.0 110.0 22.0 143.0 99.0 \n", "9991 ... 143.0 132.0 143.0 NaN 44.0 121.0 33.0 154.0 110.0 \n", "9992 ... 154.0 143.0 154.0 NaN 55.0 132.0 44.0 165.0 121.0 \n", "9993 ... 165.0 154.0 165.0 11.0 66.0 143.0 55.0 NaN 132.0 \n", "9994 ... NaN 165.0 NaN 22.0 77.0 154.0 66.0 NaN 143.0 \n", "9995 ... NaN NaN NaN 33.0 88.0 165.0 77.0 11.0 154.0 \n", "9996 ... 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 \n", "9997 ... 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN \n", "9998 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN \n", "9999 ... 
44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 \n", "\n", " Z \n", "0 143.0 \n", "1 154.0 \n", "2 165.0 \n", "3 NaN \n", "4 NaN \n", "5 11.0 \n", "6 22.0 \n", "7 33.0 \n", "8 44.0 \n", "9 55.0 \n", "10 66.0 \n", "11 77.0 \n", "12 88.0 \n", "13 99.0 \n", "14 110.0 \n", "15 121.0 \n", "16 132.0 \n", "17 143.0 \n", "18 154.0 \n", "19 165.0 \n", "20 NaN \n", "21 NaN \n", "22 11.0 \n", "23 22.0 \n", "24 33.0 \n", "25 44.0 \n", "26 55.0 \n", "27 66.0 \n", "28 77.0 \n", "29 88.0 \n", "... ... \n", "9970 44.0 \n", "9971 55.0 \n", "9972 66.0 \n", "9973 77.0 \n", "9974 88.0 \n", "9975 99.0 \n", "9976 110.0 \n", "9977 121.0 \n", "9978 132.0 \n", "9979 143.0 \n", "9980 154.0 \n", "9981 165.0 \n", "9982 NaN \n", "9983 NaN \n", "9984 11.0 \n", "9985 22.0 \n", "9986 33.0 \n", "9987 44.0 \n", "9988 55.0 \n", "9989 66.0 \n", "9990 77.0 \n", "9991 88.0 \n", "9992 99.0 \n", "9993 110.0 \n", "9994 121.0 \n", "9995 132.0 \n", "9996 143.0 \n", "9997 154.0 \n", "9998 165.0 \n", "9999 NaN \n", "\n", "[10000 rows x 27 columns]" ] }, "execution_count": 124, "metadata": {}, "output_type": "execute_result" } ], "source": [ "much_data = np.fromfunction(lambda x,y: (x+y*y)%17*11, (10000, 26))\n", "large_df = pd.DataFrame(much_data, columns=list(\"ABCDEFGHIJKLMNOPQRSTUVWXYZ\"))\n", "large_df[large_df % 16 == 0] = np.nan\n", "large_df.insert(3,\"some_text\", \"Blabla\")\n", "large_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `head()` method returns the top 5 rows:" ] }, { "cell_type": "code", "execution_count": 125, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C some_text D E F G H I ... \\\n", "0 NaN 11.0 44.0 Blabla 99.0 NaN 88.0 22.0 165.0 143.0 ... \n", "1 11.0 22.0 55.0 Blabla 110.0 NaN 99.0 33.0 NaN 154.0 ... \n", "2 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 ... \n", "3 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN ... \n", "4 44.0 55.0 88.0 Blabla 143.0 33.0 132.0 66.0 22.0 NaN ... \n", "\n", " Q R S T U V W X Y Z \n", "0 11.0 NaN 11.0 44.0 99.0 NaN 88.0 22.0 165.0 143.0 \n", "1 22.0 11.0 22.0 55.0 110.0 NaN 99.0 33.0 NaN 154.0 \n", "2 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN 165.0 \n", "3 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 NaN \n", "4 55.0 44.0 55.0 88.0 143.0 33.0 132.0 66.0 22.0 NaN \n", "\n", "[5 rows x 27 columns]" ] }, "execution_count": 125, "metadata": {}, "output_type": "execute_result" } ], "source": [ "large_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course there's also a `tail()` function to view the bottom 5 rows. You can pass the number of rows you want:" ] }, { "cell_type": "code", "execution_count": 126, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " A B C some_text D E F G H I \\\n", "9998 22.0 33.0 66.0 Blabla 121.0 11.0 110.0 44.0 NaN 165.0 \n", "9999 33.0 44.0 77.0 Blabla 132.0 22.0 121.0 55.0 11.0 NaN \n", "\n", " ... Q R S T U V W X Y Z \n", "9998 ... 33.0 22.0 33.0 66.0 121.0 11.0 110.0 44.0 NaN 165.0 \n", "9999 ... 44.0 33.0 44.0 77.0 132.0 22.0 121.0 55.0 11.0 NaN \n", "\n", "[2 rows x 27 columns]" ] }, "execution_count": 126, "metadata": {}, "output_type": "execute_result" } ], "source": [ "large_df.tail(n=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `info()` method prints out a summary of each column's contents:" ] }, { "cell_type": "code", "execution_count": 127, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "RangeIndex: 10000 entries, 0 to 9999\n", "Data columns (total 27 columns):\n", "A 8823 non-null float64\n", "B 8824 non-null float64\n", "C 8824 non-null float64\n", "some_text 10000 non-null object\n", "D 8824 non-null float64\n", "E 8822 non-null float64\n", "F 8824 non-null float64\n", "G 8824 non-null float64\n", "H 8822 non-null float64\n", "I 8823 non-null float64\n", "J 8823 non-null float64\n", "K 8822 non-null float64\n", "L 8824 non-null float64\n", "M 8824 non-null float64\n", "N 8822 non-null float64\n", "O 8824 non-null float64\n", "P 8824 non-null float64\n", "Q 8824 non-null float64\n", "R 8823 non-null float64\n", "S 8824 non-null float64\n", "T 8824 non-null float64\n", "U 8824 non-null float64\n", "V 8822 non-null float64\n", "W 8824 non-null float64\n", "X 8824 non-null float64\n", "Y 8822 non-null float64\n", "Z 8823 non-null float64\n", "dtypes: float64(26), object(1)\n", "memory usage: 2.1+ MB\n" ] } ], "source": [ "large_df.info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, the `describe()` method gives a nice overview of the main aggregated values over each column:\n", "* `count`: number of non-null (not NaN) values\n", "* `mean`: mean of non-null
values\n", "* `std`: [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of non-null values\n", "* `min`: minimum of non-null values\n", "* `25%`, `50%`, `75%`: 25th, 50th and 75th [percentile](https://en.wikipedia.org/wiki/Percentile) of non-null values\n", "* `max`: maximum of non-null values" ] }, { "cell_type": "code", "execution_count": 128, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " A B C D E \\\n", "count 8823.000000 8824.000000 8824.000000 8824.000000 8822.000000 \n", "mean 87.977559 87.972575 87.987534 88.012466 87.983791 \n", "std 47.535911 47.535523 47.521679 47.521679 47.535001 \n", "min 11.000000 11.000000 11.000000 11.000000 11.000000 \n", "25% 44.000000 44.000000 44.000000 44.000000 44.000000 \n", "50% 88.000000 88.000000 88.000000 88.000000 88.000000 \n", "75% 132.000000 132.000000 132.000000 132.000000 132.000000 \n", "max 165.000000 165.000000 165.000000 165.000000 165.000000 \n", "\n", " F G H I J \\\n", "count 8824.000000 8824.000000 8822.000000 8823.000000 8823.000000 \n", "mean 88.007480 87.977561 88.000000 88.022441 88.022441 \n", "std 47.519371 47.529755 47.536879 47.535911 47.535911 \n", "min 11.000000 11.000000 11.000000 11.000000 11.000000 \n", "25% 44.000000 44.000000 44.000000 44.000000 44.000000 \n", "50% 88.000000 88.000000 88.000000 88.000000 88.000000 \n", "75% 132.000000 132.000000 132.000000 132.000000 132.000000 \n", "max 165.000000 165.000000 165.000000 165.000000 165.000000 \n", "\n", " ... Q R S T \\\n", "count ... 8824.000000 8823.000000 8824.000000 8824.000000 \n", "mean ... 87.972575 87.977559 87.972575 87.987534 \n", "std ... 47.535523 47.535911 47.535523 47.521679 \n", "min ... 11.000000 11.000000 11.000000 11.000000 \n", "25% ... 44.000000 44.000000 44.000000 44.000000 \n", "50% ... 88.000000 88.000000 88.000000 88.000000 \n", "75% ... 132.000000 132.000000 132.000000 132.000000 \n", "max ... 
165.000000 165.000000 165.000000 165.000000 \n", "\n", " U V W X Y \\\n", "count 8824.000000 8822.000000 8824.000000 8824.000000 8822.000000 \n", "mean 88.012466 87.983791 88.007480 87.977561 88.000000 \n", "std 47.521679 47.535001 47.519371 47.529755 47.536879 \n", "min 11.000000 11.000000 11.000000 11.000000 11.000000 \n", "25% 44.000000 44.000000 44.000000 44.000000 44.000000 \n", "50% 88.000000 88.000000 88.000000 88.000000 88.000000 \n", "75% 132.000000 132.000000 132.000000 132.000000 132.000000 \n", "max 165.000000 165.000000 165.000000 165.000000 165.000000 \n", "\n", " Z \n", "count 8823.000000 \n", "mean 88.022441 \n", "std 47.535911 \n", "min 11.000000 \n", "25% 44.000000 \n", "50% 88.000000 \n", "75% 132.000000 \n", "max 165.000000 \n", "\n", "[8 rows x 26 columns]" ] }, "execution_count": 128, "metadata": {}, "output_type": "execute_result" } ], "source": [ "large_df.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Saving & loading\n", "Pandas can save `DataFrame`s to various backends, including file formats such as CSV, Excel, JSON, HTML and HDF5, or to a SQL database. Let's create a `DataFrame` to demonstrate this:" ] }, { "cell_type": "code", "execution_count": 129, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " hobby weight birthyear children\n", "alice Biking 68.5 1985 NaN\n", "bob Dancing 83.1 1984 3.0" ] }, "execution_count": 129, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_df = pd.DataFrame(\n", "    [[\"Biking\", 68.5, 1985, np.nan], [\"Dancing\", 83.1, 1984, 3]], \n", "    columns=[\"hobby\",\"weight\",\"birthyear\",\"children\"],\n", "    index=[\"alice\", \"bob\"]\n", ")\n", "my_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saving\n", "Let's save it to CSV, HTML and JSON:" ] }, { "cell_type": "code", "execution_count": 130, "metadata": {}, "outputs": [], "source": [ "my_df.to_csv(\"my_df.csv\")\n", "my_df.to_html(\"my_df.html\")\n", "my_df.to_json(\"my_df.json\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Done! Let's take a peek at what was saved:" ] }, { "cell_type": "code", "execution_count": 131, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# my_df.csv\n", ",hobby,weight,birthyear,children\n", "alice,Biking,68.5,1985,\n", "bob,Dancing,83.1,1984,3.0\n", "\n", "\n", "# my_df.html\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>hobby</th>\n", "      <th>weight</th>\n", "      <th>birthyear</th>\n", "      <th>children</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr>\n", "      <th>alice</th>\n", "      <td>Biking</td>\n", "      <td>68.5</td>\n", "      <td>1985</td>\n", "      <td>NaN</td>\n", "    </tr>\n", "    <tr>\n", "      <th>bob</th>\n", "      <td>Dancing</td>\n", "      <td>83.1</td>\n", "      <td>1984</td>\n", "      <td>3.0</td>\n", "    </tr>\n", "  </tbody>\n", "</table>\n", "\n", "# my_df.json\n", "{\"hobby\":{\"alice\":\"Biking\",\"bob\":\"Dancing\"},\"weight\":{\"alice\":68.5,\"bob\":83.1},\"birthyear\":{\"alice\":1985,\"bob\":1984},\"children\":{\"alice\":null,\"bob\":3.0}}\n", "\n" ] } ], "source": [ "for filename in (\"my_df.csv\", \"my_df.html\", \"my_df.json\"):\n", "    print(\"#\", filename)\n", "    with open(filename, \"rt\") as f:\n", "        print(f.read())\n", "        print()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the index is saved as the first column (with no name) in a CSV file, as `<th>` tags in HTML and as keys in JSON.\n", "\n", "Saving to other formats works very similarly, but some formats require extra libraries to be installed. For example, saving to Excel requires the openpyxl library:" ] }, { "cell_type": "code", "execution_count": 132, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "No module named 'openpyxl'\n" ] } ], "source": [ "try:\n", "    my_df.to_excel(\"my_df.xlsx\", sheet_name='People')\n", "except ImportError as e:\n", "    print(e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading\n", "Now let's load our CSV file back into a `DataFrame`:" ] }, { "cell_type": "code", "execution_count": 133, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " hobby weight birthyear children\n", "alice Biking 68.5 1985 NaN\n", "bob Dancing 83.1 1984 3.0" ] }, "execution_count": 133, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_df_loaded = pd.read_csv(\"my_df.csv\", index_col=0)\n", "my_df_loaded" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you might guess, there are similar `read_json`, `read_html` and `read_excel` functions as well. We can also read data straight from the Internet. For example, let's load the top 1,000 U.S. cities from GitHub:" ] }, { "cell_type": "code", "execution_count": 134, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " State Population lat lon\n", "City \n", "Marysville Washington 63269 48.051764 -122.177082\n", "Perris California 72326 33.782519 -117.228648\n", "Cleveland Ohio 390113 41.499320 -81.694361\n", "Worcester Massachusetts 182544 42.262593 -71.802293\n", "Columbia South Carolina 133358 34.000710 -81.034814" ] }, "execution_count": 134, "metadata": {}, "output_type": "execute_result" } ], "source": [ "us_cities = None\n", "try:\n", "    csv_url = \"https://raw.githubusercontent.com/plotly/datasets/master/us-cities-top-1k.csv\"\n", "    us_cities = pd.read_csv(csv_url, index_col=0)\n", "    us_cities = us_cities.head()\n", "except IOError as e:\n", "    print(e)\n", "us_cities" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are more options available, in particular regarding datetime format. Check out the [documentation](http://pandas.pydata.org/pandas-docs/stable/io.html) for more details." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Combining `DataFrame`s\n", "\n", "## SQL-like joins\n", "One powerful feature of pandas is its ability to perform SQL-like joins on `DataFrame`s. Various types of joins are supported: inner joins, left/right outer joins and full joins. To illustrate this, let's start by creating a couple of simple `DataFrame`s:" ] }, { "cell_type": "code", "execution_count": 135, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " state city lat lng\n", "0 CA San Francisco 37.781334 -122.416728\n", "1 NY New York 40.705649 -74.008344\n", "2 FL Miami 25.791100 -80.320733\n", "3 OH Cleveland 41.473508 -81.739791\n", "4 UT Salt Lake City 40.755851 -111.896657" ] }, "execution_count": 135, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_loc = pd.DataFrame(\n", " [\n", " [\"CA\", \"San Francisco\", 37.781334, -122.416728],\n", " [\"NY\", \"New York\", 40.705649, -74.008344],\n", " [\"FL\", \"Miami\", 25.791100, -80.320733],\n", " [\"OH\", \"Cleveland\", 41.473508, -81.739791],\n", " [\"UT\", \"Salt Lake City\", 40.755851, -111.896657]\n", " ], columns=[\"state\", \"city\", \"lat\", \"lng\"])\n", "city_loc" ] }, { "cell_type": "code", "execution_count": 136, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " population city state\n", "3 808976 San Francisco California\n", "4 8363710 New York New-York\n", "5 413201 Miami Florida\n", "6 2242193 Houston Texas" ] }, "execution_count": 136, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_pop = pd.DataFrame(\n", " [\n", " [808976, \"San Francisco\", \"California\"],\n", " [8363710, \"New York\", \"New-York\"],\n", " [413201, \"Miami\", \"Florida\"],\n", " [2242193, \"Houston\", \"Texas\"]\n", " ], index=[3,4,5,6], columns=[\"population\", \"city\", \"state\"])\n", "city_pop" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's join these `DataFrame`s using the `merge()` function:" ] }, { "cell_type": "code", "execution_count": 137, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " state_x city lat lng population state_y\n", "0 CA San Francisco 37.781334 -122.416728 808976 California\n", "1 NY New York 40.705649 -74.008344 8363710 New-York\n", "2 FL Miami 25.791100 -80.320733 413201 Florida" ] }, "execution_count": 137, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.merge(left=city_loc, right=city_pop, on=\"city\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that both `DataFrame`s have a column named `state`, so in the result they got renamed to `state_x` and `state_y`.\n", "\n", "Also, note that Cleveland, Salt Lake City and Houston were dropped because they don't exist in *both* `DataFrame`s. This is the equivalent of a SQL `INNER JOIN`. If you want a `FULL OUTER JOIN`, where no city gets dropped and `NaN` values are added, you must specify `how=\"outer\"`:" ] }, { "cell_type": "code", "execution_count": 138, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " state_x city lat lng population state_y\n", "0 CA San Francisco 37.781334 -122.416728 808976.0 California\n", "1 NY New York 40.705649 -74.008344 8363710.0 New-York\n", "2 FL Miami 25.791100 -80.320733 413201.0 Florida\n", "3 OH Cleveland 41.473508 -81.739791 NaN NaN\n", "4 UT Salt Lake City 40.755851 -111.896657 NaN NaN\n", "5 NaN Houston NaN NaN 2242193.0 Texas" ] }, "execution_count": 138, "metadata": {}, "output_type": "execute_result" } ], "source": [ "all_cities = pd.merge(left=city_loc, right=city_pop, on=\"city\", how=\"outer\")\n", "all_cities" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course `LEFT OUTER JOIN` is also available by setting `how=\"left\"`: only the cities present in the left `DataFrame` end up in the result. Similarly, with `how=\"right\"` only cities in the right `DataFrame` appear in the result. For example:" ] }, { "cell_type": "code", "execution_count": 139, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " state_x city lat lng population state_y\n", "0 CA San Francisco 37.781334 -122.416728 808976 California\n", "1 NY New York 40.705649 -74.008344 8363710 New-York\n", "2 FL Miami 25.791100 -80.320733 413201 Florida\n", "3 NaN Houston NaN NaN 2242193 Texas" ] }, "execution_count": 139, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.merge(left=city_loc, right=city_pop, on=\"city\", how=\"right\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the key to join on is actually in one (or both) `DataFrame`'s index, you must use `left_index=True` and/or `right_index=True`. If the key column names differ, you must use `left_on` and `right_on`. For example:" ] }, { "cell_type": "code", "execution_count": 140, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " state_x city lat lng population name \\\n", "0 CA San Francisco 37.781334 -122.416728 808976 San Francisco \n", "1 NY New York 40.705649 -74.008344 8363710 New York \n", "2 FL Miami 25.791100 -80.320733 413201 Miami \n", "\n", " state_y \n", "0 California \n", "1 New-York \n", "2 Florida " ] }, "execution_count": 140, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_pop2 = city_pop.copy()\n", "city_pop2.columns = [\"population\", \"name\", \"state\"]\n", "pd.merge(left=city_loc, right=city_pop2, left_on=\"city\", right_on=\"name\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Concatenation\n", "Rather than joining `DataFrame`s, we may just want to concatenate them. That's what `concat()` is for:" ] }, { "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
" ], "text/plain": [ " city lat lng population state\n", "0 San Francisco 37.781334 -122.416728 NaN CA\n", "1 New York 40.705649 -74.008344 NaN NY\n", "2 Miami 25.791100 -80.320733 NaN FL\n", "3 Cleveland 41.473508 -81.739791 NaN OH\n", "4 Salt Lake City 40.755851 -111.896657 NaN UT\n", "3 San Francisco NaN NaN 808976.0 California\n", "4 New York NaN NaN 8363710.0 New-York\n", "5 Miami NaN NaN 413201.0 Florida\n", "6 Houston NaN NaN 2242193.0 Texas" ] }, "execution_count": 141, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result_concat = pd.concat([city_loc, city_pop])\n", "result_concat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that this operation aligned the data horizontally (by columns) but not vertically (by rows). In this example, we end up with multiple rows having the same index label (e.g. 3). Pandas handles this rather gracefully:" ] }, { "cell_type": "code", "execution_count": 142, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " city lat lng population state\n", "3 Cleveland 41.473508 -81.739791 NaN OH\n", "3 San Francisco NaN NaN 808976.0 California" ] }, "execution_count": 142, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result_concat.loc[3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or you can tell pandas to just ignore the index:" ] }, { "cell_type": "code", "execution_count": 143, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
citylatlngpopulationstate
0San Francisco37.781334-122.416728NaNCA
1New York40.705649-74.008344NaNNY
2Miami25.791100-80.320733NaNFL
3Cleveland41.473508-81.739791NaNOH
4Salt Lake City40.755851-111.896657NaNUT
5San FranciscoNaNNaN808976.0California
6New YorkNaNNaN8363710.0New-York
7MiamiNaNNaN413201.0Florida
8HoustonNaNNaN2242193.0Texas
\n", "
" ], "text/plain": [ " city lat lng population state\n", "0 San Francisco 37.781334 -122.416728 NaN CA\n", "1 New York 40.705649 -74.008344 NaN NY\n", "2 Miami 25.791100 -80.320733 NaN FL\n", "3 Cleveland 41.473508 -81.739791 NaN OH\n", "4 Salt Lake City 40.755851 -111.896657 NaN UT\n", "5 San Francisco NaN NaN 808976.0 California\n", "6 New York NaN NaN 8363710.0 New-York\n", "7 Miami NaN NaN 413201.0 Florida\n", "8 Houston NaN NaN 2242193.0 Texas" ] }, "execution_count": 143, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([city_loc, city_pop], ignore_index=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that when a column does not exist in a `DataFrame`, it acts as if it was filled with `NaN` values. If we set `join=\"inner\"`, then only columns that exist in *both* `DataFrame`s are returned:" ] }, { "cell_type": "code", "execution_count": 144, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
statecity
0CASan Francisco
1NYNew York
2FLMiami
3OHCleveland
4UTSalt Lake City
3CaliforniaSan Francisco
4New-YorkNew York
5FloridaMiami
6TexasHouston
\n", "
" ], "text/plain": [ " state city\n", "0 CA San Francisco\n", "1 NY New York\n", "2 FL Miami\n", "3 OH Cleveland\n", "4 UT Salt Lake City\n", "3 California San Francisco\n", "4 New-York New York\n", "5 Florida Miami\n", "6 Texas Houston" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([city_loc, city_pop], join=\"inner\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can concatenate `DataFrame`s horizontally instead of vertically by setting `axis=1`:" ] }, { "cell_type": "code", "execution_count": 145, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
statecitylatlngpopulationcitystate
0CASan Francisco37.781334-122.416728NaNNaNNaN
1NYNew York40.705649-74.008344NaNNaNNaN
2FLMiami25.791100-80.320733NaNNaNNaN
3OHCleveland41.473508-81.739791808976.0San FranciscoCalifornia
4UTSalt Lake City40.755851-111.8966578363710.0New YorkNew-York
5NaNNaNNaNNaN413201.0MiamiFlorida
6NaNNaNNaNNaN2242193.0HoustonTexas
\n", "
" ], "text/plain": [ " state city lat lng population city \\\n", "0 CA San Francisco 37.781334 -122.416728 NaN NaN \n", "1 NY New York 40.705649 -74.008344 NaN NaN \n", "2 FL Miami 25.791100 -80.320733 NaN NaN \n", "3 OH Cleveland 41.473508 -81.739791 808976.0 San Francisco \n", "4 UT Salt Lake City 40.755851 -111.896657 8363710.0 New York \n", "5 NaN NaN NaN NaN 413201.0 Miami \n", "6 NaN NaN NaN NaN 2242193.0 Houston \n", "\n", " state \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 California \n", "4 New-York \n", "5 Florida \n", "6 Texas " ] }, "execution_count": 145, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([city_loc, city_pop], axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case it really does not make much sense because the indices do not align well (eg. Cleveland and San Francisco end up on the same row, because they shared the index label `3`). So let's reindex the `DataFrame`s by city name before concatenating:" ] }, { "cell_type": "code", "execution_count": 146, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
statelatlngpopulationstate
ClevelandOH41.473508-81.739791NaNNaN
HoustonNaNNaNNaN2242193.0Texas
MiamiFL25.791100-80.320733413201.0Florida
New YorkNY40.705649-74.0083448363710.0New-York
Salt Lake CityUT40.755851-111.896657NaNNaN
San FranciscoCA37.781334-122.416728808976.0California
\n", "
" ], "text/plain": [ " state lat lng population state\n", "Cleveland OH 41.473508 -81.739791 NaN NaN\n", "Houston NaN NaN NaN 2242193.0 Texas\n", "Miami FL 25.791100 -80.320733 413201.0 Florida\n", "New York NY 40.705649 -74.008344 8363710.0 New-York\n", "Salt Lake City UT 40.755851 -111.896657 NaN NaN\n", "San Francisco CA 37.781334 -122.416728 808976.0 California" ] }, "execution_count": 146, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.concat([city_loc.set_index(\"city\"), city_pop.set_index(\"city\")], axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This looks a lot like a `FULL OUTER JOIN`, except that the `state` columns were not renamed to `state_x` and `state_y`, and the `city` column is now the index." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `append()` method is a useful shorthand for concatenating `DataFrame`s vertically:" ] }, { "cell_type": "code", "execution_count": 147, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
citylatlngpopulationstate
0San Francisco37.781334-122.416728NaNCA
1New York40.705649-74.008344NaNNY
2Miami25.791100-80.320733NaNFL
3Cleveland41.473508-81.739791NaNOH
4Salt Lake City40.755851-111.896657NaNUT
3San FranciscoNaNNaN808976.0California
4New YorkNaNNaN8363710.0New-York
5MiamiNaNNaN413201.0Florida
6HoustonNaNNaN2242193.0Texas
\n", "
" ], "text/plain": [ " city lat lng population state\n", "0 San Francisco 37.781334 -122.416728 NaN CA\n", "1 New York 40.705649 -74.008344 NaN NY\n", "2 Miami 25.791100 -80.320733 NaN FL\n", "3 Cleveland 41.473508 -81.739791 NaN OH\n", "4 Salt Lake City 40.755851 -111.896657 NaN UT\n", "3 San Francisco NaN NaN 808976.0 California\n", "4 New York NaN NaN 8363710.0 New-York\n", "5 Miami NaN NaN 413201.0 Florida\n", "6 Houston NaN NaN 2242193.0 Texas" ] }, "execution_count": 147, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_loc.append(city_pop)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As always in pandas, the `append()` method does *not* actually modify `city_loc`: it works on a copy and returns the modified copy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Categories\n", "It is quite frequent to have values that represent categories, for example `1` for female and `2` for male, or `\"A\"` for Good, `\"B\"` for Average, `\"C\"` for Bad. These categorical values can be hard to read and cumbersome to handle, but fortunately pandas makes it easy. To illustrate this, let's take the `city_pop` `DataFrame` we created earlier, and add a column that represents a category:" ] }, { "cell_type": "code", "execution_count": 148, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
populationcitystateeco_code
3808976San FranciscoCalifornia17
48363710New YorkNew-York17
5413201MiamiFlorida34
62242193HoustonTexas20
\n", "
" ], "text/plain": [ " population city state eco_code\n", "3 808976 San Francisco California 17\n", "4 8363710 New York New-York 17\n", "5 413201 Miami Florida 34\n", "6 2242193 Houston Texas 20" ] }, "execution_count": 148, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_eco = city_pop.copy()\n", "city_eco[\"eco_code\"] = [17, 17, 34, 20]\n", "city_eco" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Right now the `eco_code` column is full of apparently meaningless codes. Let's fix that. First, we will create a new categorical column based on the `eco_code`s:" ] }, { "cell_type": "code", "execution_count": 149, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Int64Index([17, 20, 34], dtype='int64')" ] }, "execution_count": 149, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_eco[\"economy\"] = city_eco[\"eco_code\"].astype('category')\n", "city_eco[\"economy\"].cat.categories" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can give each category a meaningful name:" ] }, { "cell_type": "code", "execution_count": 150, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
populationcitystateeco_codeeconomy
3808976San FranciscoCalifornia17Finance
48363710New YorkNew-York17Finance
5413201MiamiFlorida34Tourism
62242193HoustonTexas20Energy
\n", "
" ], "text/plain": [ " population city state eco_code economy\n", "3 808976 San Francisco California 17 Finance\n", "4 8363710 New York New-York 17 Finance\n", "5 413201 Miami Florida 34 Tourism\n", "6 2242193 Houston Texas 20 Energy" ] }, "execution_count": 150, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_eco[\"economy\"].cat.categories = [\"Finance\", \"Energy\", \"Tourism\"]\n", "city_eco" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that categorical values are sorted according to their categorical order, *not* their alphabetical order:" ] }, { "cell_type": "code", "execution_count": 151, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
populationcitystateeco_codeeconomy
5413201MiamiFlorida34Tourism
62242193HoustonTexas20Energy
48363710New YorkNew-York17Finance
3808976San FranciscoCalifornia17Finance
\n", "
" ], "text/plain": [ " population city state eco_code economy\n", "5 413201 Miami Florida 34 Tourism\n", "6 2242193 Houston Texas 20 Energy\n", "4 8363710 New York New-York 17 Finance\n", "3 808976 San Francisco California 17 Finance" ] }, "execution_count": 151, "metadata": {}, "output_type": "execute_result" } ], "source": [ "city_eco.sort_values(by=\"economy\", ascending=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What next?\n", "As you probably noticed by now, pandas is quite a large library with *many* features. Although we went through the most important features, there is still a lot to discover. Probably the best way to learn more is to get your hands dirty with some real-life data. It is also a good idea to go through pandas' excellent [documentation](http://pandas.pydata.org/pandas-docs/stable/index.html), in particular the [Cookbook](http://pandas.pydata.org/pandas-docs/stable/cookbook.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "toc": { "toc_cell": false, "toc_number_sections": true, "toc_section_display": "none", "toc_threshold": 6, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 4 }