{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Two Handy Tricks\n", "\n", "- [Download the lecture notes](https://philchodrow.github.io/PIC16A/content/pd/pd_5.ipynb).\n", "\n", "In this lecture, we'll go over two neat tricks with `pandas` data frames. These tricks aren't always necessary when analyzing data, but they can often make your life significantly easier." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Species</th><th>Region</th><th>Island</th><th>Culmen Length (mm)</th><th>Culmen Depth (mm)</th><th>Sex</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Adelie</td><td>Anvers</td><td>Torgersen</td><td>39.1</td><td>18.7</td><td>MALE</td></tr>\n",
"    <tr><th>1</th><td>Adelie</td><td>Anvers</td><td>Torgersen</td><td>39.5</td><td>17.4</td><td>FEMALE</td></tr>\n",
"    <tr><th>2</th><td>Adelie</td><td>Anvers</td><td>Torgersen</td><td>40.3</td><td>18.0</td><td>FEMALE</td></tr>\n",
"    <tr><th>3</th><td>Adelie</td><td>Anvers</td><td>Torgersen</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"    <tr><th>4</th><td>Adelie</td><td>Anvers</td><td>Torgersen</td><td>36.7</td><td>19.3</td><td>FEMALE</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
\n", "" ], "text/plain": [
"   Species  Region     Island  Culmen Length (mm)  Culmen Depth (mm)     Sex\n",
"0   Adelie  Anvers  Torgersen                39.1               18.7    MALE\n",
"1   Adelie  Anvers  Torgersen                39.5               17.4  FEMALE\n",
"2   Adelie  Anvers  Torgersen                40.3               18.0  FEMALE\n",
"3   Adelie  Anvers  Torgersen                 NaN                NaN     NaN\n",
"4   Adelie  Anvers  Torgersen                36.7               19.3  FEMALE" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [
"penguins = pd.read_csv(\"palmer_penguins.csv\")\n", "\n",
"cols = [\"Species\", \"Region\", \"Island\", \"Culmen Length (mm)\", \"Culmen Depth (mm)\", \"Sex\"]\n", "\n",
"# select a subset of columns\n", "penguins = penguins[cols]\n", "\n",
"# shorten the species name to its first word\n", "penguins[\"Species\"] = penguins[\"Species\"].str.split().str.get(0)\n", "\n", "penguins.head()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Recoding Columns\n", "\n", "We often want to recode the values in a column. For example, suppose we'd like to recode the sex of the penguins, replacing \"MALE\" and \"FEMALE\" with \"m\" and \"f\", respectively. We'd also like to replace the `NaN` values with \"unknown\". The most versatile way to do this is to specify the recoding map as a dictionary and pass it to the `map()` method. Any value that does not appear as a key in the dictionary is mapped to `NaN`, so the dictionary should cover every value in the column." ] },
{ "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0        MALE\n", "1      FEMALE\n", "2      FEMALE\n", "3         NaN\n", "4      FEMALE\n", "        ...  
\n", "339       NaN\n", "340    FEMALE\n", "341      MALE\n", "342    FEMALE\n", "343      MALE\n", "Name: Sex, Length: 344, dtype: object" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "penguins[\"Sex\"]" ] },
{ "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# keys are original data codes\n", "# values are new codes\n", "recode = {\n", "    \"MALE\" : \"m\",\n", "    \"FEMALE\" : \"f\",\n", "    np.nan : \"unknown\"\n", "}\n", "\n", "penguins[\"Sex\"] = penguins[\"Sex\"].map(recode)" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0            m\n", "1            f\n", "2            f\n", "3      unknown\n", "4            f\n", "        ...   \n", "339    unknown\n", "340          f\n", "341          m\n", "342          f\n", "343          m\n", "Name: Sex, Length: 344, dtype: object" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "penguins[\"Sex\"]" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## The `apply` method\n", "\n", "Somewhat confusingly, `apply()` is not the method to reach for in standard split-apply-combine operations; for those, `aggregate()` is your go-to. Use `apply()` when you want to operate on each group without necessarily producing a reduced data frame.\n", "\n", "When using `apply()` on a grouped data frame, you must supply a function whose first argument is a data frame. This function is called once per group, with that group's rows passed as the first argument." ] },
{ "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/histograms.py:839: RuntimeWarning: invalid value encountered in greater_equal\n", "  keep = (tmp_a >= first_edge)\n", "/opt/anaconda3/lib/python3.8/site-packages/numpy/lib/histograms.py:840: RuntimeWarning: invalid value encountered in less_equal\n", "  keep &= (tmp_a <= last_edge)\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: []\n", "Index: []" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" },
{ "data": { "text/plain": [ "[Figure: overlaid histograms of Culmen Length (mm), one per species]" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [
"fig, ax = plt.subplots(1)\n", "\n",
"def plot_hist(df):\n",
"    # draw one semi-transparent histogram per group on the shared axes\n",
"    ax.hist(df['Culmen Length (mm)'], alpha = 0.5)\n", "\n",
"# plot_hist returns None, so the result is an empty data frame;\n",
"# apply() is called here only for its plotting side effect\n",
"penguins.groupby(\"Species\").apply(plot_hist)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Additional arguments for the function can be passed to `apply()` itself, either positionally (`*args`) or by keyword (`**kwargs`). They are forwarded to the function after the group's data frame." ] },
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: []\n", "Index: []" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" },
{ "data": { "text/plain": [ "[Figure: overlaid histograms of Culmen Length (mm), one per species]" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [
"fig, ax = plt.subplots(1)\n", "\n",
"def plot_hist(df, colname, alpha):\n",
"    ax.hist(df[colname], alpha = alpha)\n", "\n",
"# arguments after the function are forwarded to plot_hist as colname and alpha\n",
"penguins.groupby(\"Species\").apply(plot_hist, 'Culmen Length (mm)', 0.5)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Using `apply()` can save you from writing `for`-loops over the distinct values in a column." ] } ],
"metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }