{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How many Pro Players are out there?\n",
    "\n",
    "On [PlayGwent]() only data for the top 2860 Pro players is shared, though it is likely that there are (many) more. We'll use some data science on the MMR scores of players from previous seasons to estimate how many players are actually out there. This is possible because in the very first season of Masters 2 (Season of the Wolf) the total number of active players (an MMR of 2400 or above indicates that at least 25 games were played) was about the same as that cutoff, as the lowest listed MMR that season is 2407.\n",
    "\n",
    "Assuming the distribution of MMR scores remains similar across seasons, we can leverage that to estimate the total number of players for seasons with more players. The idea is to determine, from the first season, the percentage of players with an MMR higher than 9700, 9800, 9900, 10000 and 10100, and then use those percentages to extrapolate from the number of players above each threshold to the total number of players in other seasons."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# # When running on Binder: Uncomment and execute this cell to install packages required\n",
    "# import sys\n",
    "# !conda install --yes --prefix {sys.prefix} seaborn\n",
    "# !conda install --yes --prefix {sys.prefix} nb_black"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 2;\n",
       "                var nbb_unformatted_code = \"%load_ext nb_black\\n\\nimport pandas as pd\\nimport numpy as np\\nimport seaborn as sns\\nimport matplotlib.pyplot as plt\";\n",
       "                var nbb_formatted_code = \"%load_ext nb_black\\n\\nimport pandas as pd\\nimport numpy as np\\nimport seaborn as sns\\nimport matplotlib.pyplot as plt\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "%load_ext nb_black\n",
    "\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>season</th>\n",
       "      <th>min_mmr</th>\n",
       "      <th>max_mmr</th>\n",
       "      <th>num_matches</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>M2_01 Wolf 2020</td>\n",
       "      <td>2407</td>\n",
       "      <td>10484</td>\n",
       "      <td>699496</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>M2_02 Love 2020</td>\n",
       "      <td>7776</td>\n",
       "      <td>10537</td>\n",
       "      <td>769172</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>M2_03 Bear 2020</td>\n",
       "      <td>9427</td>\n",
       "      <td>10669</td>\n",
       "      <td>862283</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>M2_04 Elf 2020</td>\n",
       "      <td>9666</td>\n",
       "      <td>10751</td>\n",
       "      <td>1004603</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>M2_05 Viper 2020</td>\n",
       "      <td>9635</td>\n",
       "      <td>10622</td>\n",
       "      <td>859640</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>M2_06 Magic 2020</td>\n",
       "      <td>9624</td>\n",
       "      <td>10597</td>\n",
       "      <td>793013</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>M2_07 Griffin 2020</td>\n",
       "      <td>9698</td>\n",
       "      <td>10667</td>\n",
       "      <td>996516</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>M2_08 Draconid 2020</td>\n",
       "      <td>9666</td>\n",
       "      <td>10546</td>\n",
       "      <td>837545</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>M2_09 Dryad 2020</td>\n",
       "      <td>9678</td>\n",
       "      <td>10725</td>\n",
       "      <td>854593</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>M2_10 Cat 2020</td>\n",
       "      <td>9703</td>\n",
       "      <td>10804</td>\n",
       "      <td>928845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>M2_11 Mahakam 2020</td>\n",
       "      <td>9706</td>\n",
       "      <td>10783</td>\n",
       "      <td>983150</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>M2_12 Wild Hunt 2020</td>\n",
       "      <td>9756</td>\n",
       "      <td>10724</td>\n",
       "      <td>1182353</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>M3_01 Wolf 2021</td>\n",
       "      <td>9637</td>\n",
       "      <td>10653</td>\n",
       "      <td>808651</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>M3_02 Love 2021</td>\n",
       "      <td>9684</td>\n",
       "      <td>10714</td>\n",
       "      <td>917027</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>M3_03 Bear 2021</td>\n",
       "      <td>9637</td>\n",
       "      <td>10576</td>\n",
       "      <td>766502</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>M3_04 Elf 2021</td>\n",
       "      <td>9686</td>\n",
       "      <td>10678</td>\n",
       "      <td>944323</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>M3_05 Viper 2021</td>\n",
       "      <td>9701</td>\n",
       "      <td>10753</td>\n",
       "      <td>956484</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>M3_06 Magic 2021</td>\n",
       "      <td>9681</td>\n",
       "      <td>10632</td>\n",
       "      <td>869262</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>M3_07 Griffin 2021</td>\n",
       "      <td>9669</td>\n",
       "      <td>10633</td>\n",
       "      <td>856103</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>M3_08 Draconid 2021</td>\n",
       "      <td>9681</td>\n",
       "      <td>10767</td>\n",
       "      <td>911273</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>M3_09 Dryad 2021</td>\n",
       "      <td>9688</td>\n",
       "      <td>10809</td>\n",
       "      <td>940655</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>M3_10 Cat 2021</td>\n",
       "      <td>9614</td>\n",
       "      <td>10366</td>\n",
       "      <td>719696</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>M3_11 Mahakam 2021</td>\n",
       "      <td>9725</td>\n",
       "      <td>10580</td>\n",
       "      <td>1017256</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>M3_12 Wild Hunt 2021</td>\n",
       "      <td>9735</td>\n",
       "      <td>10714</td>\n",
       "      <td>1044941</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>M4_01 Wolf 2022</td>\n",
       "      <td>9646</td>\n",
       "      <td>10684</td>\n",
       "      <td>883881</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  season  min_mmr  max_mmr  num_matches\n",
       "0        M2_01 Wolf 2020     2407    10484       699496\n",
       "1        M2_02 Love 2020     7776    10537       769172\n",
       "2        M2_03 Bear 2020     9427    10669       862283\n",
       "3         M2_04 Elf 2020     9666    10751      1004603\n",
       "4       M2_05 Viper 2020     9635    10622       859640\n",
       "5       M2_06 Magic 2020     9624    10597       793013\n",
       "6     M2_07 Griffin 2020     9698    10667       996516\n",
       "7    M2_08 Draconid 2020     9666    10546       837545\n",
       "8       M2_09 Dryad 2020     9678    10725       854593\n",
       "9         M2_10 Cat 2020     9703    10804       928845\n",
       "10    M2_11 Mahakam 2020     9706    10783       983150\n",
       "11  M2_12 Wild Hunt 2020     9756    10724      1182353\n",
       "12       M3_01 Wolf 2021     9637    10653       808651\n",
       "13       M3_02 Love 2021     9684    10714       917027\n",
       "14       M3_03 Bear 2021     9637    10576       766502\n",
       "15        M3_04 Elf 2021     9686    10678       944323\n",
       "16      M3_05 Viper 2021     9701    10753       956484\n",
       "17      M3_06 Magic 2021     9681    10632       869262\n",
       "18    M3_07 Griffin 2021     9669    10633       856103\n",
       "19   M3_08 Draconid 2021     9681    10767       911273\n",
       "20      M3_09 Dryad 2021     9688    10809       940655\n",
       "21        M3_10 Cat 2021     9614    10366       719696\n",
       "22    M3_11 Mahakam 2021     9725    10580      1017256\n",
       "23  M3_12 Wild Hunt 2021     9735    10714      1044941\n",
       "24       M4_01 Wolf 2022     9646    10684       883881"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 3;\n",
       "                var nbb_unformatted_code = \"df = pd.read_excel(\\\"./output/player_stats.xlsx\\\").drop(columns=[\\\"Unnamed: 0\\\"])\\nseasons_df = (\\n    df.groupby([\\\"season\\\"])\\n    .agg(\\n        min_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"min\\\"),\\n        max_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"max\\\"),\\n        num_matches=pd.NamedAgg(\\\"matches\\\", \\\"sum\\\"),\\n    )\\n    .reset_index()\\n)\\n\\nseasons_df\";\n",
       "                var nbb_formatted_code = \"df = pd.read_excel(\\\"./output/player_stats.xlsx\\\").drop(columns=[\\\"Unnamed: 0\\\"])\\nseasons_df = (\\n    df.groupby([\\\"season\\\"])\\n    .agg(\\n        min_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"min\\\"),\\n        max_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"max\\\"),\\n        num_matches=pd.NamedAgg(\\\"matches\\\", \\\"sum\\\"),\\n    )\\n    .reset_index()\\n)\\n\\nseasons_df\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "df = pd.read_excel(\"./output/player_stats.xlsx\").drop(columns=[\"Unnamed: 0\"])\n",
    "seasons_df = (\n",
    "    df.groupby([\"season\"])\n",
    "    .agg(\n",
    "        min_mmr=pd.NamedAgg(\"mmr\", \"min\"),\n",
    "        max_mmr=pd.NamedAgg(\"mmr\", \"max\"),\n",
    "        num_matches=pd.NamedAgg(\"matches\", \"sum\"),\n",
    "    )\n",
    "    .reset_index()\n",
    ")\n",
    "\n",
    "seasons_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 4;\n",
       "                var nbb_unformatted_code = \"sns.distplot(df[df[\\\"season\\\"] == \\\"M2_01 Wolf 2020\\\"][\\\"mmr\\\"], bins=50)\\nplt.show()\";\n",
       "                var nbb_formatted_code = \"sns.distplot(df[df[\\\"season\\\"] == \\\"M2_01 Wolf 2020\\\"][\\\"mmr\\\"], bins=50)\\nplt.show()\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
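    "# Histogram with a KDE overlay of the MMR distribution in the first Masters 2 season\n",
    "# (histplot replaces the deprecated distplot)\n",
    "sns.histplot(\n",
    "    df[df[\"season\"] == \"M2_01 Wolf 2020\"][\"mmr\"], bins=50, kde=True, stat=\"density\"\n",
    ")\n",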
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>percentile</th>\n",
       "      <th>mmr</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0.0</td>\n",
       "      <td>2407.000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0.5</td>\n",
       "      <td>2413.285</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1.0</td>\n",
       "      <td>2426.000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1.5</td>\n",
       "      <td>2447.855</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2.0</td>\n",
       "      <td>2477.420</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>195</th>\n",
       "      <td>97.5</td>\n",
       "      <td>10037.575</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>196</th>\n",
       "      <td>98.0</td>\n",
       "      <td>10064.000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>197</th>\n",
       "      <td>98.5</td>\n",
       "      <td>10088.145</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>198</th>\n",
       "      <td>99.0</td>\n",
       "      <td>10135.170</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>199</th>\n",
       "      <td>99.5</td>\n",
       "      <td>10290.865</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>200 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     percentile        mmr\n",
       "0           0.0   2407.000\n",
       "1           0.5   2413.285\n",
       "2           1.0   2426.000\n",
       "3           1.5   2447.855\n",
       "4           2.0   2477.420\n",
       "..          ...        ...\n",
       "195        97.5  10037.575\n",
       "196        98.0  10064.000\n",
       "197        98.5  10088.145\n",
       "198        99.0  10135.170\n",
       "199        99.5  10290.865\n",
       "\n",
       "[200 rows x 2 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 5;\n",
       "                var nbb_unformatted_code = \"percentiles = np.percentile(\\n    df[df[\\\"season\\\"] == \\\"M2_01 Wolf 2020\\\"][\\\"mmr\\\"], [x / 2 for x in range(0, 200, 1)]\\n)\\npercentiles_df = pd.DataFrame(\\n    {\\\"percentile\\\": [x / 2 for x in range(0, 200, 1)], \\\"mmr\\\": percentiles}\\n)\\npercentiles_df.to_excel(\\\"temp.xlsx\\\")\\npercentiles_df\";\n",
       "                var nbb_formatted_code = \"percentiles = np.percentile(\\n    df[df[\\\"season\\\"] == \\\"M2_01 Wolf 2020\\\"][\\\"mmr\\\"], [x / 2 for x in range(0, 200, 1)]\\n)\\npercentiles_df = pd.DataFrame(\\n    {\\\"percentile\\\": [x / 2 for x in range(0, 200, 1)], \\\"mmr\\\": percentiles}\\n)\\npercentiles_df.to_excel(\\\"temp.xlsx\\\")\\npercentiles_df\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Percentiles 0.0, 0.5, ..., 99.5 of the MMR distribution in the first Masters 2 season\n",
    "percentile_grid = [x / 2 for x in range(200)]\n",
    "percentiles = np.percentile(\n",
    "    df[df[\"season\"] == \"M2_01 Wolf 2020\"][\"mmr\"], percentile_grid\n",
    ")\n",
    "percentiles_df = pd.DataFrame({\"percentile\": percentile_grid, \"mmr\": percentiles})\n",
    "percentiles_df.to_excel(\"temp.xlsx\")\n",
    "percentiles_df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test case\n",
    "\n",
    "To test whether the method actually works, I got into Pro Rank during the Season of the Dryad and then stopped playing ranked (I actually stopped playing that season entirely, as I didn't have time to climb). As a result I ended with an unimpressive MMR of 3360, good for a rank of 12816. If our estimates work, we should get a number of that order of magnitude.\n",
    "\n",
    "We'll check where an MMR of 9678, the cutoff to appear on the pro ladder during the Season of the Dryad, falls in the first season's distribution: 74 percent of players are at or below this threshold, so 26 percent are above it. If the 2860 listed players correspond to 26% of all Pro players, the total number of players should be around 11000. That is still some way off the number I know from getting into Pro Rank and staying there. In fact, applying the same trick to my own MMR and rank gives an approximation of 13782 Pro players with an MMR of 2400 or above that season. However, for now this is close enough."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>percentile</th>\n",
       "      <th>mmr</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>148</th>\n",
       "      <td>74.0</td>\n",
       "      <td>9678.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     percentile     mmr\n",
       "148        74.0  9678.0"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 6;\n",
       "                var nbb_unformatted_code = \"# MMR cutoff to appear on pro ladder during the Season of the Dryad = 9678\\npercentiles_df[percentiles_df.mmr >= 9678][:1]\";\n",
       "                var nbb_formatted_code = \"# MMR cutoff to appear on pro ladder during the Season of the Dryad = 9678\\npercentiles_df[percentiles_df.mmr >= 9678][:1]\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# MMR cutoff to appear on pro ladder during the Season of the Dryad = 9678\n",
    "percentiles_df[percentiles_df.mmr >= 9678][:1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "11000.0"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 7;\n",
       "                var nbb_unformatted_code = \"(2860 / 26) * 100\";\n",
       "                var nbb_formatted_code = \"(2860 / 26) * 100\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "(2860 / 26) * 100"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using multiple cutoffs\n",
    "\n",
    "Because small changes in the number of players can cause a big shift in the estimate, here we'll use multiple MMR cutoffs to make the estimates and do some statistics on them to see if we can get close to the total number of players we are aware of."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 8;\n",
       "                var nbb_unformatted_code = \"seasons = list(set(df[\\\"season\\\"]))\\n\\noutput = []\\n\\nfor i in [9700, 9800, 9900, 10000, 10100]:\\n    for season in seasons:\\n        players_above_threshold = (\\n            df[(df.season == season) & (df.mmr >= i)]\\n            .groupby([\\\"season\\\"])\\n            .agg(num_players=pd.NamedAgg(\\\"mmr\\\", \\\"count\\\"))\\n            .reset_index()\\n        ).iloc[0][\\\"num_players\\\"]\\n        percentile = int(100 - percentiles_df[percentiles_df.mmr > i][:1][\\\"percentile\\\"])\\n\\n        output.append(\\n            {\\n                \\\"season\\\": season,\\n                \\\"mmr_cutoff\\\": i,\\n                \\\"num_players\\\": players_above_threshold,\\n                \\\"total_players_est\\\": players_above_threshold * 100 / percentile,\\n            }\\n        )\\n\\nestimates_df = pd.DataFrame(output).sort_values(\\\"season\\\")\";\n",
       "                var nbb_formatted_code = \"seasons = list(set(df[\\\"season\\\"]))\\n\\noutput = []\\n\\nfor i in [9700, 9800, 9900, 10000, 10100]:\\n    for season in seasons:\\n        players_above_threshold = (\\n            df[(df.season == season) & (df.mmr >= i)]\\n            .groupby([\\\"season\\\"])\\n            .agg(num_players=pd.NamedAgg(\\\"mmr\\\", \\\"count\\\"))\\n            .reset_index()\\n        ).iloc[0][\\\"num_players\\\"]\\n        percentile = int(100 - percentiles_df[percentiles_df.mmr > i][:1][\\\"percentile\\\"])\\n\\n        output.append(\\n            {\\n                \\\"season\\\": season,\\n                \\\"mmr_cutoff\\\": i,\\n                \\\"num_players\\\": players_above_threshold,\\n                \\\"total_players_est\\\": players_above_threshold * 100 / percentile,\\n            }\\n        )\\n\\nestimates_df = pd.DataFrame(output).sort_values(\\\"season\\\")\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "seasons = df[\"season\"].unique()\n",
    "\n",
    "output = []\n",
    "\n",
    "for mmr_cutoff in [9700, 9800, 9900, 10000, 10100]:\n",
    "    # Share (in whole percent) of first-season players above the cutoff.\n",
    "    # Note: int() truncates, so e.g. 25.5 becomes 25.\n",
    "    above_cutoff = percentiles_df[percentiles_df.mmr > mmr_cutoff]\n",
    "    pct_above = int(100 - above_cutoff[\"percentile\"].iloc[0])\n",
    "\n",
    "    for season in seasons:\n",
    "        # Number of listed players in this season at or above the cutoff\n",
    "        players_above_threshold = ((df.season == season) & (df.mmr >= mmr_cutoff)).sum()\n",
    "\n",
    "        output.append(\n",
    "            {\n",
    "                \"season\": season,\n",
    "                \"mmr_cutoff\": mmr_cutoff,\n",
    "                \"num_players\": players_above_threshold,\n",
    "                # Extrapolate: these players make up pct_above percent of all players\n",
    "                \"total_players_est\": players_above_threshold * 100 / pct_above,\n",
    "            }\n",
    "        )\n",
    "\n",
    "estimates_df = pd.DataFrame(output).sort_values(\"season\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>season</th>\n",
       "      <th>low_estimate</th>\n",
       "      <th>high_estimate</th>\n",
       "      <th>mean_estimate</th>\n",
       "      <th>std_err</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>M2_01 Wolf 2020</td>\n",
       "      <td>2900.000000</td>\n",
       "      <td>3600.0</td>\n",
       "      <td>3117.636364</td>\n",
       "      <td>124.944153</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>M2_02 Love 2020</td>\n",
       "      <td>4566.666667</td>\n",
       "      <td>7100.0</td>\n",
       "      <td>5620.242424</td>\n",
       "      <td>441.315936</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>M2_03 Bear 2020</td>\n",
       "      <td>6036.363636</td>\n",
       "      <td>10300.0</td>\n",
       "      <td>7329.272727</td>\n",
       "      <td>760.229978</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>M2_04 Elf 2020</td>\n",
       "      <td>9927.272727</td>\n",
       "      <td>18000.0</td>\n",
       "      <td>12319.454545</td>\n",
       "      <td>1494.370140</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>M2_05 Viper 2020</td>\n",
       "      <td>7766.666667</td>\n",
       "      <td>11400.0</td>\n",
       "      <td>9372.060606</td>\n",
       "      <td>727.332197</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>M2_06 Magic 2020</td>\n",
       "      <td>6800.000000</td>\n",
       "      <td>9800.0</td>\n",
       "      <td>8320.181818</td>\n",
       "      <td>618.201230</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>M2_07 Griffin 2020</td>\n",
       "      <td>12836.363636</td>\n",
       "      <td>19900.0</td>\n",
       "      <td>14683.272727</td>\n",
       "      <td>1331.618216</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>M2_08 Draconid 2020</td>\n",
       "      <td>9566.666667</td>\n",
       "      <td>13300.0</td>\n",
       "      <td>11186.242424</td>\n",
       "      <td>696.853229</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>M2_09 Dryad 2020</td>\n",
       "      <td>9733.333333</td>\n",
       "      <td>12580.0</td>\n",
       "      <td>11218.666667</td>\n",
       "      <td>458.757501</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>M2_10 Cat 2020</td>\n",
       "      <td>12800.000000</td>\n",
       "      <td>14620.0</td>\n",
       "      <td>13774.181818</td>\n",
       "      <td>369.275995</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>M2_11 Mahakam 2020</td>\n",
       "      <td>12995.454545</td>\n",
       "      <td>18900.0</td>\n",
       "      <td>16041.757576</td>\n",
       "      <td>976.287680</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>M2_12 Wild Hunt 2020</td>\n",
       "      <td>12995.454545</td>\n",
       "      <td>35900.0</td>\n",
       "      <td>23051.757576</td>\n",
       "      <td>3678.743988</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>M3_01 Wolf 2021</td>\n",
       "      <td>7968.181818</td>\n",
       "      <td>13700.0</td>\n",
       "      <td>9929.636364</td>\n",
       "      <td>1046.105394</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>M3_02 Love 2021</td>\n",
       "      <td>11645.454545</td>\n",
       "      <td>19500.0</td>\n",
       "      <td>13977.757576</td>\n",
       "      <td>1430.087791</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>M3_03 Bear 2021</td>\n",
       "      <td>7466.666667</td>\n",
       "      <td>10900.0</td>\n",
       "      <td>9195.333333</td>\n",
       "      <td>715.137594</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>M3_04 Elf 2021</td>\n",
       "      <td>11586.363636</td>\n",
       "      <td>20300.0</td>\n",
       "      <td>14814.606061</td>\n",
       "      <td>1527.809485</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>M3_05 Viper 2021</td>\n",
       "      <td>12990.909091</td>\n",
       "      <td>20900.0</td>\n",
       "      <td>16159.515152</td>\n",
       "      <td>1322.547676</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>M3_06 Magic 2021</td>\n",
       "      <td>11263.636364</td>\n",
       "      <td>17200.0</td>\n",
       "      <td>13526.060606</td>\n",
       "      <td>1034.558786</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>M3_07 Griffin 2021</td>\n",
       "      <td>10140.909091</td>\n",
       "      <td>15800.0</td>\n",
       "      <td>12241.515152</td>\n",
       "      <td>1040.267969</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>M3_08 Draconid 2021</td>\n",
       "      <td>11200.000000</td>\n",
       "      <td>16900.0</td>\n",
       "      <td>12854.545455</td>\n",
       "      <td>1051.176279</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>M3_09 Dryad 2021</td>\n",
       "      <td>11868.181818</td>\n",
       "      <td>15900.0</td>\n",
       "      <td>13442.969697</td>\n",
       "      <td>746.264455</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>M3_10 Cat 2021</td>\n",
       "      <td>5100.000000</td>\n",
       "      <td>7770.0</td>\n",
       "      <td>6559.939394</td>\n",
       "      <td>575.186402</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>M3_11 Mahakam 2021</td>\n",
       "      <td>12981.818182</td>\n",
       "      <td>23300.0</td>\n",
       "      <td>17920.363636</td>\n",
       "      <td>1647.708090</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>M3_12 Wild Hunt 2021</td>\n",
       "      <td>12981.818182</td>\n",
       "      <td>27500.0</td>\n",
       "      <td>19567.696970</td>\n",
       "      <td>2331.781053</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>M4_01 Wolf 2022</td>\n",
       "      <td>8690.909091</td>\n",
       "      <td>18900.0</td>\n",
       "      <td>12063.515152</td>\n",
       "      <td>1789.893240</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  season  low_estimate  high_estimate  mean_estimate  \\\n",
       "0        M2_01 Wolf 2020   2900.000000         3600.0    3117.636364   \n",
       "1        M2_02 Love 2020   4566.666667         7100.0    5620.242424   \n",
       "2        M2_03 Bear 2020   6036.363636        10300.0    7329.272727   \n",
       "3         M2_04 Elf 2020   9927.272727        18000.0   12319.454545   \n",
       "4       M2_05 Viper 2020   7766.666667        11400.0    9372.060606   \n",
       "5       M2_06 Magic 2020   6800.000000         9800.0    8320.181818   \n",
       "6     M2_07 Griffin 2020  12836.363636        19900.0   14683.272727   \n",
       "7    M2_08 Draconid 2020   9566.666667        13300.0   11186.242424   \n",
       "8       M2_09 Dryad 2020   9733.333333        12580.0   11218.666667   \n",
       "9         M2_10 Cat 2020  12800.000000        14620.0   13774.181818   \n",
       "10    M2_11 Mahakam 2020  12995.454545        18900.0   16041.757576   \n",
       "11  M2_12 Wild Hunt 2020  12995.454545        35900.0   23051.757576   \n",
       "12       M3_01 Wolf 2021   7968.181818        13700.0    9929.636364   \n",
       "13       M3_02 Love 2021  11645.454545        19500.0   13977.757576   \n",
       "14       M3_03 Bear 2021   7466.666667        10900.0    9195.333333   \n",
       "15        M3_04 Elf 2021  11586.363636        20300.0   14814.606061   \n",
       "16      M3_05 Viper 2021  12990.909091        20900.0   16159.515152   \n",
       "17      M3_06 Magic 2021  11263.636364        17200.0   13526.060606   \n",
       "18    M3_07 Griffin 2021  10140.909091        15800.0   12241.515152   \n",
       "19   M3_08 Draconid 2021  11200.000000        16900.0   12854.545455   \n",
       "20      M3_09 Dryad 2021  11868.181818        15900.0   13442.969697   \n",
       "21        M3_10 Cat 2021   5100.000000         7770.0    6559.939394   \n",
       "22    M3_11 Mahakam 2021  12981.818182        23300.0   17920.363636   \n",
       "23  M3_12 Wild Hunt 2021  12981.818182        27500.0   19567.696970   \n",
       "24       M4_01 Wolf 2022   8690.909091        18900.0   12063.515152   \n",
       "\n",
       "        std_err  \n",
       "0    124.944153  \n",
       "1    441.315936  \n",
       "2    760.229978  \n",
       "3   1494.370140  \n",
       "4    727.332197  \n",
       "5    618.201230  \n",
       "6   1331.618216  \n",
       "7    696.853229  \n",
       "8    458.757501  \n",
       "9    369.275995  \n",
       "10   976.287680  \n",
       "11  3678.743988  \n",
       "12  1046.105394  \n",
       "13  1430.087791  \n",
       "14   715.137594  \n",
       "15  1527.809485  \n",
       "16  1322.547676  \n",
       "17  1034.558786  \n",
       "18  1040.267969  \n",
       "19  1051.176279  \n",
       "20   746.264455  \n",
       "21   575.186402  \n",
       "22  1647.708090  \n",
       "23  2331.781053  \n",
       "24  1789.893240  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 9;\n",
       "                var nbb_unformatted_code = \"estimate_summary = (\\n    estimates_df.groupby([\\\"season\\\"])\\n    .agg(\\n        low_estimate=pd.NamedAgg(\\\"total_players_est\\\", \\\"min\\\"),\\n        high_estimate=pd.NamedAgg(\\\"total_players_est\\\", \\\"max\\\"),\\n        mean_estimate=pd.NamedAgg(\\\"total_players_est\\\", \\\"mean\\\"),\\n        std_err=pd.NamedAgg(\\\"total_players_est\\\", \\\"sem\\\"),\\n    )\\n    .reset_index()\\n)\\nestimate_summary.to_excel(\\\"./output/player_estimates.xlsx\\\")\\nestimate_summary\";\n",
       "                var nbb_formatted_code = \"estimate_summary = (\\n    estimates_df.groupby([\\\"season\\\"])\\n    .agg(\\n        low_estimate=pd.NamedAgg(\\\"total_players_est\\\", \\\"min\\\"),\\n        high_estimate=pd.NamedAgg(\\\"total_players_est\\\", \\\"max\\\"),\\n        mean_estimate=pd.NamedAgg(\\\"total_players_est\\\", \\\"mean\\\"),\\n        std_err=pd.NamedAgg(\\\"total_players_est\\\", \\\"sem\\\"),\\n    )\\n    .reset_index()\\n)\\nestimate_summary.to_excel(\\\"./output/player_estimates.xlsx\\\")\\nestimate_summary\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "estimate_summary = (\n",
    "    estimates_df.groupby([\"season\"])\n",
    "    .agg(\n",
    "        low_estimate=pd.NamedAgg(\"total_players_est\", \"min\"),\n",
    "        high_estimate=pd.NamedAgg(\"total_players_est\", \"max\"),\n",
    "        mean_estimate=pd.NamedAgg(\"total_players_est\", \"mean\"),\n",
    "        std_err=pd.NamedAgg(\"total_players_est\", \"sem\"),\n",
    "    )\n",
    "    .reset_index()\n",
    ")\n",
    "estimate_summary.to_excel(\"./output/player_estimates.xlsx\")\n",
    "estimate_summary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Results ... maybe ...\n",
    "\n",
    "The minimum and maximum estimates of the number of Pro Players can differ a lot, and for the Season of the Dryad even the maximum estimate is still at least 1000 players shy of the ground truth. Still, given the scarcity of data and the assumptions we need to make (the distribution is unlikely to remain exactly the same across seasons, since buffs/nerfs to cards and new cards can affect how easy it is to reach a certain fMMR with specific factions), this is the best that can be done with the data at hand, and it is probably close enough.\n",
    "\n",
    "One thing I wanted to check is whether there is a correlation between the number of games played by the top 2860 players and the estimated total number of players. This can easily be done by merging our estimates with the number of matches in the season summaries and plotting them in a scatter plot (or a regplot, to add a regression line to the plot). We'll do this for the top 500 players separately as well, to see if the trend holds for that section of the players."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>season</th>\n",
       "      <th>min_mmr</th>\n",
       "      <th>max_mmr</th>\n",
       "      <th>num_matches</th>\n",
       "      <th>num_matches_top500</th>\n",
       "      <th>low_estimate</th>\n",
       "      <th>high_estimate</th>\n",
       "      <th>mean_estimate</th>\n",
       "      <th>std_err</th>\n",
       "      <th>Masters</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>M2_01 Wolf 2020</td>\n",
       "      <td>2407</td>\n",
       "      <td>10484</td>\n",
       "      <td>699496</td>\n",
       "      <td>178323</td>\n",
       "      <td>2900.000000</td>\n",
       "      <td>3600.0</td>\n",
       "      <td>3117.636364</td>\n",
       "      <td>124.944153</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>M2_02 Love 2020</td>\n",
       "      <td>7776</td>\n",
       "      <td>10537</td>\n",
       "      <td>769172</td>\n",
       "      <td>183972</td>\n",
       "      <td>4566.666667</td>\n",
       "      <td>7100.0</td>\n",
       "      <td>5620.242424</td>\n",
       "      <td>441.315936</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>M2_03 Bear 2020</td>\n",
       "      <td>9427</td>\n",
       "      <td>10669</td>\n",
       "      <td>862283</td>\n",
       "      <td>205439</td>\n",
       "      <td>6036.363636</td>\n",
       "      <td>10300.0</td>\n",
       "      <td>7329.272727</td>\n",
       "      <td>760.229978</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>M2_04 Elf 2020</td>\n",
       "      <td>9666</td>\n",
       "      <td>10751</td>\n",
       "      <td>1004603</td>\n",
       "      <td>251712</td>\n",
       "      <td>9927.272727</td>\n",
       "      <td>18000.0</td>\n",
       "      <td>12319.454545</td>\n",
       "      <td>1494.370140</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>M2_05 Viper 2020</td>\n",
       "      <td>9635</td>\n",
       "      <td>10622</td>\n",
       "      <td>859640</td>\n",
       "      <td>207622</td>\n",
       "      <td>7766.666667</td>\n",
       "      <td>11400.0</td>\n",
       "      <td>9372.060606</td>\n",
       "      <td>727.332197</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>M2_06 Magic 2020</td>\n",
       "      <td>9624</td>\n",
       "      <td>10597</td>\n",
       "      <td>793013</td>\n",
       "      <td>188536</td>\n",
       "      <td>6800.000000</td>\n",
       "      <td>9800.0</td>\n",
       "      <td>8320.181818</td>\n",
       "      <td>618.201230</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>M2_07 Griffin 2020</td>\n",
       "      <td>9698</td>\n",
       "      <td>10667</td>\n",
       "      <td>996516</td>\n",
       "      <td>259713</td>\n",
       "      <td>12836.363636</td>\n",
       "      <td>19900.0</td>\n",
       "      <td>14683.272727</td>\n",
       "      <td>1331.618216</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>M2_08 Draconid 2020</td>\n",
       "      <td>9666</td>\n",
       "      <td>10546</td>\n",
       "      <td>837545</td>\n",
       "      <td>209785</td>\n",
       "      <td>9566.666667</td>\n",
       "      <td>13300.0</td>\n",
       "      <td>11186.242424</td>\n",
       "      <td>696.853229</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>M2_09 Dryad 2020</td>\n",
       "      <td>9678</td>\n",
       "      <td>10725</td>\n",
       "      <td>854593</td>\n",
       "      <td>202099</td>\n",
       "      <td>9733.333333</td>\n",
       "      <td>12580.0</td>\n",
       "      <td>11218.666667</td>\n",
       "      <td>458.757501</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>M2_10 Cat 2020</td>\n",
       "      <td>9703</td>\n",
       "      <td>10804</td>\n",
       "      <td>928845</td>\n",
       "      <td>213867</td>\n",
       "      <td>12800.000000</td>\n",
       "      <td>14620.0</td>\n",
       "      <td>13774.181818</td>\n",
       "      <td>369.275995</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>M2_11 Mahakam 2020</td>\n",
       "      <td>9706</td>\n",
       "      <td>10783</td>\n",
       "      <td>983150</td>\n",
       "      <td>230710</td>\n",
       "      <td>12995.454545</td>\n",
       "      <td>18900.0</td>\n",
       "      <td>16041.757576</td>\n",
       "      <td>976.287680</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>M2_12 Wild Hunt 2020</td>\n",
       "      <td>9756</td>\n",
       "      <td>10724</td>\n",
       "      <td>1182353</td>\n",
       "      <td>290718</td>\n",
       "      <td>12995.454545</td>\n",
       "      <td>35900.0</td>\n",
       "      <td>23051.757576</td>\n",
       "      <td>3678.743988</td>\n",
       "      <td>M2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>M3_01 Wolf 2021</td>\n",
       "      <td>9637</td>\n",
       "      <td>10653</td>\n",
       "      <td>808651</td>\n",
       "      <td>224998</td>\n",
       "      <td>7968.181818</td>\n",
       "      <td>13700.0</td>\n",
       "      <td>9929.636364</td>\n",
       "      <td>1046.105394</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>M3_02 Love 2021</td>\n",
       "      <td>9684</td>\n",
       "      <td>10714</td>\n",
       "      <td>917027</td>\n",
       "      <td>243266</td>\n",
       "      <td>11645.454545</td>\n",
       "      <td>19500.0</td>\n",
       "      <td>13977.757576</td>\n",
       "      <td>1430.087791</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>M3_03 Bear 2021</td>\n",
       "      <td>9637</td>\n",
       "      <td>10576</td>\n",
       "      <td>766502</td>\n",
       "      <td>189128</td>\n",
       "      <td>7466.666667</td>\n",
       "      <td>10900.0</td>\n",
       "      <td>9195.333333</td>\n",
       "      <td>715.137594</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15</th>\n",
       "      <td>M3_04 Elf 2021</td>\n",
       "      <td>9686</td>\n",
       "      <td>10678</td>\n",
       "      <td>944323</td>\n",
       "      <td>242792</td>\n",
       "      <td>11586.363636</td>\n",
       "      <td>20300.0</td>\n",
       "      <td>14814.606061</td>\n",
       "      <td>1527.809485</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16</th>\n",
       "      <td>M3_05 Viper 2021</td>\n",
       "      <td>9701</td>\n",
       "      <td>10753</td>\n",
       "      <td>956484</td>\n",
       "      <td>240472</td>\n",
       "      <td>12990.909091</td>\n",
       "      <td>20900.0</td>\n",
       "      <td>16159.515152</td>\n",
       "      <td>1322.547676</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>17</th>\n",
       "      <td>M3_06 Magic 2021</td>\n",
       "      <td>9681</td>\n",
       "      <td>10632</td>\n",
       "      <td>869262</td>\n",
       "      <td>215751</td>\n",
       "      <td>11263.636364</td>\n",
       "      <td>17200.0</td>\n",
       "      <td>13526.060606</td>\n",
       "      <td>1034.558786</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>M3_07 Griffin 2021</td>\n",
       "      <td>9669</td>\n",
       "      <td>10633</td>\n",
       "      <td>856103</td>\n",
       "      <td>212764</td>\n",
       "      <td>10140.909091</td>\n",
       "      <td>15800.0</td>\n",
       "      <td>12241.515152</td>\n",
       "      <td>1040.267969</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>M3_08 Draconid 2021</td>\n",
       "      <td>9681</td>\n",
       "      <td>10767</td>\n",
       "      <td>911273</td>\n",
       "      <td>232360</td>\n",
       "      <td>11200.000000</td>\n",
       "      <td>16900.0</td>\n",
       "      <td>12854.545455</td>\n",
       "      <td>1051.176279</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>M3_09 Dryad 2021</td>\n",
       "      <td>9688</td>\n",
       "      <td>10809</td>\n",
       "      <td>940655</td>\n",
       "      <td>223832</td>\n",
       "      <td>11868.181818</td>\n",
       "      <td>15900.0</td>\n",
       "      <td>13442.969697</td>\n",
       "      <td>746.264455</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>M3_10 Cat 2021</td>\n",
       "      <td>9614</td>\n",
       "      <td>10366</td>\n",
       "      <td>719696</td>\n",
       "      <td>149104</td>\n",
       "      <td>5100.000000</td>\n",
       "      <td>7770.0</td>\n",
       "      <td>6559.939394</td>\n",
       "      <td>575.186402</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>22</th>\n",
       "      <td>M3_11 Mahakam 2021</td>\n",
       "      <td>9725</td>\n",
       "      <td>10580</td>\n",
       "      <td>1017256</td>\n",
       "      <td>247078</td>\n",
       "      <td>12981.818182</td>\n",
       "      <td>23300.0</td>\n",
       "      <td>17920.363636</td>\n",
       "      <td>1647.708090</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>M3_12 Wild Hunt 2021</td>\n",
       "      <td>9735</td>\n",
       "      <td>10714</td>\n",
       "      <td>1044941</td>\n",
       "      <td>258469</td>\n",
       "      <td>12981.818182</td>\n",
       "      <td>27500.0</td>\n",
       "      <td>19567.696970</td>\n",
       "      <td>2331.781053</td>\n",
       "      <td>M3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>24</th>\n",
       "      <td>M4_01 Wolf 2022</td>\n",
       "      <td>9646</td>\n",
       "      <td>10684</td>\n",
       "      <td>883881</td>\n",
       "      <td>226411</td>\n",
       "      <td>8690.909091</td>\n",
       "      <td>18900.0</td>\n",
       "      <td>12063.515152</td>\n",
       "      <td>1789.893240</td>\n",
       "      <td>M4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  season  min_mmr  max_mmr  num_matches  num_matches_top500  \\\n",
       "0        M2_01 Wolf 2020     2407    10484       699496              178323   \n",
       "1        M2_02 Love 2020     7776    10537       769172              183972   \n",
       "2        M2_03 Bear 2020     9427    10669       862283              205439   \n",
       "3         M2_04 Elf 2020     9666    10751      1004603              251712   \n",
       "4       M2_05 Viper 2020     9635    10622       859640              207622   \n",
       "5       M2_06 Magic 2020     9624    10597       793013              188536   \n",
       "6     M2_07 Griffin 2020     9698    10667       996516              259713   \n",
       "7    M2_08 Draconid 2020     9666    10546       837545              209785   \n",
       "8       M2_09 Dryad 2020     9678    10725       854593              202099   \n",
       "9         M2_10 Cat 2020     9703    10804       928845              213867   \n",
       "10    M2_11 Mahakam 2020     9706    10783       983150              230710   \n",
       "11  M2_12 Wild Hunt 2020     9756    10724      1182353              290718   \n",
       "12       M3_01 Wolf 2021     9637    10653       808651              224998   \n",
       "13       M3_02 Love 2021     9684    10714       917027              243266   \n",
       "14       M3_03 Bear 2021     9637    10576       766502              189128   \n",
       "15        M3_04 Elf 2021     9686    10678       944323              242792   \n",
       "16      M3_05 Viper 2021     9701    10753       956484              240472   \n",
       "17      M3_06 Magic 2021     9681    10632       869262              215751   \n",
       "18    M3_07 Griffin 2021     9669    10633       856103              212764   \n",
       "19   M3_08 Draconid 2021     9681    10767       911273              232360   \n",
       "20      M3_09 Dryad 2021     9688    10809       940655              223832   \n",
       "21        M3_10 Cat 2021     9614    10366       719696              149104   \n",
       "22    M3_11 Mahakam 2021     9725    10580      1017256              247078   \n",
       "23  M3_12 Wild Hunt 2021     9735    10714      1044941              258469   \n",
       "24       M4_01 Wolf 2022     9646    10684       883881              226411   \n",
       "\n",
       "    low_estimate  high_estimate  mean_estimate      std_err Masters  \n",
       "0    2900.000000         3600.0    3117.636364   124.944153      M2  \n",
       "1    4566.666667         7100.0    5620.242424   441.315936      M2  \n",
       "2    6036.363636        10300.0    7329.272727   760.229978      M2  \n",
       "3    9927.272727        18000.0   12319.454545  1494.370140      M2  \n",
       "4    7766.666667        11400.0    9372.060606   727.332197      M2  \n",
       "5    6800.000000         9800.0    8320.181818   618.201230      M2  \n",
       "6   12836.363636        19900.0   14683.272727  1331.618216      M2  \n",
       "7    9566.666667        13300.0   11186.242424   696.853229      M2  \n",
       "8    9733.333333        12580.0   11218.666667   458.757501      M2  \n",
       "9   12800.000000        14620.0   13774.181818   369.275995      M2  \n",
       "10  12995.454545        18900.0   16041.757576   976.287680      M2  \n",
       "11  12995.454545        35900.0   23051.757576  3678.743988      M2  \n",
       "12   7968.181818        13700.0    9929.636364  1046.105394      M3  \n",
       "13  11645.454545        19500.0   13977.757576  1430.087791      M3  \n",
       "14   7466.666667        10900.0    9195.333333   715.137594      M3  \n",
       "15  11586.363636        20300.0   14814.606061  1527.809485      M3  \n",
       "16  12990.909091        20900.0   16159.515152  1322.547676      M3  \n",
       "17  11263.636364        17200.0   13526.060606  1034.558786      M3  \n",
       "18  10140.909091        15800.0   12241.515152  1040.267969      M3  \n",
       "19  11200.000000        16900.0   12854.545455  1051.176279      M3  \n",
       "20  11868.181818        15900.0   13442.969697   746.264455      M3  \n",
       "21   5100.000000         7770.0    6559.939394   575.186402      M3  \n",
       "22  12981.818182        23300.0   17920.363636  1647.708090      M3  \n",
       "23  12981.818182        27500.0   19567.696970  2331.781053      M3  \n",
       "24   8690.909091        18900.0   12063.515152  1789.893240      M4  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "application/javascript": [
       "\n",
       "            setTimeout(function() {\n",
       "                var nbb_cell_id = 13;\n",
       "                var nbb_unformatted_code = \"# same thing but only considering the top 500 players\\nseasons_top500only_df = (\\n    df[pd.to_numeric(df[\\\"rank\\\"]) <= 500]\\n    .groupby([\\\"season\\\"])\\n    .agg(\\n        min_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"min\\\"),\\n        max_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"max\\\"),\\n        num_matches=pd.NamedAgg(\\\"matches\\\", \\\"sum\\\"),\\n    )\\n    .reset_index()\\n)\\n\\nmerged_df = pd.merge(\\n    seasons_df,\\n    seasons_top500only_df,\\n    how=\\\"inner\\\",\\n    on=\\\"season\\\",\\n    suffixes=(\\\"\\\", \\\"_top500\\\"),\\n)\\nmerged_df = pd.merge(merged_df, estimate_summary, how=\\\"inner\\\", on=\\\"season\\\").drop(\\n    columns=[\\\"min_mmr_top500\\\", \\\"max_mmr_top500\\\"]\\n)\\n\\nmerged_df[\\\"Masters\\\"] = merged_df[\\\"season\\\"].apply(lambda x: x.split('_')[0])\\n\\nmerged_df\";\n",
       "                var nbb_formatted_code = \"# same thing but only considering the top 500 players\\nseasons_top500only_df = (\\n    df[pd.to_numeric(df[\\\"rank\\\"]) <= 500]\\n    .groupby([\\\"season\\\"])\\n    .agg(\\n        min_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"min\\\"),\\n        max_mmr=pd.NamedAgg(\\\"mmr\\\", \\\"max\\\"),\\n        num_matches=pd.NamedAgg(\\\"matches\\\", \\\"sum\\\"),\\n    )\\n    .reset_index()\\n)\\n\\nmerged_df = pd.merge(\\n    seasons_df,\\n    seasons_top500only_df,\\n    how=\\\"inner\\\",\\n    on=\\\"season\\\",\\n    suffixes=(\\\"\\\", \\\"_top500\\\"),\\n)\\nmerged_df = pd.merge(merged_df, estimate_summary, how=\\\"inner\\\", on=\\\"season\\\").drop(\\n    columns=[\\\"min_mmr_top500\\\", \\\"max_mmr_top500\\\"]\\n)\\n\\nmerged_df[\\\"Masters\\\"] = merged_df[\\\"season\\\"].apply(lambda x: x.split(\\\"_\\\")[0])\\n\\nmerged_df\";\n",
       "                var nbb_cells = Jupyter.notebook.get_cells();\n",
       "                for (var i = 0; i < nbb_cells.length; ++i) {\n",
       "                    if (nbb_cells[i].input_prompt_number == nbb_cell_id) {\n",
       "                        if (nbb_cells[i].get_text() == nbb_unformatted_code) {\n",
       "                             nbb_cells[i].set_text(nbb_formatted_code);\n",
       "                        }\n",
       "                        break;\n",
       "                    }\n",
       "                }\n",
       "            }, 500);\n",
       "            "
      ],
      "text/plain": [
       "<IPython.core.display.Javascript object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Same summary as before, but only considering the top 500 players of each season\n",
    "seasons_top500only_df = (\n",
    "    df[pd.to_numeric(df[\"rank\"]) <= 500]\n",
    "    .groupby([\"season\"])\n",
    "    .agg(\n",
    "        min_mmr=pd.NamedAgg(\"mmr\", \"min\"),\n",
    "        max_mmr=pd.NamedAgg(\"mmr\", \"max\"),\n",
    "        num_matches=pd.NamedAgg(\"matches\", \"sum\"),\n",
    "    )\n",
    "    .reset_index()\n",
    ")\n",
    "\n",
    "# Join the all-player and top-500 summaries; overlapping top-500 columns get the _top500 suffix\n",
    "merged_df = pd.merge(\n",
    "    seasons_df,\n",
    "    seasons_top500only_df,\n",
    "    how=\"inner\",\n",
    "    on=\"season\",\n",
    "    suffixes=(\"\", \"_top500\"),\n",
    ")\n",
    "# Add the per-season estimates and drop the top-500 MMR columns, which the plots do not use\n",
    "merged_df = pd.merge(merged_df, estimate_summary, how=\"inner\", on=\"season\").drop(\n",
    "    columns=[\"min_mmr_top500\", \"max_mmr_top500\"]\n",
    ")\n",
    "\n",
    "# Derive the Masters label from the season name (the text before the first underscore)\n",
    "merged_df[\"Masters\"] = merged_df[\"season\"].apply(lambda x: x.split(\"_\")[0])\n",
    "\n",
    "merged_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
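   ],
   "source": [
    "# Matches played by all listed players vs. estimated player count: regression line plus scatter colored by Masters\n",
    "sns.regplot(\n",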
    "    data=merged_df, x=\"num_matches\", y=\"high_estimate\", scatter=False, color=\".25\",\n",
    ")\n",
    "sns.scatterplot(data=merged_df, x=\"num_matches\", y=\"high_estimate\", hue=\"Masters\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
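   ],
   "source": [
    "# Matches played by the top 500 players vs. estimated player count: regression line plus scatter colored by Masters\n",
    "sns.regplot(\n",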
    "    data=merged_df,\n",
    "    x=\"num_matches_top500\",\n",
    "    y=\"high_estimate\",\n",
    "    scatter=False,\n",
    "    color=\".25\",\n",
    ")\n",
    "sns.scatterplot(\n",
    "    data=merged_df, x=\"num_matches_top500\", y=\"high_estimate\", hue=\"Masters\"\n",
    ")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The estimated number of Pro Ranked players shows a clear, roughly linear correlation with the number of matches played, both by the top 2860 players and by the top 500 players. This suggests that the more players there are, the more matches someone has to play to reach the higher ranks."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}