{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example notebook for the %%stata cell magic by the IPyStata package. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Author:**   Ties de Kok <tdekok@uw.edu>  \n",
    "**Homepage:**    https://github.com/TiesdeKok/ipystata  \n",
    "**PyPi:** https://pypi.python.org/pypi/ipystata  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Note: this example notebook uses the `Stata Batch Mode` method."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "See Github for an example notebook using the Windows-only `Stata automation` method."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import ipystata"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configure ipystata"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/javascript": [
       "require(['notebook/js/codecell'], function(codecell) {\n",
       "\t\t\t  codecell.CodeCell.options_default.highlight_modes['magic_stata'] = {'reg':[/^%%stata/]} ;\n",
       "\t\t\t  Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n",
       "\t\t\t      Jupyter.notebook.get_cells().map(function(cell){\n",
       "\t\t\t          if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n",
       "\t\t\t  });\n",
       "\t\t\t});"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from ipystata.config import config_stata\n",
    "config_stata('/home/user/stata15/stata-se')\n",
    "#config_stata(\"D:\\Software\\stata15\\StataSE-64.exe\", force_batch=True) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note:** for this change to take effect you need to `Kernel` --> `Restart` the notebook."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Check whether IPyStata is working"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Hello, I am printed by Stata.\n"
     ]
    }
   ],
   "source": [
    "%%stata\n",
    "\n",
    "display \"Hello, I am printed by Stata.\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Some examples based on the Stata 13 manual"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load the dataset \"auto.dta\" in Stata return it back to Python as a Pandas dataframe"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The code cell below runs the Stata command **`sysuse auto.dta`** to load the dataset and returns it back to Python via the **`-o car_df`** argument."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "(1978 Automobile Data)\n"
     ]
    }
   ],
   "source": [
    "%%stata -o car_df\n",
    "sysuse auto.dta"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**`car_df`** is a regular Pandas dataframe on which Python / Pandas actions can be performed. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>make</th>\n",
       "      <th>price</th>\n",
       "      <th>mpg</th>\n",
       "      <th>rep78</th>\n",
       "      <th>headroom</th>\n",
       "      <th>trunk</th>\n",
       "      <th>weight</th>\n",
       "      <th>length</th>\n",
       "      <th>turn</th>\n",
       "      <th>displacement</th>\n",
       "      <th>gear_ratio</th>\n",
       "      <th>foreign</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AMC Concord</td>\n",
       "      <td>4099</td>\n",
       "      <td>22</td>\n",
       "      <td>3.0</td>\n",
       "      <td>2.5</td>\n",
       "      <td>11</td>\n",
       "      <td>2930</td>\n",
       "      <td>186</td>\n",
       "      <td>40</td>\n",
       "      <td>121</td>\n",
       "      <td>3.58</td>\n",
       "      <td>Domestic</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>AMC Pacer</td>\n",
       "      <td>4749</td>\n",
       "      <td>17</td>\n",
       "      <td>3.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>11</td>\n",
       "      <td>3350</td>\n",
       "      <td>173</td>\n",
       "      <td>40</td>\n",
       "      <td>258</td>\n",
       "      <td>2.53</td>\n",
       "      <td>Domestic</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>AMC Spirit</td>\n",
       "      <td>3799</td>\n",
       "      <td>22</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3.0</td>\n",
       "      <td>12</td>\n",
       "      <td>2640</td>\n",
       "      <td>168</td>\n",
       "      <td>35</td>\n",
       "      <td>121</td>\n",
       "      <td>3.08</td>\n",
       "      <td>Domestic</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Buick Century</td>\n",
       "      <td>4816</td>\n",
       "      <td>20</td>\n",
       "      <td>3.0</td>\n",
       "      <td>4.5</td>\n",
       "      <td>16</td>\n",
       "      <td>3250</td>\n",
       "      <td>196</td>\n",
       "      <td>40</td>\n",
       "      <td>196</td>\n",
       "      <td>2.93</td>\n",
       "      <td>Domestic</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Buick Electra</td>\n",
       "      <td>7827</td>\n",
       "      <td>15</td>\n",
       "      <td>4.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>20</td>\n",
       "      <td>4080</td>\n",
       "      <td>222</td>\n",
       "      <td>43</td>\n",
       "      <td>350</td>\n",
       "      <td>2.41</td>\n",
       "      <td>Domestic</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            make  price  mpg  rep78  headroom  trunk  weight  length  turn  \\\n",
       "0    AMC Concord   4099   22    3.0       2.5     11    2930     186    40   \n",
       "1      AMC Pacer   4749   17    3.0       3.0     11    3350     173    40   \n",
       "2     AMC Spirit   3799   22    NaN       3.0     12    2640     168    35   \n",
       "3  Buick Century   4816   20    3.0       4.5     16    3250     196    40   \n",
       "4  Buick Electra   7827   15    4.0       4.0     20    4080     222    43   \n",
       "\n",
       "   displacement  gear_ratio   foreign  \n",
       "0           121        3.58  Domestic  \n",
       "1           258        2.53  Domestic  \n",
       "2           121        3.08  Domestic  \n",
       "3           196        2.93  Domestic  \n",
       "4           350        2.41  Domestic  "
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "car_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic descriptive statistics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The argument **`-d or --data`** is used to define which dataframe should be set as dataset in Stata.  \n",
    "In the example below the Stata function **`tabulate`** is used to generate some descriptive statistics for the dataframe **`car_df`**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "           |                                        headroom\n",
      "   foreign |       1.5          2        2.5          3        3.5          4        4.5          5 |     Total\n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "  Domestic |         3         10          4          7         13         10          4          1 |        52 \n",
      "   Foreign |         1          3         10          6          2          0          0          0 |        22 \n",
      "-----------+----------------------------------------------------------------------------------------+----------\n",
      "     Total |         4         13         14         13         15         10          4          1 |        74 \n"
     ]
    }
   ],
   "source": [
    "%%stata -d car_df\n",
    "tabulate foreign headroom"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These descriptive statistics can be replicated in Pandas using the **`crosstab`** fuction, see the code below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th>headroom</th>\n",
       "      <th>1.5</th>\n",
       "      <th>2.0</th>\n",
       "      <th>2.5</th>\n",
       "      <th>3.0</th>\n",
       "      <th>3.5</th>\n",
       "      <th>4.0</th>\n",
       "      <th>4.5</th>\n",
       "      <th>5.0</th>\n",
       "      <th>All</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>foreign</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Domestic</th>\n",
       "      <td>3</td>\n",
       "      <td>10</td>\n",
       "      <td>4</td>\n",
       "      <td>7</td>\n",
       "      <td>13</td>\n",
       "      <td>10</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Foreign</th>\n",
       "      <td>1</td>\n",
       "      <td>3</td>\n",
       "      <td>10</td>\n",
       "      <td>6</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>22</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>All</th>\n",
       "      <td>4</td>\n",
       "      <td>13</td>\n",
       "      <td>14</td>\n",
       "      <td>13</td>\n",
       "      <td>15</td>\n",
       "      <td>10</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>74</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "headroom  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  All\n",
       "foreign                                              \n",
       "Domestic    3   10    4    7   13   10    4    1   52\n",
       "Foreign     1    3   10    6    2    0    0    0   22\n",
       "All         4   13   14   13   15   10    4    1   74"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.crosstab(car_df['foreign'], car_df['headroom'], margins=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Stata graphs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note:** due to a limitation of Stata it currently returns the graph as a PDF.  \n",
    "This is a temporary workaround that I hope to find a more suitable fix for in the future."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "(highschool and beyond (200 cases))\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "\n",
       "        <iframe\n",
       "            width=\"700\"\n",
       "            height=\"400\"\n",
       "            src=\"./temp_graph.pdf\"\n",
       "            frameborder=\"0\"\n",
       "            allowfullscreen\n",
       "        ></iframe>\n",
       "        "
      ],
      "text/plain": [
       "<IPython.lib.display.IFrame at 0x1dab56635c8>"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%stata -gr\n",
    "use https://stats.idre.ucla.edu/stat/data/hsb2.dta, clear\n",
    "graph twoway scatter read math"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Use Python lists as Stata macros"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In many situations it is convenient to define values or variable names in a Python list or equivalently in a Stata macro.  \n",
    "The **`-i or --input`** argument makes a Python list available for use in Stata as a local macro.  \n",
    "For example, **`-i main_var`** converts the Python list **`['mpg', 'rep78']`** into the following Stata macro: **``main_var'`**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "main_var = ['mpg', 'rep78']\n",
    "control_var = ['gear_ratio', 'trunk', 'weight', 'displacement']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "mpg rep78\n",
      "gear_ratio trunk weight displacement\n",
      "\n",
      "Linear regression                               Number of obs     =         69\n",
      "                                                F(6, 62)          =       8.60\n",
      "                                                Prob > F          =     0.0000\n",
      "                                                R-squared         =     0.4124\n",
      "                                                Root MSE          =     2338.1\n",
      "\n",
      "------------------------------------------------------------------------------\n",
      "             |               Robust\n",
      "       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
      "-------------+----------------------------------------------------------------\n",
      "         mpg |  -76.95578   84.95038    -0.91   0.369    -246.7692     92.8576\n",
      "       rep78 |   899.0818   299.7541     3.00   0.004      299.882    1498.282\n",
      "  gear_ratio |   1479.744   917.5363     1.61   0.112    -354.3846    3313.873\n",
      "       trunk |  -110.3163   80.16622    -1.38   0.174    -270.5663    49.93365\n",
      "      weight |   1.139509   1.187361     0.96   0.341    -1.233991     3.51301\n",
      "displacement |   17.82274   8.523647     2.09   0.041     .7842094    34.86126\n",
      "       _cons |  -5163.323   4965.389    -1.04   0.302    -15088.99    4762.348\n",
      "------------------------------------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "%%stata -d car_df -i main_var -i control_var\n",
    "\n",
    "display \"`main_var'\"\n",
    "display \"`control_var'\"\n",
    "\n",
    "regress price `main_var' `control_var', vce(robust)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Modify dataset in Stata and return it to Python"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is possible create new variables or modify the existing dataset in Stata and have it returned as a Pandas dataframe.  \n",
    "In the example below the output **`-o car_df`** will overwrite the data **`-d car_df`**, effectively modifying the dataframe in place.  \n",
    "Note, the argument **`-np or --noprint`** can be used to supress any output below the code cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%stata -d car_df -o car_df -np\n",
    "\n",
    "generate weight_squared = weight^2\n",
    "generate log_weight = log(weight)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>make</th>\n",
       "      <th>price</th>\n",
       "      <th>mpg</th>\n",
       "      <th>rep78</th>\n",
       "      <th>headroom</th>\n",
       "      <th>trunk</th>\n",
       "      <th>weight</th>\n",
       "      <th>length</th>\n",
       "      <th>turn</th>\n",
       "      <th>displacement</th>\n",
       "      <th>gear_ratio</th>\n",
       "      <th>foreign</th>\n",
       "      <th>weight_squared</th>\n",
       "      <th>log_weight</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>AMC Concord</td>\n",
       "      <td>4099</td>\n",
       "      <td>22</td>\n",
       "      <td>3.0</td>\n",
       "      <td>2.5</td>\n",
       "      <td>11</td>\n",
       "      <td>2930</td>\n",
       "      <td>186</td>\n",
       "      <td>40</td>\n",
       "      <td>121</td>\n",
       "      <td>3.58</td>\n",
       "      <td>Domestic</td>\n",
       "      <td>8584900.0</td>\n",
       "      <td>7.982758</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>AMC Pacer</td>\n",
       "      <td>4749</td>\n",
       "      <td>17</td>\n",
       "      <td>3.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>11</td>\n",
       "      <td>3350</td>\n",
       "      <td>173</td>\n",
       "      <td>40</td>\n",
       "      <td>258</td>\n",
       "      <td>2.53</td>\n",
       "      <td>Domestic</td>\n",
       "      <td>11222500.0</td>\n",
       "      <td>8.116715</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>AMC Spirit</td>\n",
       "      <td>3799</td>\n",
       "      <td>22</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3.0</td>\n",
       "      <td>12</td>\n",
       "      <td>2640</td>\n",
       "      <td>168</td>\n",
       "      <td>35</td>\n",
       "      <td>121</td>\n",
       "      <td>3.08</td>\n",
       "      <td>Domestic</td>\n",
       "      <td>6969600.0</td>\n",
       "      <td>7.878534</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          make  price  mpg  rep78  headroom  trunk  weight  length  turn  \\\n",
       "0  AMC Concord   4099   22    3.0       2.5     11    2930     186    40   \n",
       "1    AMC Pacer   4749   17    3.0       3.0     11    3350     173    40   \n",
       "2   AMC Spirit   3799   22    NaN       3.0     12    2640     168    35   \n",
       "\n",
       "   displacement  gear_ratio   foreign  weight_squared  log_weight  \n",
       "0           121        3.58  Domestic       8584900.0    7.982758  \n",
       "1           258        2.53  Domestic      11222500.0    8.116715  \n",
       "2           121        3.08  Domestic       6969600.0    7.878534  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "car_df.head(3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set a custom working directory for this Stata code cell"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using a directory defined in a variable (this is useful if you need it for many cells)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "directory = '~/sandbox'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%stata -cwd directory -np\n",
    "display \"`c(pwd)'\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### It is also possible to provide the directory as an argument"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%stata -cwd '~/sandbox' -np\n",
    "display \"`c(pwd)'\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## An example case"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create the variable **`large`** in Python and use it as the dependent variable for a binary choice estimation by Stata."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "car_df['large'] = [1 if x > 3 and y > 200 else 0 for x, y in zip(car_df['headroom'], car_df['length'])]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>headroom</th>\n",
       "      <th>length</th>\n",
       "      <th>large</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2.5</td>\n",
       "      <td>186</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3.0</td>\n",
       "      <td>173</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3.0</td>\n",
       "      <td>168</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4.5</td>\n",
       "      <td>196</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4.0</td>\n",
       "      <td>222</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>4.0</td>\n",
       "      <td>218</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>3.0</td>\n",
       "      <td>170</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   headroom  length  large\n",
       "0       2.5     186      0\n",
       "1       3.0     173      0\n",
       "2       3.0     168      0\n",
       "3       4.5     196      0\n",
       "4       4.0     222      1\n",
       "5       4.0     218      1\n",
       "6       3.0     170      0"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "car_df[['headroom', 'length', 'large']].head(7)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Iteration 0:   log pseudolikelihood =  -39.60355  \n",
      "Iteration 1:   log pseudolikelihood = -19.307161  \n",
      "Iteration 2:   log pseudolikelihood = -13.526857  \n",
      "Iteration 3:   log pseudolikelihood = -10.999644  \n",
      "Iteration 4:   log pseudolikelihood = -10.726345  \n",
      "Iteration 5:   log pseudolikelihood = -10.723111  \n",
      "Iteration 6:   log pseudolikelihood = -10.723109  \n",
      "Iteration 7:   log pseudolikelihood = -10.723109  \n",
      "\n",
      "Logistic regression                             Number of obs     =         69\n",
      "                                                Wald chi2(6)      =      12.90\n",
      "                                                Prob > chi2       =     0.0446\n",
      "Log pseudolikelihood = -10.723109               Pseudo R2         =     0.7292\n",
      "\n",
      "                                  (Std. Err. adjusted for 69 clusters in make)\n",
      "------------------------------------------------------------------------------\n",
      "             |               Robust\n",
      "       large |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]\n",
      "-------------+----------------------------------------------------------------\n",
      "         mpg |  -.5846335   .2941083    -1.99   0.047    -1.161075   -.0081918\n",
      "       rep78 |  -1.298127   1.264918    -1.03   0.305    -3.777322    1.181067\n",
      "  gear_ratio |  -1.331913   3.389448    -0.39   0.694    -7.975109    5.311283\n",
      "       trunk |   1.210178   .4830082     2.51   0.012     .2634991    2.156856\n",
      "      weight |  -.0007284   .0022358    -0.33   0.745    -.0051105    .0036536\n",
      "displacement |    .001631   .0160425     0.10   0.919    -.0298119    .0330738\n",
      "       _cons |  -.2977676    16.7841    -0.02   0.986    -33.19401    32.59847\n",
      "------------------------------------------------------------------------------\n",
      "Note: 8 failures and 0 successes completely determined.\n"
     ]
    }
   ],
   "source": [
    "%%stata -d car_df -i main_var -i control_var\n",
    "\n",
    "logit large `main_var' `control_var', vce(cluster make)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}