{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How Many Transit Riders in California have Autos?\n",
    "\n",
    "Questioner: Gillian Gillett  \n",
    "March 23, 2022"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup Environment\n",
    "\n",
    "! Warning: will install libraries into current environment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "DEBUG:root:test\n"
     ]
    }
   ],
   "source": [
    "import logging\n",
    "\n",
    "logging.basicConfig(level=logging.DEBUG)\n",
    "logging.debug(\"test\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [],
   "source": [
    "try:\n",
    "    import pandas as pd\n",
    "    import seaborn as sns\n",
    "    import matplotlib.pyplot as plt\n",
    "except:\n",
    "    logging.info('pandas seaborn not found. Will try and install into current environment')\n",
    "    ! conda install pandas seaborn \n",
    "    import pandas as pd\n",
    "    import seaborn as sns\n",
    "    import matplotlib.pyplot as plt\n",
    "\n",
    "pd.set_option(\"display.max.columns\", None)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:root:Working directory: /Users/elizabeth/Documents/urbanlabs/CA_Interoperable/working/data-analyses\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "WORKING_DIR = os.path.dirname(os.getcwd())\n",
    "\n",
    "logging.info(f\"Working directory: {WORKING_DIR}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## National Household Travel Survey\n",
    "\n",
    "Data Codebook: [https://nhts.ornl.gov/tables09/CodebookBrowser.aspx](https://nhts.ornl.gov/tables09/CodebookBrowser.aspx)\n",
    "\n",
    "Relevant variables:\n",
    "\n",
    "- HOUSEID\tHousehold Identifier  \n",
    "- HBHUR\tUrban / Rural indicator - Block group  \n",
    "- HHFAMINC\tHousehold income  \n",
    "- HHSIZE\tCount of household members  \n",
    "- HHSTATE\tHousehold state\n",
    "- HHVEHCNT\tCount of Household vehicles\n",
    "- WRKCOUNT\tNumber of workers in household\n",
    "- WTHHFIN\tFinal HH weight\n",
    "\n",
    "- CAR\tFrequency of Personal Vehicle Use for Travel\n",
    "- BUS\tFrequency of Bus Use for Travel\n",
    "- PARA\tFrequency of Paratransit Use for Travel\n",
    "- TAXI\tFrequency of Taxi Service or Rideshare Use for Travel\n",
    "- [WALK](https://nhts.ornl.gov/tables09/CodebookPage.aspx?id=1365) Frequency of Walk Use for Travel\n",
    "- TRAIN\tFrequency of Train Use for Travel\n",
    "\n",
    "- PLACE\tTravel is a Financial Burden\n",
    "- PTRANS\tPublic Transportation to Reduce Financial Burden of Travel\n",
    "- WALK2SAVE\tWalk to Reduce Financial Burden of Travel"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Import Data\n",
    "\n",
    "Assumes you have downloaded and unzipped NHTS data and weights into `csv` and `ReplicatesCSV` folders respectfully."
   ]
  },
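  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch for checking that layout before loading, assuming the folder structure described above. The helper name `missing_nhts_inputs` is hypothetical and not part of the original analysis; `hhpub.csv` is the household file read in the cells that follow.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "\n",
    "def missing_nhts_inputs(data_dir):\n",
    "    # Hypothetical helper: list the expected NHTS paths absent under data_dir.\n",
    "    expected = [\n",
    "        os.path.join(data_dir, 'csv', 'hhpub.csv'),  # household records\n",
    "        os.path.join(data_dir, 'ReplicatesCSV'),     # replicate weights folder\n",
    "    ]\n",
    "    return [p for p in expected if not os.path.exists(p)]\n",
    "\n",
    "\n",
    "# Demonstration on an empty temporary directory: both paths are reported.\n",
    "with tempfile.TemporaryDirectory() as tmp:\n",
    "    missing = missing_nhts_inputs(tmp)\n",
    "```\n",
    "\n",
    "Calling `missing_nhts_inputs(NHTS_DATA_DIR)` once that variable is defined would return an empty list when both inputs are in place."
   ]
  },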
  {
   "cell_type": "code",
   "execution_count": 150,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:root:NHTS Data directory: /Users/elizabeth/Documents/urbanlabs/CA_Interoperable/working/NHTS\n"
     ]
    }
   ],
   "source": [
    "NHTS_DATA_DIR = os.path.join(os.path.dirname(WORKING_DIR), 'NHTS')\n",
    "logging.info(f\"NHTS Data directory: {NHTS_DATA_DIR}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>HOUSEID</th>\n",
       "      <th>HHSIZE</th>\n",
       "      <th>HHVEHCNT</th>\n",
       "      <th>HHFAMINC</th>\n",
       "      <th>BUS</th>\n",
       "      <th>TRAIN</th>\n",
       "      <th>PLACE</th>\n",
       "      <th>HHSTATE</th>\n",
       "      <th>WTHHFIN</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>30000041</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>11</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>CA</td>\n",
       "      <td>788.614240</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>30000085</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>9</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>CA</td>\n",
       "      <td>190.669041</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>30000094</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>163.382292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>30000155</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>-7</td>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>120.772451</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>30000227</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>6</td>\n",
       "      <td>-9</td>\n",
       "      <td>-9</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>62.015790</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     HOUSEID  HHSIZE  HHVEHCNT  HHFAMINC  BUS  TRAIN  PLACE HHSTATE  \\\n",
       "6   30000041       2         2        11    4      4      5      CA   \n",
       "9   30000085       1         2         9    5      5      3      CA   \n",
       "11  30000094       1         1         4    5      5      2      CA   \n",
       "19  30000155       1         2        -7    5      4      2      CA   \n",
       "23  30000227       2         2         6   -9     -9      2      CA   \n",
       "\n",
       "       WTHHFIN  \n",
       "6   788.614240  \n",
       "9   190.669041  \n",
       "11  163.382292  \n",
       "19  120.772451  \n",
       "23   62.015790  "
      ]
     },
     "execution_count": 81,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cols_to_keep = [\n",
    "    'HOUSEID',\n",
    "    'HHSIZE',\n",
    "    'HHVEHCNT',\n",
    "    'HHFAMINC',\n",
    "    'BUS',\n",
    "    'TRAIN',\n",
    "    'WTHHFIN',\n",
    "    'PLACE',\n",
    "    'HHSTATE',\n",
    "]\n",
    "\n",
    "hh_all_df = pd.read_csv(\n",
    "    os.path.join(NHTS_DATA_DIR, 'csv','hhpub.csv'),\n",
    "    usecols=cols_to_keep,\n",
    ")\n",
    "hh_all_df = hh_all_df[hh_all_df['HHSTATE'] == 'CA']\n",
    "\n",
    "hh_all_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Analyze NHTS Data\n",
    "\n",
    "#### Recode Variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {},
   "outputs": [],
   "source": [
    "def is_transit_user(x):\n",
    "    _FREQ_TRANSIT = [1,2,3]\n",
    "    _INFREQ_TRANSIT = [4,5]\n",
    "\n",
    "    # UNKOWN\n",
    "    if int(x['BUS']) < 0 or int(x['TRAIN']) < 0:\n",
    "        return \"Unknown\"\n",
    "    # NO\n",
    "    if int(x['BUS']) == 5 and int(x['TRAIN']) == 5:\n",
    "        return \"No Transit Use\"\n",
    "    # YES\n",
    "    if int(x['BUS']) in _FREQ_TRANSIT or int(x['TRAIN']) in _FREQ_TRANSIT:\n",
    "        return \"Frequent Transit\"\n",
    "    if int(x['BUS']) in _FREQ_TRANSIT+_INFREQ_TRANSIT or int(x['TRAIN']) in _FREQ_TRANSIT+_INFREQ_TRANSIT:\n",
    "        return \"Infrequent Transit\"\n",
    "    else:\n",
    "        logging.debug(f\"Unable to process row for is_transit_user:\\n {x}\")\n",
    "        raise Exception(f'Unable to determine if transit user for row: {x}')\n",
    "\n",
    "def has_hh_veh(x):\n",
    "    # UNKOWN\n",
    "    if int(x['HHVEHCNT']) < 0:\n",
    "        return \"Unknown\"\n",
    "    # NO\n",
    "    if int(x['HHVEHCNT']) == 0:\n",
    "        return \"No Vehicle\"\n",
    "    # YES\n",
    "    if int(x['HHVEHCNT']) > 0:\n",
    "        return \"Has vehicle\"\n",
    "    else:\n",
    "        logging.debug(f\"Unable to process row for has_hh_veh:\\n {x}\")\n",
    "        raise Exception(f'Unable to determine if household has vehicles for row: {x}')\n",
    "\n",
    "def travel_burden_hh(x):\n",
    "    BURDEN = [1,2]\n",
    "    NOT_BURDEN = [3,4,5]\n",
    "    # UNKOWN\n",
    "    if int(x['PLACE']) < 0:\n",
    "        return \"Unknown\"\n",
    "    # NO\n",
    "    if int(x['PLACE']) in NOT_BURDEN:\n",
    "        return \"Not Burdened\"\n",
    "    # YES\n",
    "    if int(x['PLACE']) in BURDEN:\n",
    "        return \"Burdened\"\n",
    "    else:\n",
    "        logging.debug(f\"Unable to process row for travel_burden_hh:\\n {x}\")\n",
    "        raise Exception(f'Unable to determine if household has financial burden to travel for row: {x}')\n",
    "        \n",
    "def filter_recs(x):\n",
    "    FILTER_COLS = ['transit_hh','vehicle_hh','burden_hh']\n",
    "    FILTER_VALUE = \"Unknown\"\n",
    "    if FILTER_VALUE in [x[c] for c in FILTER_COLS]:\n",
    "        return -1\n",
    "    return 1"
   ]
  },
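  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The row-wise recode of `BUS` and `TRAIN` above could also be sketched in vectorized form with `numpy.select`. This is an illustrative alternative, not the method used in this notebook: the function name `classify_transit` and the toy data are assumptions, and unlike `is_transit_user` it labels any unexpected non-negative code as infrequent instead of raising an exception.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "\n",
    "def classify_transit(df):\n",
    "    # Hypothetical vectorized recode; conditions are checked in order.\n",
    "    bus, train = df['BUS'], df['TRAIN']\n",
    "    frequent = [1, 2, 3]\n",
    "    conditions = [\n",
    "        (bus < 0) | (train < 0),                      # missing or refused\n",
    "        (bus == 5) & (train == 5),                    # neither mode used\n",
    "        bus.isin(frequent) | train.isin(frequent),    # frequent on either mode\n",
    "    ]\n",
    "    labels = ['Unknown', 'No Transit Use', 'Frequent Transit']\n",
    "    # Anything left over falls through to the default label.\n",
    "    values = np.select(conditions, labels, default='Infrequent Transit')\n",
    "    return pd.Series(values, index=df.index)\n",
    "\n",
    "\n",
    "# Toy rows (made up, not NHTS records) covering each category.\n",
    "toy = pd.DataFrame({'BUS': [4, 5, 5, -9, 2], 'TRAIN': [4, 5, 4, -9, 5]})\n",
    "result = classify_transit(toy).tolist()\n",
    "```\n",
    "\n",
    "On a table of this size the row-wise `apply` is fast enough; the vectorized form mainly helps if the same recode is run on the full national file."
   ]
  },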
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "DEBUG:root:Excluded records:\n",
      "    transit_hh   vehicle_hh     burden_hh\n",
      "23     Unknown  Has vehicle      Burdened\n",
      "106    Unknown  Has vehicle       Unknown\n",
      "141    Unknown  Has vehicle  Not Burdened\n",
      "211    Unknown  Has vehicle       Unknown\n",
      "221    Unknown  Has vehicle       Unknown\n",
      "/var/folders/60/xd2kny110pxfz3ln611jq7hm0000gn/T/ipykernel_8795/143102953.py:12: SettingWithCopyWarning: \n",
      "A value is trying to be set on a copy of a slice from a DataFrame.\n",
      "Try using .loc[row_indexer,col_indexer] = value instead\n",
      "\n",
      "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
      "  hh_df[c]= pd.Categorical(\n",
      "DEBUG:root:Cleaned records:\n",
      "            transit_hh   vehicle_hh     burden_hh\n",
      "6   Infrequent Transit  Has vehicle  Not Burdened\n",
      "9       No Transit Use  Has vehicle  Not Burdened\n",
      "11      No Transit Use  Has vehicle      Burdened\n",
      "19  Infrequent Transit  Has vehicle      Burdened\n",
      "37      No Transit Use  Has vehicle      Burdened\n",
      "INFO:root:Cleaned records filtered to exclude 2927(11.2%) of 26099 records\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>HOUSEID</th>\n",
       "      <th>HHSIZE</th>\n",
       "      <th>HHVEHCNT</th>\n",
       "      <th>HHFAMINC</th>\n",
       "      <th>BUS</th>\n",
       "      <th>TRAIN</th>\n",
       "      <th>PLACE</th>\n",
       "      <th>HHSTATE</th>\n",
       "      <th>WTHHFIN</th>\n",
       "      <th>transit_hh</th>\n",
       "      <th>vehicle_hh</th>\n",
       "      <th>burden_hh</th>\n",
       "      <th>keep</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>30000041</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>11</td>\n",
       "      <td>4</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>CA</td>\n",
       "      <td>788.614240</td>\n",
       "      <td>Infrequent Transit</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Not Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>30000085</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>9</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>CA</td>\n",
       "      <td>190.669041</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Not Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>30000094</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>163.382292</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>19</th>\n",
       "      <td>30000155</td>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>-7</td>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>120.772451</td>\n",
       "      <td>Infrequent Transit</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>23</th>\n",
       "      <td>30000227</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>6</td>\n",
       "      <td>-9</td>\n",
       "      <td>-9</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>62.015790</td>\n",
       "      <td>Unknown</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Burdened</td>\n",
       "      <td>-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129679</th>\n",
       "      <td>40794135</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>CA</td>\n",
       "      <td>63.217848</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Not Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129682</th>\n",
       "      <td>40794179</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>6</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "      <td>CA</td>\n",
       "      <td>377.126813</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129685</th>\n",
       "      <td>40794233</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>8</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>3</td>\n",
       "      <td>CA</td>\n",
       "      <td>33.421852</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Not Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129691</th>\n",
       "      <td>40794291</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>9</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>CA</td>\n",
       "      <td>41.869638</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Not Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129693</th>\n",
       "      <td>40794294</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>10</td>\n",
       "      <td>5</td>\n",
       "      <td>5</td>\n",
       "      <td>4</td>\n",
       "      <td>CA</td>\n",
       "      <td>207.672765</td>\n",
       "      <td>No Transit Use</td>\n",
       "      <td>Has vehicle</td>\n",
       "      <td>Not Burdened</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>26099 rows × 13 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "         HOUSEID  HHSIZE  HHVEHCNT  HHFAMINC  BUS  TRAIN  PLACE HHSTATE  \\\n",
       "6       30000041       2         2        11    4      4      5      CA   \n",
       "9       30000085       1         2         9    5      5      3      CA   \n",
       "11      30000094       1         1         4    5      5      2      CA   \n",
       "19      30000155       1         2        -7    5      4      2      CA   \n",
       "23      30000227       2         2         6   -9     -9      2      CA   \n",
       "...          ...     ...       ...       ...  ...    ...    ...     ...   \n",
       "129679  40794135       2         3         5    5      5      3      CA   \n",
       "129682  40794179       1         1         6    5      5      2      CA   \n",
       "129685  40794233       2         3         8    5      5      3      CA   \n",
       "129691  40794291       1         1         9    5      5      5      CA   \n",
       "129693  40794294       2         2        10    5      5      4      CA   \n",
       "\n",
       "           WTHHFIN          transit_hh   vehicle_hh     burden_hh  keep  \n",
       "6       788.614240  Infrequent Transit  Has vehicle  Not Burdened     1  \n",
       "9       190.669041      No Transit Use  Has vehicle  Not Burdened     1  \n",
       "11      163.382292      No Transit Use  Has vehicle      Burdened     1  \n",
       "19      120.772451  Infrequent Transit  Has vehicle      Burdened     1  \n",
       "23       62.015790             Unknown  Has vehicle      Burdened    -1  \n",
       "...            ...                 ...          ...           ...   ...  \n",
       "129679   63.217848      No Transit Use  Has vehicle  Not Burdened     1  \n",
       "129682  377.126813      No Transit Use  Has vehicle      Burdened     1  \n",
       "129685   33.421852      No Transit Use  Has vehicle  Not Burdened     1  \n",
       "129691   41.869638      No Transit Use  Has vehicle  Not Burdened     1  \n",
       "129693  207.672765      No Transit Use  Has vehicle  Not Burdened     1  \n",
       "\n",
       "[26099 rows x 13 columns]"
      ]
     },
     "execution_count": 131,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "analysis_cols = ['transit_hh','vehicle_hh','burden_hh']\n",
    "hh_all_df['transit_hh'] = hh_all_df.apply(lambda x: is_transit_user(x), axis=1)\n",
    "hh_all_df['vehicle_hh'] = hh_all_df.apply(lambda x: has_hh_veh(x), axis=1)\n",
    "hh_all_df['burden_hh'] = hh_all_df.apply(lambda x: travel_burden_hh(x), axis=1)\n",
    "hh_all_df['keep'] = hh_all_df.apply(lambda x: filter_recs(x) ,axis=1)\n",
    "\n",
    "logging.debug(f\"Excluded records:\\n{hh_all_df[hh_all_df['keep']<0][analysis_cols].head()}\")\n",
    "\n",
    "hh_df = hh_all_df[hh_all_df['keep']>0]\n",
    "\n",
    "for c in analysis_cols:\n",
    "    hh_df[c]= pd.Categorical(\n",
    "        hh_df[c],\n",
    "        ordered = True,\n",
    "    )\n",
    "    \n",
    "logging.debug(f\"Cleaned records:\\n{hh_df[analysis_cols].head()}\")\n",
    "\n",
    "recs_exc = len(hh_all_df)-len(hh_df)\n",
    "logging.info(f\"Cleaned records filtered to exclude {recs_exc}({round(100*recs_exc/len(hh_all_df),1)}%) of {len(hh_all_df)} records\")\n",
    "hh_all_df\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 194,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_e2db7_row0_col0 {\n",
       "  background-color: #faf2f8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_e2db7_row0_col1 {\n",
       "  background-color: #03517e;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e2db7_row0_col2, #T_e2db7_row2_col1, #T_e2db7_row2_col2, #T_e2db7_row3_col0 {\n",
       "  background-color: #023858;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e2db7_row1_col0 {\n",
       "  background-color: #023b5d;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e2db7_row1_col1 {\n",
       "  background-color: #f7f0f7;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_e2db7_row1_col2, #T_e2db7_row2_col0, #T_e2db7_row3_col1 {\n",
       "  background-color: #fff7fb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_e2db7_row3_col2 {\n",
       "  background-color: #fdf5fa;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_e2db7_row4_col0 {\n",
       "  background-color: #f3edf5;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_e2db7_row4_col1 {\n",
       "  background-color: #045483;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_e2db7_row4_col2 {\n",
       "  background-color: #034973;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_e2db7\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank\" >&nbsp;</th>\n",
       "      <th class=\"index_name level0\" >HHold Transit Use in California (Source: NHTS 2017)</th>\n",
       "      <th id=\"T_e2db7_level0_col0\" class=\"col_heading level0 col0\" >Frequent Transit</th>\n",
       "      <th id=\"T_e2db7_level0_col1\" class=\"col_heading level0 col1\" >Infrequent Transit</th>\n",
       "      <th id=\"T_e2db7_level0_col2\" class=\"col_heading level0 col2\" >No Transit Use</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th class=\"index_name level0\" >HHold Transportation Burden</th>\n",
       "      <th class=\"index_name level1\" >HHold Vehicles</th>\n",
       "      <th class=\"blank col0\" >&nbsp;</th>\n",
       "      <th class=\"blank col1\" >&nbsp;</th>\n",
       "      <th class=\"blank col2\" >&nbsp;</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_e2db7_level0_row0\" class=\"row_heading level0 row0\" rowspan=\"2\">Burdened</th>\n",
       "      <th id=\"T_e2db7_level1_row0\" class=\"row_heading level1 row0\" >Has vehicle</th>\n",
       "      <td id=\"T_e2db7_row0_col0\" class=\"data row0 col0\" >17.7%</td>\n",
       "      <td id=\"T_e2db7_row0_col1\" class=\"data row0 col1\" >31.0%</td>\n",
       "      <td id=\"T_e2db7_row0_col2\" class=\"data row0 col2\" >51.3%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e2db7_level1_row1\" class=\"row_heading level1 row1\" >No Vehicle</th>\n",
       "      <td id=\"T_e2db7_row1_col0\" class=\"data row1 col0\" >76.1%</td>\n",
       "      <td id=\"T_e2db7_row1_col1\" class=\"data row1 col1\" >9.2%</td>\n",
       "      <td id=\"T_e2db7_row1_col2\" class=\"data row1 col2\" >14.7%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e2db7_level0_row2\" class=\"row_heading level0 row2\" rowspan=\"2\">Not Burdened</th>\n",
       "      <th id=\"T_e2db7_level1_row2\" class=\"row_heading level1 row2\" >Has vehicle</th>\n",
       "      <td id=\"T_e2db7_row2_col0\" class=\"data row2 col0\" >15.3%</td>\n",
       "      <td id=\"T_e2db7_row2_col1\" class=\"data row2 col1\" >33.3%</td>\n",
       "      <td id=\"T_e2db7_row2_col2\" class=\"data row2 col2\" >51.4%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e2db7_level1_row3\" class=\"row_heading level1 row3\" >No Vehicle</th>\n",
       "      <td id=\"T_e2db7_row3_col0\" class=\"data row3 col0\" >77.0%</td>\n",
       "      <td id=\"T_e2db7_row3_col1\" class=\"data row3 col1\" >7.8%</td>\n",
       "      <td id=\"T_e2db7_row3_col2\" class=\"data row3 col2\" >15.3%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_e2db7_level0_row4\" class=\"row_heading level0 row4\" >All</th>\n",
       "      <th id=\"T_e2db7_level1_row4\" class=\"row_heading level1 row4\" ></th>\n",
       "      <td id=\"T_e2db7_row4_col0\" class=\"data row4 col0\" >20.3%</td>\n",
       "      <td id=\"T_e2db7_row4_col1\" class=\"data row4 col1\" >30.7%</td>\n",
       "      <td id=\"T_e2db7_row4_col2\" class=\"data row4 col2\" >49.0%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x169f90220>"
      ]
     },
     "execution_count": 194,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hh_df_3way_ct = pd.crosstab(\n",
    "    [hh_df['burden_hh'],hh_df['vehicle_hh']],\n",
    "    hh_df['transit_hh'], \n",
    "    values = hh_df['WTHHFIN'], \n",
    "    aggfunc = sum,\n",
    "    normalize='index',\n",
    "    margins = True,\n",
    "    rownames=['HHold Transportation Burden','HHold Vehicles'], \n",
    "    colnames=['HHold Transit Use in California (Source: NHTS 2017)'],\n",
    ")\n",
    "s_hh_df_3way_ct=hh_df_3way_ct.style\\\n",
    "    .background_gradient()\\\n",
    "    .format(\"{:.1%}\")\n",
    "    \n",
    "s_hh_df_3way_ct"
   ]
  },
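  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the arithmetic behind the weighted, row-normalized crosstab above explicit, the sketch below reproduces it by hand on toy data: sum the household weights in each cell, then divide by the row total. The weights and the column name `weight` are made up for illustration and are not NHTS values.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy households with made-up survey weights.\n",
    "toy = pd.DataFrame({\n",
    "    'vehicle_hh': ['Has vehicle', 'Has vehicle', 'No Vehicle', 'No Vehicle'],\n",
    "    'transit_hh': ['No Transit Use', 'Frequent Transit', 'Frequent Transit', 'No Transit Use'],\n",
    "    'weight': [300.0, 100.0, 60.0, 20.0],\n",
    "})\n",
    "\n",
    "# Step 1: total weight in each (vehicle, transit) cell.\n",
    "weighted = toy.pivot_table(\n",
    "    index='vehicle_hh',\n",
    "    columns='transit_hh',\n",
    "    values='weight',\n",
    "    aggfunc='sum',\n",
    "    fill_value=0.0,\n",
    ")\n",
    "\n",
    "# Step 2: divide each row by its total so every row sums to 1.\n",
    "shares = weighted.div(weighted.sum(axis=1), axis=0)\n",
    "```\n",
    "\n",
    "Here the `Has vehicle` row holds weights 100 and 300, so its frequent-transit share is 100 / 400; each row is read as a distribution of transit use within that vehicle group."
   ]
  },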
  {
   "cell_type": "code",
   "execution_count": 195,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_8780b_row0_col0, #T_8780b_row1_col1, #T_8780b_row1_col2 {\n",
       "  background-color: #fff7fb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8780b_row0_col1, #T_8780b_row0_col2, #T_8780b_row1_col0 {\n",
       "  background-color: #023858;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8780b_row2_col0 {\n",
       "  background-color: #f5eff6;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8780b_row2_col1, #T_8780b_row2_col2 {\n",
       "  background-color: #034973;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_8780b\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"index_name level0\" >HHold Transit Use in California (Source: NHTS 2017)</th>\n",
       "      <th id=\"T_8780b_level0_col0\" class=\"col_heading level0 col0\" >Frequent Transit</th>\n",
       "      <th id=\"T_8780b_level0_col1\" class=\"col_heading level0 col1\" >Infrequent Transit</th>\n",
       "      <th id=\"T_8780b_level0_col2\" class=\"col_heading level0 col2\" >No Transit Use</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th class=\"index_name level0\" >HHold Vehicles</th>\n",
       "      <th class=\"blank col0\" >&nbsp;</th>\n",
       "      <th class=\"blank col1\" >&nbsp;</th>\n",
       "      <th class=\"blank col2\" >&nbsp;</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_8780b_level0_row0\" class=\"row_heading level0 row0\" >Has vehicle</th>\n",
       "      <td id=\"T_8780b_row0_col0\" class=\"data row0 col0\" >16.3%</td>\n",
       "      <td id=\"T_8780b_row0_col1\" class=\"data row0 col1\" >32.3%</td>\n",
       "      <td id=\"T_8780b_row0_col2\" class=\"data row0 col2\" >51.4%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8780b_level0_row1\" class=\"row_heading level0 row1\" >No Vehicle</th>\n",
       "      <td id=\"T_8780b_row1_col0\" class=\"data row1 col0\" >76.5%</td>\n",
       "      <td id=\"T_8780b_row1_col1\" class=\"data row1 col1\" >8.5%</td>\n",
       "      <td id=\"T_8780b_row1_col2\" class=\"data row1 col2\" >15.0%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8780b_level0_row2\" class=\"row_heading level0 row2\" >All</th>\n",
       "      <td id=\"T_8780b_row2_col0\" class=\"data row2 col0\" >20.3%</td>\n",
       "      <td id=\"T_8780b_row2_col1\" class=\"data row2 col1\" >30.7%</td>\n",
       "      <td id=\"T_8780b_row2_col2\" class=\"data row2 col2\" >49.0%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x16a858460>"
      ]
     },
     "execution_count": 195,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hh_df_ct = pd.crosstab(\n",
    "    hh_df['vehicle_hh'],\n",
    "    hh_df['transit_hh'], \n",
    "    values = hh_df['WTHHFIN'], \n",
    "    aggfunc = sum,\n",
    "    normalize='index',\n",
    "    margins = True,\n",
    "    rownames=['HHold Vehicles'], \n",
    "    colnames=['HHold Transit Use in California (Source: NHTS 2017)'],\n",
    ")\n",
    "s_hh_df_ct=hh_df_ct.style\\\n",
    "    .background_gradient()\\\n",
    "    .format(\"{:.1%}\")\n",
    "\n",
    "s_hh_df_ct"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 196,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_9b521_row0_col0, #T_9b521_row1_col1 {\n",
       "  background-color: #fff7fb;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_9b521_row0_col1, #T_9b521_row1_col0 {\n",
       "  background-color: #023858;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_9b521_row2_col0 {\n",
       "  background-color: #023a5b;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_9b521_row2_col1 {\n",
       "  background-color: #fef6fa;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_9b521_row3_col0 {\n",
       "  background-color: #0568a3;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_9b521_row3_col1 {\n",
       "  background-color: #dad9ea;\n",
       "  color: #000000;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_9b521\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"index_name level0\" >HHold Vehicles</th>\n",
       "      <th id=\"T_9b521_level0_col0\" class=\"col_heading level0 col0\" >Has vehicle</th>\n",
       "      <th id=\"T_9b521_level0_col1\" class=\"col_heading level0 col1\" >No Vehicle</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th class=\"index_name level0\" >HHold Transit Use in California (Source: NHTS 2017)</th>\n",
       "      <th class=\"blank col0\" >&nbsp;</th>\n",
       "      <th class=\"blank col1\" >&nbsp;</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_9b521_level0_row0\" class=\"row_heading level0 row0\" >Frequent Transit</th>\n",
       "      <td id=\"T_9b521_row0_col0\" class=\"data row0 col0\" >75.2%</td>\n",
       "      <td id=\"T_9b521_row0_col1\" class=\"data row0 col1\" >24.8%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_9b521_level0_row1\" class=\"row_heading level0 row1\" >Infrequent Transit</th>\n",
       "      <td id=\"T_9b521_row1_col0\" class=\"data row1 col0\" >98.2%</td>\n",
       "      <td id=\"T_9b521_row1_col1\" class=\"data row1 col1\" >1.8%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_9b521_level0_row2\" class=\"row_heading level0 row2\" >No Transit Use</th>\n",
       "      <td id=\"T_9b521_row2_col0\" class=\"data row2 col0\" >98.0%</td>\n",
       "      <td id=\"T_9b521_row2_col1\" class=\"data row2 col1\" >2.0%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_9b521_level0_row3\" class=\"row_heading level0 row3\" >All</th>\n",
       "      <td id=\"T_9b521_row3_col0\" class=\"data row3 col0\" >93.4%</td>\n",
       "      <td id=\"T_9b521_row3_col1\" class=\"data row3 col1\" >6.6%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x16a938850>"
      ]
     },
     "execution_count": 196,
     "metadata": {},
     "output_type": "execute_result"
    }
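   ],
   "source": [
    "# Reverse view: weighted share of households with and without a vehicle,\n",
    "# by transit-use category. normalize='index' makes each row sum to 100%.\n",
    "hh_df_ct_r = pd.crosstab(\n",
    "    hh_df['transit_hh'],\n",
    "    hh_df['vehicle_hh'],\n",
    "    values=hh_df['WTHHFIN'],  # NHTS household weight\n",
    "    aggfunc='sum',\n",
    "    normalize='index',\n",
    "    margins=True,\n",
    "    colnames=['HHold Vehicles'],\n",
    "    rownames=['HHold Transit Use in California (Source: NHTS 2017)'],\n",
    ")\n",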
    "s_hh_df_ct_r=hh_df_ct_r.style\\\n",
    "    .background_gradient()\\\n",
    "    .format(\"{:.1%}\")\n",
    "\n",
    "s_hh_df_ct_r"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Export Results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 207,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Export to markdown\n",
    "\n",
    "# DataFrame.to_markdown() requires the tabulate package\n",
    "try:\n",
    "    import tabulate\n",
    "except ImportError:\n",
    "    ! pip install tabulate\n",
    "\n",
    "# hh_df_ct_r has transit use in its rows; hh_df_ct has vehicle ownership in its rows\n",
    "with open(\"results.md\", \"w\") as f:\n",
    "    f.write(\"\\n\\n**Which transit-using households have vehicles?**\\n\\n\")\n",
    "    f.write(hh_df_ct_r.to_markdown(floatfmt=\".1%\"))\n",
    "    f.write(\"\\n\\n**Which vehicle-owning households ride transit?**\\n\\n\")\n",
    "    f.write(hh_df_ct.to_markdown(floatfmt=\".1%\"))\n",
    "    f.write(\"\\n\\n**Does it vary among households who are financially burdened by transportation?**\\n\\n\")\n",
    "    f.write(hh_df_3way_ct.to_markdown(floatfmt=\".1%\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 209,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/elizabeth/opt/miniconda3/envs/calitp/lib/python3.10/site-packages/dataframe_image/_pandas_accessor.py:69: FutureWarning: this method is deprecated in favour of `Styler.to_html()`\n",
      "  html = '<div>' + obj.render() + '</div>'\n",
      "[0324/132220.507694:INFO:headless_shell.cc(659)] Written to file /var/folders/60/xd2kny110pxfz3ln611jq7hm0000gn/T/tmp2anbgsyh/temp.png.\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'iCCP' 41 295\n",
      "DEBUG:PIL.PngImagePlugin:iCCP profile name b'Skia'\n",
      "DEBUG:PIL.PngImagePlugin:Compression method 0\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 348 8192\n",
      "[0324/132222.411945:INFO:headless_shell.cc(659)] Written to file /var/folders/60/xd2kny110pxfz3ln611jq7hm0000gn/T/tmp3rt1v9vm/temp.png.\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'iCCP' 41 295\n",
      "DEBUG:PIL.PngImagePlugin:iCCP profile name b'Skia'\n",
      "DEBUG:PIL.PngImagePlugin:Compression method 0\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 348 8192\n",
      "[0324/132225.100427:INFO:headless_shell.cc(659)] Written to file /var/folders/60/xd2kny110pxfz3ln611jq7hm0000gn/T/tmppj7306m7/temp.png.\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'IHDR' 16 13\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'iCCP' 41 295\n",
      "DEBUG:PIL.PngImagePlugin:iCCP profile name b'Skia'\n",
      "DEBUG:PIL.PngImagePlugin:Compression method 0\n",
      "DEBUG:PIL.PngImagePlugin:STREAM b'IDAT' 348 8192\n"
     ]
    }
   ],
   "source": [
    "# Export the styled tables to images\n",
    "try:\n",
    "    import dataframe_image as dfi\n",
    "except ImportError:\n",
    "    ! pip install dataframe_image\n",
    "    import dataframe_image as dfi\n",
    "\n",
    "dfi.export(s_hh_df_ct, \"hh_df_ct.png\")\n",
    "dfi.export(s_hh_df_ct_r, \"hh_df_ct_r.png\")\n",
    "dfi.export(s_hh_df_3way_ct, \"hh_df_3way_ct.png\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Plot Chord Chart\n",
    "# Currently getting an error with this >:(\n",
    "\n",
    "try:\n",
    "    from chord import Chord\n",
    "except ImportError:\n",
    "    ! pip install chord\n",
    "    from chord import Chord\n",
    "\n",
    "names = list(hh_df_ct.columns)\n",
    "# Chord takes a nested-list matrix and a list of names, not a DataFrame\n",
    "matrix = hh_df_ct.values.tolist()\n",
    "ch = Chord(matrix, names)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 125,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAqsAAAKaCAYAAAAZPRD5AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAOnUlEQVR4nO3cz2vXBRzH8U23scq5tQ5RkSWKipBJSpcounSpyG4FEXXqFnQK+ivqHHTpDwiJqGvUsQg0IshCKCmNJVuusE39duoHZIZfw8/TL4/HbZ/vDq/jkzcfPtOj0WgKAACKtgw9AAAA/o1YBQAgS6wCAJAlVgEAyBKrAABkzVzpx5c/e+7T6zUE+MuDC98MPYExzU1fHHoCwA3pmd2fHL7cc5dVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUA
IEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCArJmhB/D/++DVj+899cmZxfnFuQvPH33qi6H3AACMy2V1Au0/smvlidcfOTH0DgCAayVWJ9A9D925ftPS/IWhdwAAXCuxCgBAllgFACBLrAIAkCVWAQDI8umqCfTeKx/uPP35Twu/nduYeeuxdw4cenH/9wef27cy9C4AgKslVifQk288enLoDQAA/wevAQAAkCVWAQDI8hoABL3/04H//J/Hbzt+HZZwtTZGW4eeADBRXFYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgS
qwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMgSqwAAZIlVAACyxCoAAFliFQCALLEKAECWWAUAIEusAgCQNXOlHz96+/D12gH8zf5nvxx6AgAkuKwCAJB1xcsqN6bVE8e2//DxuzumRqOppX2HVu546InTQ28CABiHy+qEGV26OPXDR0d37Hz6pa/2vvDaF2tfH1/+9cfv5ofeBQAwDrE6YdZPfXPL7MLyb/PLt29smZkdLe6+7+zaiWNLQ+8CABiHWJ0wm+dW52a3LW788ffstqWNzfWf54bcBAAwLrE6cUb/eDI9fZmHAAA3ALE6YWYXbt3YXF/785K6ub46N3PL9s0hNwEAjEusTphtd+36ZfPc2fnzZ8/MXbqwOb329efLi7vvXx16FwDAOHy6asJMb906dcfDR749efTNPVOjS1NLex9Yufn2u88PvQsAYBxidQIt7Tm4trTn4NrQOwAArpXXAAAAyBKrAABkTY9GvmoEAECTyyoAAFliFQCALLEKAECWWAUAIEusAgCQJVYBAMj6HVc/g0/PmESWAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 864x864 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Plot Tree Map\n",
    "# This is really ugly >:(\n",
    "try:\n",
    "    import squarify\n",
    "except ImportError:\n",
    "    ! pip install squarify\n",
    "    import squarify\n",
    "\n",
    "# Drop the 'All' margin row, which only summarizes the other rows\n",
    "hh_df_tm = hh_df_ct.drop(index='All')\n",
    "values = hh_df_tm.values.flatten().tolist()\n",
    "\n",
    "# One label per rectangle (vehicle status / transit use), in the same order as values\n",
    "names = [f'{r} / {c}' for r in hh_df_tm.index for c in hh_df_tm.columns]\n",
    "\n",
    "fig, ax = plt.subplots(1, figsize=(12, 12))\n",
    "squarify.plot(sizes=values,\n",
    "              label=names,\n",
    "              alpha=.8,\n",
    "              ax=ax)\n",
    "plt.axis('off')\n",
    "plt.show()"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.2"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}