{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Created by: [SmirkyGraphs](https://smirkygraphs.github.io/). Code: [Github](https://github.com/SmirkyGraphs/Python-Notebooks). Source: [RI Legislature Site](http://www.rilin.state.ri.us/pages/legislation.aspx).\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Held for Further Study: Where Bills Go to Die"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Bills in Rhode Island face a big hurdle: making it out of committee without being **\"Held for Further Study\"**. Once a bill is introduced and sent to committee, the first vote it receives is on whether it should be \"Held for Further Study\". A bill held for further study is postponed indefinitely, although the committee can consider it again. Despite the name, no study actually takes place, and if the bill is reintroduced, no study is presented.\n",
    "\n",
    "This notebook looks at how many bills end up \"held for further study\", and which chamber they originated in, from 2007 to 2019.\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('./data/clean/bill_actions.csv')\n",
    "\n",
    "# drop resolutions (bill ids containing 'R')\n",
    "df = df[~df['bill_id'].str.contains('R')]\n",
    "\n",
    "# drop duplicate rows (created when splitting action_type)\n",
    "df = df.drop_duplicates(subset=['action', 'lookup_id'])\n",
    "\n",
    "# number each bill's actions in order (0-based)\n",
    "df['action_num'] = df.groupby('lookup_id').cumcount()"
   ]
  },
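  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what the last two cleaning steps do, here is a small sketch on made-up rows (the ids and action strings are hypothetical, not taken from the dataset). `drop_duplicates` keeps the first row for each `(action, lookup_id)` pair, and `cumcount` then numbers each bill's remaining actions starting at 0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# hypothetical action log: bill 'a' has a duplicated 'introduced' row\n",
    "toy = pd.DataFrame({\n",
    "    'lookup_id': ['a', 'a', 'a', 'b'],\n",
    "    'action': ['introduced', 'introduced', 'committee vote', 'introduced'],\n",
    "})\n",
    "\n",
    "# keep one row per (action, lookup_id) pair\n",
    "toy = toy.drop_duplicates(subset=['action', 'lookup_id'])\n",
    "\n",
    "# 0-based position of each action within its bill\n",
    "toy['action_num'] = toy.groupby('lookup_id').cumcount()\n",
    "print(toy)"
   ]
  },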
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "total held for study: 64% (16509)\n",
      "total died held for study: 63% (10473)\n"
     ]
    }
   ],
   "source": [
    "# total bills introduced\n",
    "total_bills = df['lookup_id'].nunique()\n",
    "\n",
    "# bills that were ever \"held for further study\"\n",
    "held_for_study = df[df['type'] == 'held for further study']['lookup_id'].unique().tolist()\n",
    "\n",
    "# bills whose last recorded action is \"held for further study\"\n",
    "# (assumes each bill's rows are in chronological order)\n",
    "last_action = df.groupby('lookup_id')['type'].agg('last').reset_index()\n",
    "last_action = last_action[last_action['type'] == 'held for further study']\n",
    "died_in_study = last_action['lookup_id'].unique().tolist()\n",
    "\n",
    "# final calcs: scale to percent first, then round to the nearest integer\n",
    "total_held_for_study = round(100 * len(held_for_study) / total_bills)\n",
    "total_died_held = round(100 * len(died_in_study) / len(held_for_study))\n",
    "\n",
    "print(f'total held for study: {total_held_for_study}% ({len(held_for_study)})')\n",
    "print(f'total died held for study: {total_died_held}% ({len(died_in_study)})')"
   ]
  },
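  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between *ever held* and *died held* can be checked on a toy example. The sketch below uses made-up rows (the bill ids and the `introduced` and `passed` action types are assumptions for illustration, not values from the dataset) and assumes each bill's actions appear in chronological order. Bill `a` is held and later passes, so it counts as ever held only; bill `b` ends on a hold, so it counts as both."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# hypothetical action log: three bills, rows in chronological order per bill\n",
    "toy = pd.DataFrame({\n",
    "    'lookup_id': ['a', 'a', 'b', 'b', 'c'],\n",
    "    'type': ['held for further study', 'passed',\n",
    "             'introduced', 'held for further study', 'introduced'],\n",
    "})\n",
    "\n",
    "is_held = toy['type'] == 'held for further study'\n",
    "\n",
    "# bills with at least one 'held for further study' action\n",
    "ever_held = set(toy.loc[is_held, 'lookup_id'])\n",
    "\n",
    "# bills whose final recorded action is 'held for further study'\n",
    "last_type = toy.groupby('lookup_id')['type'].last()\n",
    "died_held = set(last_type[last_type == 'held for further study'].index)\n",
    "\n",
    "print(sorted(ever_held), sorted(died_held))"
   ]
  },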
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Bills Held & Died for Further Study by Year (Percents Compared to Total Bills)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>count_total</th>\n",
       "      <th>count_ever_held</th>\n",
       "      <th>(% total)_ever_held</th>\n",
       "      <th>count_died_held</th>\n",
       "      <th>(% total)_died_held</th>\n",
       "      <th>(% held)_died_held</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>session</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2007</th>\n",
       "      <td>2255</td>\n",
       "      <td>1275</td>\n",
       "      <td>56%</td>\n",
       "      <td>809</td>\n",
       "      <td>36%</td>\n",
       "      <td>63%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008</th>\n",
       "      <td>2159</td>\n",
       "      <td>1201</td>\n",
       "      <td>56%</td>\n",
       "      <td>779</td>\n",
       "      <td>36%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2009</th>\n",
       "      <td>1999</td>\n",
       "      <td>1124</td>\n",
       "      <td>56%</td>\n",
       "      <td>740</td>\n",
       "      <td>37%</td>\n",
       "      <td>66%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2010</th>\n",
       "      <td>1861</td>\n",
       "      <td>910</td>\n",
       "      <td>49%</td>\n",
       "      <td>618</td>\n",
       "      <td>33%</td>\n",
       "      <td>68%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2011</th>\n",
       "      <td>1926</td>\n",
       "      <td>1173</td>\n",
       "      <td>61%</td>\n",
       "      <td>767</td>\n",
       "      <td>40%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012</th>\n",
       "      <td>1908</td>\n",
       "      <td>1043</td>\n",
       "      <td>55%</td>\n",
       "      <td>651</td>\n",
       "      <td>34%</td>\n",
       "      <td>62%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2013</th>\n",
       "      <td>1907</td>\n",
       "      <td>1234</td>\n",
       "      <td>65%</td>\n",
       "      <td>708</td>\n",
       "      <td>37%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014</th>\n",
       "      <td>2036</td>\n",
       "      <td>1298</td>\n",
       "      <td>64%</td>\n",
       "      <td>772</td>\n",
       "      <td>38%</td>\n",
       "      <td>59%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2015</th>\n",
       "      <td>1896</td>\n",
       "      <td>1305</td>\n",
       "      <td>69%</td>\n",
       "      <td>835</td>\n",
       "      <td>44%</td>\n",
       "      <td>64%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016</th>\n",
       "      <td>2068</td>\n",
       "      <td>1473</td>\n",
       "      <td>71%</td>\n",
       "      <td>911</td>\n",
       "      <td>44%</td>\n",
       "      <td>62%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017</th>\n",
       "      <td>1963</td>\n",
       "      <td>1555</td>\n",
       "      <td>79%</td>\n",
       "      <td>961</td>\n",
       "      <td>49%</td>\n",
       "      <td>62%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2018</th>\n",
       "      <td>1935</td>\n",
       "      <td>1473</td>\n",
       "      <td>76%</td>\n",
       "      <td>978</td>\n",
       "      <td>51%</td>\n",
       "      <td>66%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019</th>\n",
       "      <td>1788</td>\n",
       "      <td>1445</td>\n",
       "      <td>81%</td>\n",
       "      <td>944</td>\n",
       "      <td>53%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         count_total  count_ever_held (% total)_ever_held  count_died_held  \\\n",
       "session                                                                      \n",
       "2007            2255             1275                 56%              809   \n",
       "2008            2159             1201                 56%              779   \n",
       "2009            1999             1124                 56%              740   \n",
       "2010            1861              910                 49%              618   \n",
       "2011            1926             1173                 61%              767   \n",
       "2012            1908             1043                 55%              651   \n",
       "2013            1907             1234                 65%              708   \n",
       "2014            2036             1298                 64%              772   \n",
       "2015            1896             1305                 69%              835   \n",
       "2016            2068             1473                 71%              911   \n",
       "2017            1963             1555                 79%              961   \n",
       "2018            1935             1473                 76%              978   \n",
       "2019            1788             1445                 81%              944   \n",
       "\n",
       "        (% total)_died_held (% held)_died_held  \n",
       "session                                         \n",
       "2007                    36%                63%  \n",
       "2008                    36%                65%  \n",
       "2009                    37%                66%  \n",
       "2010                    33%                68%  \n",
       "2011                    40%                65%  \n",
       "2012                    34%                62%  \n",
       "2013                    37%                56%  \n",
       "2014                    38%                59%  \n",
       "2015                    44%                64%  \n",
       "2016                    44%                62%  \n",
       "2017                    49%                62%  \n",
       "2018                    51%                66%  \n",
       "2019                    53%                65%  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# flag bills by outcome; unflagged bills stay NaN, so count() below counts only flagged bills\n",
    "df.loc[df['lookup_id'].isin(died_in_study), 'ended_held_for_study'] = True\n",
    "df.loc[df['lookup_id'].isin(held_for_study), 'ever_held_for_study'] = True\n",
    "\n",
    "# one row per bill, then per-session counts (a list of columns is required by newer pandas)\n",
    "held = df.drop_duplicates('lookup_id')\n",
    "held = held.groupby('session')[['lookup_id', 'ever_held_for_study', 'ended_held_for_study']].count()\n",
    "\n",
    "# share of all bills (note: astype(int) truncates, so a product like 56.99999 displays as 56%)\n",
    "held['percent_ever_held'] = (round(held['ever_held_for_study']/held['lookup_id'], 2)*100).astype(int).astype(str) + '%'\n",
    "held['percent_ended_held'] = (round(held['ended_held_for_study']/held['lookup_id'], 2)*100).astype(int).astype(str) + '%'\n",
    "\n",
    "# share of held bills that died held\n",
    "held['percent_held_died'] = (round(held['ended_held_for_study']/held['ever_held_for_study'], 2)*100).astype(int).astype(str) + '%'\n",
    "\n",
    "held.columns = ['count_total', 'count_ever_held', 'count_died_held', \n",
    "                '(% total)_ever_held', '(% total)_died_held', '(% held)_died_held']\n",
    "\n",
    "cols = ['count_total','count_ever_held','(% total)_ever_held','count_died_held','(% total)_died_held', '(% held)_died_held']\n",
    "held = held.reindex(columns=cols)\n",
    "held"
   ]
  },
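  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The percent columns above are built with `round(x, 2) * 100` followed by `astype(int)`, which truncates instead of rounding. Floating-point products can land just below a whole number (in Python, `0.57 * 100` evaluates to `56.99999999999999`), so an entry can display one point low: for 2007, 1275 of 2255 bills is 56.54%, which rounds to 57%, yet the table shows 56%. Below is a minimal sketch of an alternative, using a hypothetical helper `pct` (not part of the original notebook) that scales first and rounds last."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "def pct(numer, denom):\n",
    "    # hypothetical helper: scale to percent first, then round to the nearest integer\n",
    "    return (100 * numer / denom).round().astype(int).astype(str) + '%'\n",
    "\n",
    "# counts copied from the 2007 and 2008 rows of the table above\n",
    "counts = pd.DataFrame({'total': [2255, 2159], 'ever_held': [1275, 1201]},\n",
    "                      index=[2007, 2008])\n",
    "\n",
    "# the notebook's approach: round the ratio, scale, then truncate\n",
    "truncated = (round(counts['ever_held'] / counts['total'], 2) * 100).astype(int).astype(str) + '%'\n",
    "\n",
    "# the alternative: scale, then round\n",
    "rounded = pct(counts['ever_held'], counts['total'])\n",
    "\n",
    "print(pd.DataFrame({'truncated': truncated, 'rounded': rounded}))"
   ]
  },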
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Bills Held & Died for Further Study by Year (Separated by Chamber)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>count_total</th>\n",
       "      <th>count_ever_held</th>\n",
       "      <th>(% total)_ever_held</th>\n",
       "      <th>count_died_held</th>\n",
       "      <th>(% total)_died_held</th>\n",
       "      <th>(% held)_died_held</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>chamber_origin</th>\n",
       "      <th>session</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"13\" valign=\"top\">house</th>\n",
       "      <th>2007</th>\n",
       "      <td>1279</td>\n",
       "      <td>813</td>\n",
       "      <td>64%</td>\n",
       "      <td>519</td>\n",
       "      <td>41%</td>\n",
       "      <td>64%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008</th>\n",
       "      <td>1184</td>\n",
       "      <td>737</td>\n",
       "      <td>62%</td>\n",
       "      <td>502</td>\n",
       "      <td>42%</td>\n",
       "      <td>68%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2009</th>\n",
       "      <td>1114</td>\n",
       "      <td>732</td>\n",
       "      <td>66%</td>\n",
       "      <td>485</td>\n",
       "      <td>44%</td>\n",
       "      <td>66%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2010</th>\n",
       "      <td>1037</td>\n",
       "      <td>620</td>\n",
       "      <td>60%</td>\n",
       "      <td>458</td>\n",
       "      <td>44%</td>\n",
       "      <td>74%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2011</th>\n",
       "      <td>1048</td>\n",
       "      <td>756</td>\n",
       "      <td>72%</td>\n",
       "      <td>528</td>\n",
       "      <td>50%</td>\n",
       "      <td>70%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012</th>\n",
       "      <td>1045</td>\n",
       "      <td>637</td>\n",
       "      <td>61%</td>\n",
       "      <td>422</td>\n",
       "      <td>40%</td>\n",
       "      <td>66%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2013</th>\n",
       "      <td>1076</td>\n",
       "      <td>757</td>\n",
       "      <td>70%</td>\n",
       "      <td>460</td>\n",
       "      <td>43%</td>\n",
       "      <td>61%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014</th>\n",
       "      <td>1126</td>\n",
       "      <td>815</td>\n",
       "      <td>72%</td>\n",
       "      <td>526</td>\n",
       "      <td>47%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2015</th>\n",
       "      <td>1072</td>\n",
       "      <td>814</td>\n",
       "      <td>76%</td>\n",
       "      <td>556</td>\n",
       "      <td>52%</td>\n",
       "      <td>68%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016</th>\n",
       "      <td>1140</td>\n",
       "      <td>906</td>\n",
       "      <td>79%</td>\n",
       "      <td>589</td>\n",
       "      <td>52%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017</th>\n",
       "      <td>1119</td>\n",
       "      <td>953</td>\n",
       "      <td>85%</td>\n",
       "      <td>624</td>\n",
       "      <td>56%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2018</th>\n",
       "      <td>1112</td>\n",
       "      <td>913</td>\n",
       "      <td>82%</td>\n",
       "      <td>625</td>\n",
       "      <td>56%</td>\n",
       "      <td>68%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019</th>\n",
       "      <td>973</td>\n",
       "      <td>826</td>\n",
       "      <td>85%</td>\n",
       "      <td>593</td>\n",
       "      <td>61%</td>\n",
       "      <td>72%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"13\" valign=\"top\">senate</th>\n",
       "      <th>2007</th>\n",
       "      <td>976</td>\n",
       "      <td>462</td>\n",
       "      <td>47%</td>\n",
       "      <td>290</td>\n",
       "      <td>30%</td>\n",
       "      <td>63%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2008</th>\n",
       "      <td>975</td>\n",
       "      <td>464</td>\n",
       "      <td>48%</td>\n",
       "      <td>277</td>\n",
       "      <td>28%</td>\n",
       "      <td>60%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2009</th>\n",
       "      <td>885</td>\n",
       "      <td>392</td>\n",
       "      <td>44%</td>\n",
       "      <td>255</td>\n",
       "      <td>28%</td>\n",
       "      <td>65%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2010</th>\n",
       "      <td>824</td>\n",
       "      <td>290</td>\n",
       "      <td>35%</td>\n",
       "      <td>160</td>\n",
       "      <td>19%</td>\n",
       "      <td>55%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2011</th>\n",
       "      <td>878</td>\n",
       "      <td>417</td>\n",
       "      <td>47%</td>\n",
       "      <td>239</td>\n",
       "      <td>27%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012</th>\n",
       "      <td>863</td>\n",
       "      <td>406</td>\n",
       "      <td>47%</td>\n",
       "      <td>229</td>\n",
       "      <td>27%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2013</th>\n",
       "      <td>831</td>\n",
       "      <td>477</td>\n",
       "      <td>56%</td>\n",
       "      <td>248</td>\n",
       "      <td>30%</td>\n",
       "      <td>52%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014</th>\n",
       "      <td>910</td>\n",
       "      <td>483</td>\n",
       "      <td>53%</td>\n",
       "      <td>246</td>\n",
       "      <td>27%</td>\n",
       "      <td>51%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2015</th>\n",
       "      <td>824</td>\n",
       "      <td>491</td>\n",
       "      <td>60%</td>\n",
       "      <td>279</td>\n",
       "      <td>34%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016</th>\n",
       "      <td>928</td>\n",
       "      <td>567</td>\n",
       "      <td>61%</td>\n",
       "      <td>322</td>\n",
       "      <td>35%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017</th>\n",
       "      <td>844</td>\n",
       "      <td>602</td>\n",
       "      <td>71%</td>\n",
       "      <td>337</td>\n",
       "      <td>40%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2018</th>\n",
       "      <td>823</td>\n",
       "      <td>560</td>\n",
       "      <td>68%</td>\n",
       "      <td>353</td>\n",
       "      <td>43%</td>\n",
       "      <td>63%</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019</th>\n",
       "      <td>815</td>\n",
       "      <td>619</td>\n",
       "      <td>76%</td>\n",
       "      <td>351</td>\n",
       "      <td>43%</td>\n",
       "      <td>56%</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                        count_total  count_ever_held (% total)_ever_held  \\\n",
       "chamber_origin session                                                     \n",
       "house          2007            1279              813                 64%   \n",
       "               2008            1184              737                 62%   \n",
       "               2009            1114              732                 66%   \n",
       "               2010            1037              620                 60%   \n",
       "               2011            1048              756                 72%   \n",
       "               2012            1045              637                 61%   \n",
       "               2013            1076              757                 70%   \n",
       "               2014            1126              815                 72%   \n",
       "               2015            1072              814                 76%   \n",
       "               2016            1140              906                 79%   \n",
       "               2017            1119              953                 85%   \n",
       "               2018            1112              913                 82%   \n",
       "               2019             973              826                 85%   \n",
       "senate         2007             976              462                 47%   \n",
       "               2008             975              464                 48%   \n",
       "               2009             885              392                 44%   \n",
       "               2010             824              290                 35%   \n",
       "               2011             878              417                 47%   \n",
       "               2012             863              406                 47%   \n",
       "               2013             831              477                 56%   \n",
       "               2014             910              483                 53%   \n",
       "               2015             824              491                 60%   \n",
       "               2016             928              567                 61%   \n",
       "               2017             844              602                 71%   \n",
       "               2018             823              560                 68%   \n",
       "               2019             815              619                 76%   \n",
       "\n",
       "                        count_died_held (% total)_died_held (% held)_died_held  \n",
       "chamber_origin session                                                          \n",
       "house          2007                 519                 41%                64%  \n",
       "               2008                 502                 42%                68%  \n",
       "               2009                 485                 44%                66%  \n",
       "               2010                 458                 44%                74%  \n",
       "               2011                 528                 50%                70%  \n",
       "               2012                 422                 40%                66%  \n",
       "               2013                 460                 43%                61%  \n",
       "               2014                 526                 47%                65%  \n",
       "               2015                 556                 52%                68%  \n",
       "               2016                 589                 52%                65%  \n",
       "               2017                 624                 56%                65%  \n",
       "               2018                 625                 56%                68%  \n",
       "               2019                 593                 61%                72%  \n",
       "senate         2007                 290                 30%                63%  \n",
       "               2008                 277                 28%                60%  \n",
       "               2009                 255                 28%                65%  \n",
       "               2010                 160                 19%                55%  \n",
       "               2011                 239                 27%                56%  \n",
       "               2012                 229                 27%                56%  \n",
       "               2013                 248                 30%                52%  \n",
       "               2014                 246                 27%                51%  \n",
       "               2015                 279                 34%                56%  \n",
       "               2016                 322                 35%                56%  \n",
       "               2017                 337                 40%                56%  \n",
       "               2018                 353                 43%                63%  \n",
       "               2019                 351                 43%                56%  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# one row per bill\n",
    "chamber = df.drop_duplicates('lookup_id').copy()\n",
    "\n",
    "# chamber of origin, inferred from the letter in the bill id\n",
    "chamber.loc[chamber['bill_id'].str.contains('H'), 'chamber_origin'] = 'house'\n",
    "chamber.loc[chamber['bill_id'].str.contains('S'), 'chamber_origin'] = 'senate'\n",
    "\n",
    "# per-chamber, per-session counts (a list of columns is required by newer pandas)\n",
    "held = chamber.groupby(['chamber_origin', 'session'])[['lookup_id', 'ever_held_for_study', 'ended_held_for_study']].count()\n",
    "\n",
    "# share of all bills (note: astype(int) truncates, so a product like 56.99999 displays as 56%)\n",
    "held['percent_ever_held'] = (round(held['ever_held_for_study']/held['lookup_id'], 2)*100).astype(int).astype(str) + '%'\n",
    "held['percent_ended_held'] = (round(held['ended_held_for_study']/held['lookup_id'], 2)*100).astype(int).astype(str) + '%'\n",
    "\n",
    "# share of held bills that died held\n",
    "held['percent_held_died'] = (round(held['ended_held_for_study']/held['ever_held_for_study'], 2)*100).astype(int).astype(str) + '%'\n",
    "\n",
    "held.columns = ['count_total', 'count_ever_held', 'count_died_held', \n",
    "                '(% total)_ever_held', '(% total)_died_held', '(% held)_died_held']\n",
    "\n",
    "cols = ['count_total','count_ever_held','(% total)_ever_held','count_died_held','(% total)_died_held', '(% held)_died_held']\n",
    "held = held.reindex(columns=cols)\n",
    "held"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating an Extract of Bills Held for Study"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# reload and re-clean the actions (same steps as at the top of the notebook)\n",
    "df = pd.read_csv('./data/clean/bill_actions.csv')\n",
    "\n",
    "# drop resolutions (bill ids containing 'R')\n",
    "df = df[~df['bill_id'].str.contains('R')]\n",
    "\n",
    "# drop duplicate rows (created when splitting action_type)\n",
    "df = df.drop_duplicates(subset=['action', 'lookup_id'])\n",
    "\n",
    "# number each bill's actions in order (0-based)\n",
    "df['action_num'] = df.groupby('lookup_id').cumcount()"
   ]
  },
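  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# creating dataset for held for study tableau sheet\n",
    "# (uses died_in_study and held_for_study computed earlier in the notebook)\n",
    "df.loc[df['lookup_id'].isin(died_in_study), 'ended_held_for_study'] = True\n",
    "df.loc[df['lookup_id'].isin(held_for_study), 'ever_held_for_study'] = True\n",
    "\n",
    "# unflagged bills become False (note: this fills missing values in every column, not only the two flags)\n",
    "df = df.fillna(False)\n",
    "\n",
    "# committee = text after the last ', ' in each bill's first action\n",
    "committee = df.groupby('lookup_id')['action'].agg(['first']).reset_index()\n",
    "committee['first'] = committee['first'].apply(lambda x: x.split(', ')[-1])\n",
    "committee = committee.rename(columns={'first':'committee'})\n",
    "\n",
    "# attach the committee name to every action row of the bill\n",
    "df = df.merge(committee, on='lookup_id')"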
   ]
  },
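  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The committee name above is taken as the text after the last `', '` in a bill's first action. A quick sketch with hypothetical action strings (the real wording in `bill_actions.csv` may differ) shows the behavior, including what happens when the string has no comma: the whole string is returned unchanged."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# hypothetical first-action strings, for illustration only\n",
    "examples = [\n",
    "    'Introduced, referred to House Judiciary',\n",
    "    'Introduced',\n",
    "]\n",
    "\n",
    "for action in examples:\n",
    "    # same expression as the lambda above: keep the last comma-separated piece\n",
    "    print(repr(action.split(', ')[-1]))"
   ]
  },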
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv('./data/clean/held_for_study.csv', index=False)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}