{ "cells": [ { "cell_type": "code", "execution_count": 24, "id": "e06644f1-96f9-46ef-9ba0-23346e145442", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import statsmodels.formula.api as smf\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "fb7b16df-0987-4a58-a76c-7392c2729bea", "metadata": {}, "source": [ "# What Is Panel Data? " ] }, { "cell_type": "markdown", "id": "0a02ed5a-8026-4e01-a1a9-8f33dbf46a62", "metadata": {}, "source": [ "**Panel data** is a hybrid data type that has feature of both _cross section_ and _time series_. Actually panel data are the most common data type in industry, for instance a car manufacturer has record of its suppliers' price level over time, a bank has full history of its clients' monthly balance for many years. Needless to say, to carry out serious researches, you must use panel data.\n", "\n", "\n", "Here we will use the data from \"Why has Productivity Declined? Productivity and Public Investment\" written by Munell, A.\n", "\n", "Variable names defined as below:\n", "```\n", "STATE = state name\n", "ST_ABB = state abbreviation\n", "YR = 1970,...,1986\n", "P_CAP = public capital\n", "HWY = highway capital\n", "WATER = water utility capital\n", "UTIL = utility capital\n", "PC = private capital\n", "GSP = gross state product\n", "EMP = employment\n", "UNEMP = unemployment rate\n", "```" ] }, { "cell_type": "code", "execution_count": 8, "id": "a141ba3c-91e5-409f-bda2-65e6190db5d3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " STATE YR P_CAP HWY WATER UTIL PC GSP \\\n", "0 ALABAMA 1970 15032.67 7325.80 1655.68 6051.20 35793.80 28418 \n", "1 ALABAMA 1971 15501.94 7525.94 1721.02 6254.98 37299.91 29375 \n", "2 ALABAMA 1972 15972.41 7765.42 1764.75 6442.23 38670.30 31303 \n", "3 ALABAMA 1973 16406.26 7907.66 1742.41 6756.19 40084.01 33430 \n", "4 ALABAMA 1974 16762.67 8025.52 1734.85 7002.29 42057.31 33749 \n", "\n", " EMP UNEMP \n", "0 1010.5 4.7 \n", "1 1021.9 5.2 \n", "2 1072.3 4.7 \n", "3 1135.5 3.9 \n", "4 1169.8 5.5 " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_excel(\n", " \"Basic_Econometrics_practice_data.xlsx\", sheet_name=\"Prod_PubInvestment\"\n", ")\n", "df.head(5)" ] }, { "cell_type": "code", "execution_count": 9, "id": "c8b5605d-b401-4bc9-890a-a3f00afd772d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " STATE YR P_CAP HWY WATER UTIL PC GSP EMP \\\n", "811 WYOMING 1982 4731.98 3060.64 408.43 1262.90 27724.96 13056 217.7 \n", "812 WYOMING 1983 4950.82 3119.98 445.59 1385.25 28586.46 11922 202.5 \n", "813 WYOMING 1984 5184.73 3195.68 476.57 1512.48 28794.80 12073 204.3 \n", "814 WYOMING 1985 5448.38 3295.92 523.01 1629.45 29326.94 12022 206.9 \n", "815 WYOMING 1986 5700.41 3400.96 565.58 1733.88 27110.51 10870 196.3 \n", "\n", " UNEMP \n", "811 5.8 \n", "812 8.4 \n", "813 6.3 \n", "814 7.1 \n", "815 9 " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.tail(5)" ] }, { "cell_type": "markdown", "id": "36072c69-8757-4984-8712-3b13221299e1", "metadata": {}, "source": [ "Each state is recorded over time in several aspects, such as public capitals, highway capital, water facility capital and etc. If each state is recorded in equal length of time period, we call it **balanced panel**, otherwise **unbalanced panel**." ] }, { "cell_type": "markdown", "id": "d0908310-5bde-42e3-aa58-137220ac88c9", "metadata": {}, "source": [ "Estimation methods includes four approaches\n", "1. Pooled OLS model\n", "2. Fixed effects least square dummy variable (LSDV) model\n", "3. Fixed effects within-in group model\n", "4. Random effects model" ] }, { "cell_type": "markdown", "id": "e2c7f836-af65-45cb-83b9-58faf37924c3", "metadata": {}, "source": [ "# Pooled OLS Regression " ] }, { "cell_type": "markdown", "id": "820001b6-e371-4c16-a6cf-667a2d030052", "metadata": {}, "source": [ "\\begin{aligned}\n", "ln{GSP}_{i t} &=\\beta_{1}+\\beta_{2} \\ln{PCAP}_{i t}+\\beta_{3} \\ln{HWY}_{i t}+\\beta_{4} \\ln{WATER}_{i t}+\\beta_{5} \\ln{UTIL}_{i t}+\\beta_{6} \\ln{EMP}_{i t}+u_{i t}\n", "\\end{aligned}\n", "where $i$ means the $i$the state, $t$ means time period." 
] }, { "cell_type": "code", "execution_count": 23, "id": "b8ed122c-cc95-4b4e-a82c-701e7b3b0a64", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " OLS Regression Results \n", "==============================================================================\n", "Dep. Variable: np.log(GSP) R-squared: 0.993\n", "Model: OLS Adj. R-squared: 0.993\n", "Method: Least Squares F-statistic: 1.971e+04\n", "Date: Mon, 11 Oct 2021 Prob (F-statistic): 0.00\n", "Time: 22:39:33 Log-Likelihood: 862.28\n", "No. Observations: 816 AIC: -1711.\n", "Df Residuals: 809 BIC: -1678.\n", "Df Model: 6 \n", "Covariance Type: nonrobust \n", "=================================================================================\n", " coef std err t P>|t| [0.025 0.975]\n", "---------------------------------------------------------------------------------\n", "Intercept 1.2474 0.111 11.245 0.000 1.030 1.465\n", "np.log(P_CAP) 0.7432 0.109 6.824 0.000 0.529 0.957\n", "np.log(PC) 0.3124 0.011 28.553 0.000 0.291 0.334\n", "np.log(HWY) -0.3289 0.060 -5.520 0.000 -0.446 -0.212\n", "np.log(WATER) 0.0276 0.018 1.568 0.117 -0.007 0.062\n", "np.log(UTIL) -0.3036 0.047 -6.510 0.000 -0.395 -0.212\n", "np.log(EMP) 0.5932 0.016 36.761 0.000 0.561 0.625\n", "==============================================================================\n", "Omnibus: 17.950 Durbin-Watson: 0.200\n", "Prob(Omnibus): 0.000 Jarque-Bera (JB): 19.040\n", "Skew: 0.328 Prob(JB): 7.34e-05\n", "Kurtosis: 3.360 Cond. No. 1.23e+03\n", "==============================================================================\n", "\n", "Notes:\n", "[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n", "[2] The condition number is large, 1.23e+03. 
This might indicate that there are\n", "strong multicollinearity or other numerical problems.\n" ] } ], "source": [ "model = smf.ols(\n", " formula=\"np.log(GSP) ~ np.log(P_CAP) + np.log(PC) + np.log(HWY) + np.log(WATER) + np.log(UTIL) + np.log(EMP)\",\n", " data=df,\n", ")\n", "results = model.fit()\n", "print(results.summary())" ] }, { "cell_type": "markdown", "id": "4720a771-23cb-4463-9821-f577487ac310", "metadata": {}, "source": [ "A common symptom of pooled regression on panel data is that almost all coefficients are highly significant and $R^2$ is exceedingly high. However, we can still spot some problems: the condition number is high, which suggests multicollinearity, and the Durbin-Watson statistic is close to $0$, which suggests positive autocorrelation or a specification error." ] }, { "cell_type": "markdown", "id": "b23077b7-72ad-4160-b4ad-85f3d2c40d9f", "metadata": {}, "source": [ "But the most prominent issue of this model is that it camouflages the heterogeneity that may exist among states. The heterogeneity of each state is subsumed by the disturbance term. If that heterogeneity is correlated with the independent variables, the regressors are correlated with the disturbance term, and the OLS estimates are biased and inconsistent." 
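, "\n", "\n", "As a rough sketch of this argument, suppose the pooled disturbance contains an unobserved state-specific component. The symbols $\mu_i$ and $\varepsilon_{it}$ are notation assumed here for illustration only and are not used elsewhere in this notebook:\n", "\n", "\begin{aligned}\n", "u_{i t} &=\mu_{i}+\varepsilon_{i t}\n", "\end{aligned}\n", "\n", "If $\mu_i$ is correlated with a regressor, say $\ln{PCAP}_{it}$, then that regressor is correlated with $u_{it}$ as well, and its pooled OLS coefficient absorbs part of the effect of $\mu_i$."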
] }, { "cell_type": "markdown", "id": "78c169eb-c41a-49be-845b-4ffc83eb55dc", "metadata": {}, "source": [ "# The Fixed Effect LSDV Model" ] }, { "cell_type": "markdown", "id": "3f03cb2e-1c90-448a-af78-e711b7930963", "metadata": {}, "source": [ "The LSDV model accounts for heterogeneity by giving each state its own intercept:" ] }, { "cell_type": "markdown", "id": "6e607ccd-5ae9-41ec-ae57-aef27cfd7c4e", "metadata": {}, "source": [ "\begin{aligned}\n", "\ln{GSP}_{i t} &=\beta_{1i}+\beta_{2} \ln{PCAP}_{i t}+\beta_{3} \ln{HWY}_{i t}+\beta_{4} \ln{WATER}_{i t}+\beta_{5} \ln{UTIL}_{i t}+\beta_{6} \ln{EMP}_{i t}+u_{i t}\n", "\end{aligned}" ] }, { "cell_type": "markdown", "id": "0b086044-fe83-42c4-b469-7badf3b491fd", "metadata": {}, "source": [ "$\beta_{1i}$ represents the intercept for state $i$. Many factors can contribute to heterogeneity among states, such as population, average education level and urbanization rate.\n", "\n", "_Fixed effect_ means that although each state has its own intercept, that intercept is **time-invariant**, i.e. constant over time. If we assumed a **time-variant** intercept, the notation would be $\beta_{1it}$." ] }, { "cell_type": "code", "execution_count": 25, "id": "effe1439-8a83-4f12-8a92-ab1b40df9b51", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " STATE YR P_CAP HWY WATER UTIL PC GSP \\\n", "0 ALABAMA 1970 15032.67 7325.80 1655.68 6051.20 35793.80 28418 \n", "1 ALABAMA 1971 15501.94 7525.94 1721.02 6254.98 37299.91 29375 \n", "2 ALABAMA 1972 15972.41 7765.42 1764.75 6442.23 38670.30 31303 \n", "3 ALABAMA 1973 16406.26 7907.66 1742.41 6756.19 40084.01 33430 \n", "4 ALABAMA 1974 16762.67 8025.52 1734.85 7002.29 42057.31 33749 \n", ".. ... ... ... ... ... ... ... ... \n", "811 WYOMING 1982 4731.98 3060.64 408.43 1262.90 27724.96 13056 \n", "812 WYOMING 1983 4950.82 3119.98 445.59 1385.25 28586.46 11922 \n", "813 WYOMING 1984 5184.73 3195.68 476.57 1512.48 28794.80 12073 \n", "814 WYOMING 1985 5448.38 3295.92 523.01 1629.45 29326.94 12022 \n", "815 WYOMING 1986 5700.41 3400.96 565.58 1733.88 27110.51 10870 \n", "\n", " EMP UNEMP \n", "0 1010.5 4.7 \n", "1 1021.9 5.2 \n", "2 1072.3 4.7 \n", "3 1135.5 3.9 \n", "4 1169.8 5.5 \n", ".. ... ... \n", "811 217.7 5.8 \n", "812 202.5 8.4 \n", "813 204.3 6.3 \n", "814 206.9 7.1 \n", "815 196.3 9 \n", "\n", "[816 rows x 10 columns]" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 44, "id": "83cbc246-eb70-4542-a51b-fe4ab08453a0", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(nrows=2, ncols=3, figsize=(18, 12))\n", "ax[0, 0].scatter(df[\"GSP\"], df[\"P_CAP\"], c=\"r\", s=5)\n", "ax[0, 0].grid()\n", "ax[0, 0].set_xlabel(\"Public Capital\")\n", "ax[0, 0].set_ylabel(\"Gross Regional Produce\")\n", "\n", "ax[0, 1].scatter(df[\"GSP\"], df[\"HWY\"], c=\"r\", s=5)\n", "ax[0, 1].grid()\n", "ax[0, 1].set_xlabel(\"High Way Capital\")\n", "ax[0, 1].set_ylabel(\"Gross Regional Produce\")\n", "\n", "ax[0, 2].scatter(df[\"GSP\"], df[\"WATER\"], c=\"r\", s=5)\n", "ax[0, 2].grid()\n", "ax[0, 2].set_xlabel(\"Water Facility\")\n", "ax[0, 2].set_ylabel(\"Gross Regional Produce\")\n", "\n", "ax[1, 0].scatter(df[\"GSP\"], df[\"UTIL\"], c=\"r\", s=5)\n", "ax[1, 0].grid()\n", "ax[1, 0].set_xlabel(\"Utiltiy Capital\")\n", "ax[1, 0].set_ylabel(\"Gross Regional Produce\")\n", "\n", "ax[1, 1].scatter(df[\"GSP\"], df[\"PC\"], c=\"r\", s=5)\n", "ax[1, 1].grid()\n", "ax[1, 1].set_xlabel(\"Private Capital\")\n", "ax[1, 1].set_ylabel(\"Gross Regional Produce\")\n", "\n", "ax[1, 2].scatter(df[\"GSP\"], df[\"EMP\"], c=\"r\", s=5)\n", "ax[1, 2].grid()\n", "ax[1, 2].set_xlabel(\"Employement\")\n", "ax[1, 2].set_ylabel(\"Gross Regional Produce\")\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "id": "40ad195a-56c6-4654-9b74-8d2c47765e91", "metadata": {}, "source": [ "Check how many states are there in the panel data" ] }, { "cell_type": "code", "execution_count": 59, "id": "17d43c76-2cf1-45ce-b336-b9a5186dac39", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['ALABAMA' 'ARIZONA' 'ARKANSAS' 'CALIFORNIA' 'COLORADO' 'CONNECTICUT'\n", " 'DELAWARE' 'FLORIDA' 'GEORGIA' 'IDAHO' 'ILLINOIS' 'INDIANA' 'IOWA'\n", " 'KANSAS' 'KENTUCKY' 'LOUISIANA' 'MAINE' 'MARYLAND' 'MASSACHUSETTS'\n", " 'MICHIGAN' 'MINNESOTA' 'MISSISSIPPI' 'MISSOURI' 'MONTANA' 'NEBRASKA'\n", " 'NEVADA' 'NEW_HAMPSHIRE' 'NEW_JERSEY' 'NEW_MEXICO' 
'NEW_YORK'\n", " 'NORTH_CAROLINA' 'NORTH_DAKOTA' 'OHIO' 'OKLAHOMA' 'OREGON' 'PENNSYLVANIA'\n", " 'RHODE_ISLAND' 'SOUTH_CAROLINA' 'SOUTH_DAKOTA' 'TENNESSE' 'TEXAS' 'UTAH'\n", " 'VERMONT' 'VIRGINIA' 'WASHINGTON' 'WEST_VIRGINIA' 'WISCONSIN' 'WYOMING']\n", "48\n" ] } ], "source": [ "print(df[\"STATE\"].unique())\n", "print(len(df[\"STATE\"].unique()))" ] }, { "cell_type": "markdown", "id": "7c36c7f0-ae11-468b-a504-013dee943165", "metadata": {}, "source": [ "To avoid the dummy variable trap, we define $47$ dummy variables, one fewer than the $48$ states, and keep a common intercept for the omitted state." ] }, { "cell_type": "markdown", "id": "460c162d-8930-43c2-a66a-52309fa27c35", "metadata": {}, "source": [ "Add the dummies to the intercept, where $D_{ji}=1$ if state $i$ is state $j$ and $0$ otherwise:\n", "\n", "\begin{aligned}\n", "\ln{GSP}_{i t} &=\alpha_{1}+ \sum_{j=2}^{48}\alpha_{j} D_{j i}+\beta_{2} \ln{PCAP}_{i t}+\beta_{3} \ln{HWY}_{i t}+\beta_{4} \ln{WATER}_{i t}+\beta_{5} \ln{UTIL}_{i t}+\beta_{6} \ln{EMP}_{i t}+u_{i t}\n", "\end{aligned}\n" ] }, { "cell_type": "markdown", "id": "6746e1b1-476a-466c-a615-45dc2fa832df", "metadata": {}, "source": [ "Use ```STATE``` as the dummy column and pass ```drop_first=True``` to avoid the dummy variable trap. Alabama, the first state, becomes the reference category." ] }, { "cell_type": "code", "execution_count": 62, "id": "5673cd51-ca67-4bb0-8e9f-99d8fa725deb", "metadata": { "tags": [] }, "outputs": [], "source": [ "# dtype=int keeps the dummies as 0/1 integers on newer pandas versions.\n", "df_dum = pd.get_dummies(data=df, columns=[\"STATE\"], drop_first=True, dtype=int)" ] }, { "cell_type": "code", "execution_count": 72, "id": "a647cc53-ab42-4e03-961b-2eba9e3864e8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " YR P_CAP HWY WATER UTIL PC GSP EMP UNEMP \\\n", "0 1970 15032.67 7325.80 1655.68 6051.20 35793.80 28418 1010.5 4.7 \n", "1 1971 15501.94 7525.94 1721.02 6254.98 37299.91 29375 1021.9 5.2 \n", "2 1972 15972.41 7765.42 1764.75 6442.23 38670.30 31303 1072.3 4.7 \n", "3 1973 16406.26 7907.66 1742.41 6756.19 40084.01 33430 1135.5 3.9 \n", "4 1974 16762.67 8025.52 1734.85 7002.29 42057.31 33749 1169.8 5.5 \n", ".. ... ... ... ... ... ... ... ... ... \n", "811 1982 4731.98 3060.64 408.43 1262.90 27724.96 13056 217.7 5.8 \n", "812 1983 4950.82 3119.98 445.59 1385.25 28586.46 11922 202.5 8.4 \n", "813 1984 5184.73 3195.68 476.57 1512.48 28794.80 12073 204.3 6.3 \n", "814 1985 5448.38 3295.92 523.01 1629.45 29326.94 12022 206.9 7.1 \n", "815 1986 5700.41 3400.96 565.58 1733.88 27110.51 10870 196.3 9 \n", "\n", " STATE_ARIZONA ... STATE_SOUTH_DAKOTA STATE_TENNESSE STATE_TEXAS \\\n", "0 0 ... 0 0 0 \n", "1 0 ... 0 0 0 \n", "2 0 ... 0 0 0 \n", "3 0 ... 0 0 0 \n", "4 0 ... 0 0 0 \n", ".. ... ... ... ... ... \n", "811 0 ... 0 0 0 \n", "812 0 ... 0 0 0 \n", "813 0 ... 0 0 0 \n", "814 0 ... 0 0 0 \n", "815 0 ... 0 0 0 \n", "\n", " STATE_UTAH STATE_VERMONT STATE_VIRGINIA STATE_WASHINGTON \\\n", "0 0 0 0 0 \n", "1 0 0 0 0 \n", "2 0 0 0 0 \n", "3 0 0 0 0 \n", "4 0 0 0 0 \n", ".. ... ... ... ... \n", "811 0 0 0 0 \n", "812 0 0 0 0 \n", "813 0 0 0 0 \n", "814 0 0 0 0 \n", "815 0 0 0 0 \n", "\n", " STATE_WEST_VIRGINIA STATE_WISCONSIN STATE_WYOMING \n", "0 0 0 0 \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", ".. ... ... ... 
\n", "811 0 0 1 \n", "812 0 0 1 \n", "813 0 0 1 \n", "814 0 0 1 \n", "815 0 0 1 \n", "\n", "[816 rows x 56 columns]" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_dum" ] }, { "cell_type": "code", "execution_count": null, "id": "88526b36-1894-4fb6-810b-b36cbb439dec", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 5 }