{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Hypothesis testing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hypothesis: University towns and non-university towns have same effect on their mean housing prices during recessions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data sources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1) List of university towns from wikipedia. This is a text file with 567 lines either containing state name or the name of city and the university in the city.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2) Housing data from Zillow containg information about house sale prices in each month from the year 1996 to 2016. A snapshot of the csv file is shown here.\n", "\n", "" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "3) Quarterly GDP data from US Department of Commerce, Bureau of Economic Analysis. A snapshot of the excel file is shown below.\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "from scipy.stats import ttest_ind\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from scipy.stats import norm" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StateRegionName
0AlabamaAuburn
1AlabamaFlorence
2AlabamaJacksonville
3AlabamaLivingston
4AlabamaMontevallo
.........
512WisconsinRiver Falls
513WisconsinStevens Point
514WisconsinWaukesha
515WisconsinWhitewater
516WyomingLaramie
\n", "

517 rows × 2 columns

\n", "
" ], "text/plain": [ " State RegionName\n", "0 Alabama Auburn\n", "1 Alabama Florence\n", "2 Alabama Jacksonville\n", "3 Alabama Livingston\n", "4 Alabama Montevallo\n", ".. ... ...\n", "512 Wisconsin River Falls\n", "513 Wisconsin Stevens Point\n", "514 Wisconsin Waukesha\n", "515 Wisconsin Whitewater\n", "516 Wyoming Laramie\n", "\n", "[517 rows x 2 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def get_list_of_university_towns():\n", " '''Returns a DataFrame of towns and the states they are in from the \n", " university_towns.txt list. \n", " \n", " Following steps were taken to extract the required information in the desired format:\n", "\n", " 1. For \"State\", removing characters from \"[\" to the end.\n", " 2. For \"RegionName\", when applicable, removing every character from \" (\" to the end.'''\n", " uni_towns=pd.DataFrame(columns=['State','RegionName'])\n", " with open('university_towns.txt', 'r') as f:\n", "\n", " #Read non-empty lines:\n", " data = (line.rstrip() for line in f)\n", " lines = list(line for line in data if line)\n", "\n", " #Get the index of states:\n", " r_idx = [lines.index(line) for line in lines if '[edit]' in line]\n", "\n", " #Separating states and university names using wrapping indexes:\n", " university = []\n", " state = [lines[i].replace('[edit]', '') for i in r_idx]\n", " for i in range(len(r_idx)):\n", " if i != len(r_idx)-1:\n", " index_range=[j for j in range(r_idx[i]+1,r_idx[i+1])]\n", " regions=[lines[j].split(\" (\")[0] for j in index_range]\n", " for region in regions:\n", " uni_towns=uni_towns.append(pd.DataFrame([[state[i],region]],columns=['State','RegionName']),ignore_index=True)\n", " else:\n", " index_range=[j for j in range(r_idx[i]+1,len(lines))]\n", " regions = [lines[j].split(\" (\")[0] for j in index_range]\n", " for region in regions:\n", " uni_towns=uni_towns.append(pd.DataFrame([[state[i],region]],columns=['State','RegionName']),ignore_index=True)\n", " return uni_towns\n", "get_list_of_university_towns()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Get the quarter when recession starts" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recession start is defined by two consecutive quarters of decline in GDP." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2008q3'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def get_recession_start():\n", " '''Returns the year and quarter of the recession start time after year 2000.'''\n", " gdp=pd.read_excel('gdplev.xls',usecols=\"E,G\",skiprows=219,names=['Quarter','GDP']).set_index('Quarter')\n", " gdp['GDP change']=gdp['GDP']-gdp['GDP'].shift(+1)\n", " gdp['RecessionStart'] = (gdp['GDP change'] < 0) & (gdp['GDP change'].shift(-1) < 0)& (gdp['GDP change'].shift(+1) > 0)\n", " return gdp.index[gdp['RecessionStart']][0]\n", "get_recession_start()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Get recession end quarter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recession end is defined by two consecutive quarters with GDP increase." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2009q4'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def get_recession_end():\n", " '''Returns the year and quarter of the recession end time'''\n", " rstart=get_recession_start()\n", " gdp=pd.read_excel('gdplev.xls',usecols=\"E,G\",skiprows=219,names=['Quarter','GDP']).set_index('Quarter')\n", " gdp['GDP change']=gdp['GDP']-gdp['GDP'].shift(+1)\n", " gdp['RecessionEnd'] = (gdp['GDP change'] > 0) & (gdp['GDP change'].shift(+1) > 0)&(gdp['GDP change'].shift(+2) < 0)\n", " #return gdp\n", " list_of_year=list(gdp.index[gdp['RecessionEnd']])\n", " for year in list_of_year:\n", " if(year>rstart):\n", " rend=year\n", " break\n", " #rend=[year in list_of_year if(year > rstart)]\n", " return rend\n", "get_recession_end()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Get recession bottom" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recession bottom is defined as the quarter with minimum GDP between the start and end of recession" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/shivangi/anaconda2/envs/python3/lib/python3.7/site-packages/numpy/core/fromnumeric.py:61: FutureWarning: \n", "The current behaviour of 'Series.argmin' is deprecated, use 'idxmin'\n", "instead.\n", "The behavior of 'argmin' will be corrected to return the positional\n", "minimum in the future. For now, use 'series.values.argmin' or\n", "'np.argmin(np.array(values))' to get the position of the minimum\n", "row.\n", " return bound(*args, **kwds)\n" ] }, { "data": { "text/plain": [ "'2009q2'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def get_recession_bottom():\n", " '''Returns the year and quarter of the recession bottom time'''\n", " rstart=get_recession_start()\n", " rend=get_recession_end()\n", " gdp=pd.read_excel('gdplev.xls',usecols=\"E,G\",skiprows=219,names=['Quarter','GDP']).set_index('Quarter')\n", " rbottom=np.argmin(gdp.loc[rstart:rend,'GDP'])\n", " return rbottom\n", "get_recession_bottom()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Converting the monthly housing prices to quarterly prices" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Use this dictionary to map state names to two letter acronyms\n", "states = {'OH': 'Ohio', 'KY': 'Kentucky', 'AS': 'American Samoa', 'NV': 'Nevada', 'WY': 'Wyoming', 'NA': 'National', 'AL': 'Alabama', 'MD': 'Maryland', 'AK': 'Alaska', 'UT': 'Utah', 'OR': 'Oregon', 'MT': 'Montana', 'IL': 'Illinois', 'TN': 'Tennessee', 'DC': 'District of Columbia', 'VT': 'Vermont', 'ID': 'Idaho', 'AR': 'Arkansas', 'ME': 'Maine', 'WA': 'Washington', 'HI': 'Hawaii', 'WI': 'Wisconsin', 'MI': 'Michigan', 'IN': 'Indiana', 'NJ': 'New Jersey', 'AZ': 'Arizona', 'GU': 'Guam', 'MS': 'Mississippi', 'PR': 'Puerto Rico', 'NC': 'North Carolina', 'TX': 'Texas', 'SD': 'South Dakota', 'MP': 'Northern Mariana Islands', 'IA': 'Iowa', 'MO': 'Missouri', 'CT': 'Connecticut', 'WV': 'West Virginia', 'SC': 'South Carolina', 'LA': 'Louisiana', 'KS': 'Kansas', 'NY': 'New York', 'NE': 'Nebraska', 'OK': 'Oklahoma', 'FL': 'Florida', 'CA': 'California', 'CO': 'Colorado', 'PA': 'Pennsylvania', 'DE': 'Delaware', 'NM': 'New Mexico', 'RI': 'Rhode Island', 'MN': 'Minnesota', 'VI': 'Virgin Islands', 'NH': 'New Hampshire', 'MA': 'Massachusetts', 'GA': 'Georgia', 'ND': 'North Dakota', 'VA': 'Virginia'}" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
yearQ2000q12000q22000q32000q42001q12001q22001q32001q42002q12002q2...2014q22014q32014q42015q12015q22015q32015q42016q12016q22016q3
StateRegionName
AlabamaAdamsville69033.33333369166.66666769800.00000071966.66666773466.66666774000.00000073333.33333373100.00000073333.33333373133.333333...77066.66666775966.66666771900.00000071666.66666773033.33333373933.33333373866.66666774166.66666774933.33333374700.0
Alabaster122133.333333123066.666667123166.666667123700.000000123233.333333125133.333333127766.666667127200.000000127300.000000128000.000000...147133.333333147633.333333148700.000000148900.000000149566.666667150366.666667151733.333333153466.666667155100.000000155850.0
Albertville73966.66666772600.00000072833.33333374200.00000075900.00000076000.00000072066.66666773566.66666776533.33333376366.666667...84033.33333384766.66666786800.00000088466.66666789500.00000090233.33333391366.66666792000.00000092466.66666792200.0
Arab83766.66666781566.66666781333.33333382966.66666784200.00000084533.33333381666.66666783900.00000087266.66666787700.000000...113366.666667111700.000000111600.000000110166.666667109433.333333110900.000000112233.333333110033.333333110100.000000112000.0
ArdmoreNaNNaNNaNNaNNaNNaNNaNNaNNaNNaN...140533.333333139566.666667140900.000000143233.333333143000.000000144600.000000143966.666667142566.666667143233.333333141950.0
.....................................................................
WyomingBurns101533.333333104566.666667108366.666667113000.000000115833.333333117200.000000117800.000000117633.333333117333.333333117233.333333...168866.666667161933.333333160433.333333162600.000000163066.666667164600.000000164300.000000168266.666667171600.000000170500.0
Casper89233.33333389600.00000089733.33333393166.66666795500.00000097633.33333399433.333333100633.333333101733.333333101533.333333...175766.666667177300.000000181000.000000182066.666667182633.333333183300.000000182700.000000184333.333333185166.666667184350.0
Cheyenne116866.666667120033.333333121533.333333123633.333333125533.333333126300.000000126466.666667128133.333333128466.666667129633.333333...177466.666667176733.333333178766.666667181700.000000183266.666667186766.666667190666.666667194433.333333196500.000000199100.0
Evansville128033.333333128766.666667130833.333333132066.666667130566.666667131433.333333132400.000000133466.666667133300.000000131066.666667...296733.333333305666.666667309500.000000307300.000000303166.666667300966.666667304200.000000309433.333333309200.000000309050.0
Pine Bluffs93733.33333395066.66666794633.33333398066.666667103233.333333104600.000000106500.000000104066.666667102233.333333103566.666667...148666.666667154366.666667158100.000000163900.000000167433.333333167466.666667169200.000000166833.333333163800.000000157650.0
\n", "

10730 rows × 67 columns

\n", "
" ], "text/plain": [ "yearQ 2000q1 2000q2 2000q3 \\\n", "State RegionName \n", "Alabama Adamsville 69033.333333 69166.666667 69800.000000 \n", " Alabaster 122133.333333 123066.666667 123166.666667 \n", " Albertville 73966.666667 72600.000000 72833.333333 \n", " Arab 83766.666667 81566.666667 81333.333333 \n", " Ardmore NaN NaN NaN \n", "... ... ... ... \n", "Wyoming Burns 101533.333333 104566.666667 108366.666667 \n", " Casper 89233.333333 89600.000000 89733.333333 \n", " Cheyenne 116866.666667 120033.333333 121533.333333 \n", " Evansville 128033.333333 128766.666667 130833.333333 \n", " Pine Bluffs 93733.333333 95066.666667 94633.333333 \n", "\n", "yearQ 2000q4 2001q1 2001q2 \\\n", "State RegionName \n", "Alabama Adamsville 71966.666667 73466.666667 74000.000000 \n", " Alabaster 123700.000000 123233.333333 125133.333333 \n", " Albertville 74200.000000 75900.000000 76000.000000 \n", " Arab 82966.666667 84200.000000 84533.333333 \n", " Ardmore NaN NaN NaN \n", "... ... ... ... \n", "Wyoming Burns 113000.000000 115833.333333 117200.000000 \n", " Casper 93166.666667 95500.000000 97633.333333 \n", " Cheyenne 123633.333333 125533.333333 126300.000000 \n", " Evansville 132066.666667 130566.666667 131433.333333 \n", " Pine Bluffs 98066.666667 103233.333333 104600.000000 \n", "\n", "yearQ 2001q3 2001q4 2002q1 \\\n", "State RegionName \n", "Alabama Adamsville 73333.333333 73100.000000 73333.333333 \n", " Alabaster 127766.666667 127200.000000 127300.000000 \n", " Albertville 72066.666667 73566.666667 76533.333333 \n", " Arab 81666.666667 83900.000000 87266.666667 \n", " Ardmore NaN NaN NaN \n", "... ... ... ... \n", "Wyoming Burns 117800.000000 117633.333333 117333.333333 \n", " Casper 99433.333333 100633.333333 101733.333333 \n", " Cheyenne 126466.666667 128133.333333 128466.666667 \n", " Evansville 132400.000000 133466.666667 133300.000000 \n", " Pine Bluffs 106500.000000 104066.666667 102233.333333 \n", "\n", "yearQ 2002q2 ... 2014q2 2014q3 \\\n", "State RegionName ... \n", "Alabama Adamsville 73133.333333 ... 77066.666667 75966.666667 \n", " Alabaster 128000.000000 ... 147133.333333 147633.333333 \n", " Albertville 76366.666667 ... 84033.333333 84766.666667 \n", " Arab 87700.000000 ... 113366.666667 111700.000000 \n", " Ardmore NaN ... 140533.333333 139566.666667 \n", "... ... ... ... ... \n", "Wyoming Burns 117233.333333 ... 168866.666667 161933.333333 \n", " Casper 101533.333333 ... 175766.666667 177300.000000 \n", " Cheyenne 129633.333333 ... 177466.666667 176733.333333 \n", " Evansville 131066.666667 ... 296733.333333 305666.666667 \n", " Pine Bluffs 103566.666667 ... 148666.666667 154366.666667 \n", "\n", "yearQ 2014q4 2015q1 2015q2 \\\n", "State RegionName \n", "Alabama Adamsville 71900.000000 71666.666667 73033.333333 \n", " Alabaster 148700.000000 148900.000000 149566.666667 \n", " Albertville 86800.000000 88466.666667 89500.000000 \n", " Arab 111600.000000 110166.666667 109433.333333 \n", " Ardmore 140900.000000 143233.333333 143000.000000 \n", "... ... ... ... \n", "Wyoming Burns 160433.333333 162600.000000 163066.666667 \n", " Casper 181000.000000 182066.666667 182633.333333 \n", " Cheyenne 178766.666667 181700.000000 183266.666667 \n", " Evansville 309500.000000 307300.000000 303166.666667 \n", " Pine Bluffs 158100.000000 163900.000000 167433.333333 \n", "\n", "yearQ 2015q3 2015q4 2016q1 \\\n", "State RegionName \n", "Alabama Adamsville 73933.333333 73866.666667 74166.666667 \n", " Alabaster 150366.666667 151733.333333 153466.666667 \n", " Albertville 90233.333333 91366.666667 92000.000000 \n", " Arab 110900.000000 112233.333333 110033.333333 \n", " Ardmore 144600.000000 143966.666667 142566.666667 \n", "... ... ... ... \n", "Wyoming Burns 164600.000000 164300.000000 168266.666667 \n", " Casper 183300.000000 182700.000000 184333.333333 \n", " Cheyenne 186766.666667 190666.666667 194433.333333 \n", " Evansville 300966.666667 304200.000000 309433.333333 \n", " Pine Bluffs 167466.666667 169200.000000 166833.333333 \n", "\n", "yearQ 2016q2 2016q3 \n", "State RegionName \n", "Alabama Adamsville 74933.333333 74700.0 \n", " Alabaster 155100.000000 155850.0 \n", " Albertville 92466.666667 92200.0 \n", " Arab 110100.000000 112000.0 \n", " Ardmore 143233.333333 141950.0 \n", "... ... ... \n", "Wyoming Burns 171600.000000 170500.0 \n", " Casper 185166.666667 184350.0 \n", " Cheyenne 196500.000000 199100.0 \n", " Evansville 309200.000000 309050.0 \n", " Pine Bluffs 163800.000000 157650.0 \n", "\n", "[10730 rows x 67 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def value2quarter(value):\n", " year, month = value.split(\"-\")\n", " year = int(year)\n", " month = int(month)\n", " if month < 4:\n", " return \"{}q{}\".format(year, 1)\n", " if month < 7:\n", " return \"{}q{}\".format(year, 2)\n", " if month < 10:\n", " return \"{}q{}\".format(year, 3)\n", " return \"{}q{}\".format(year, 4)\n", "\n", "def convert_housing_data_to_quarters():\n", " df = pd.read_csv(\"City_Zhvi_AllHomes.csv\")\n", " df_t= pd.melt(df.drop([\"Metro\", \"CountyName\", \"SizeRank\"], axis=1), id_vars=[\"State\", \"RegionName\", \"RegionID\",])\n", " df_t[\"yearQ\"] = df_t[\"variable\"].apply(value2quarter)\n", " df_t = df_t[(df_t.yearQ >= \"2000q1\") & (df_t.yearQ <= \"2016q3\")]\n", " df_t[\"State\"] = df_t[\"State\"].map(states)\n", " answer = df_t.groupby(\n", " [\"State\", \"RegionName\", \"RegionID\", \"yearQ\"]\n", " )[\"value\"].mean().unstack(\"yearQ\").reset_index().drop(\"RegionID\", axis=1).set_index([\"State\", \"RegionName\"])\n", " return answer\n", "convert_housing_data_to_quarters()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exploring the effect of recession on housing prices in university and non-university towns" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "df_university = get_list_of_university_towns()\n", "df_university[\"is_univ_town\"] = True\n", "df_university = df_university.set_index([\"State\", \"RegionName\"])\n", "df_housing_data = convert_housing_data_to_quarters()\n", "df_all = df_housing_data.merge(df_university, left_index=True, right_index=True, how=\"left\")\n", "df_all[\"is_univ_town\"] = df_all[\"is_univ_town\"].fillna(False)\n", "start = \"2008q3\"# get_recession_start()\n", "bottom = \"2009q2\"# get_recession_bottom()\n", "df_all[\"ratio\"] = df_all[bottom]/df_all[start]\n", "mean_price_ratio_univ = df_all.loc[df_all.is_univ_town, \"ratio\"].dropna()\n", "mean_price_ratio_no_univ = df_all.loc[~df_all.is_univ_town, \"ratio\"].dropna()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(0.967891003753308, 0.06435067670521108)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAMT0lEQVR4nO3dbazk9VmH8esrW4JAa7F7qhValmpDJaam5KjYmlbBJoBGWmkiRJBWzCbG1rYpUfpGNhpjXxi1Jj5kReyDDSQiRuJzQyHElJIeHorQpS3WSrege2qV+hQBe/viTONy2D0zO///OTP3cn2Szc7Mmdm5f9nd6/zm4T8nVYUkqZ+vW/QAkqT5GHBJasqAS1JTBlySmjLgktTUrp28s927d9eePXt28i4lqb177rnnS1W1svnyHQ34nj17WFtb28m7lKT2kvzjkS73KRRJasqAS1JTBlySmjLgktSUAZekpgy4JDVlwCWpKQMuSU0ZcElqakePxJSm2bdvMbeVOnIHLklNGXBJasqAS1JTUwOe5IYkh5I8eNhl35jkI0k+O/n9tO0dU5K02Sw78PcDF2667Frgtqp6BXDb5LwkaQdNDXhV3Ql8edPFlwAfmJz+APDGkeeSJE0x73Pg31RVjwNMfn/x0a6YZG+StSRr6+vrc96dJGmzbX8Rs6r2V9VqVa2urDzrJwJJkuY0b8D/OclLACa/HxpvJEnSLOYN+K3AVZPTVwF/Os44kqRZzfI2whuBu4CzkxxMcjXwXuANST4LvGFyXpK0g6Z+FkpVXX6UL10w8iySpGPgkZiS1JQBl6SmDLgkNWXAJakpAy5JTRlwSWrKgEtSUwZckpoy4JLUlAGXpKYMuCQ1ZcAlqSkDLklNGXBJasqAS1JTBlySmjLgktSUAZekpgy4JDVlwCWpKQMuSU0ZcElqyoBLUlMGXJKaMuCS1JQBl6SmDLgkNWXAJakpAy5JTRlwSWpq16IH0PFn375FTyA9NwzagSd5V5KHkjyY5MYkJ401mCRpa3MHPMnpwM8Cq1X1HcAJwGVjDSZJ2trQ58B3AV+fZBdwMvDY8JEkSbOYO+BV9UXgV4FHgceBJ6rqbzZfL8neJGtJ1tbX1+efVJL0DEOeQjkNuAQ4C/gW4JQkV2y+XlXtr6rVqlpdWVmZf1JJ0jMMeQrlB4F/qKr1qnoKuAV4zThjSZKmGRLwR4HzkpycJMAFwIFxxpIkTTPkOfC7gZuBe4G/m/xZ+0eaS5I0xaADearqOuC6kWaRJB0DD6WXpKYMuCQ1ZcAlqSkDLklNGXBJasqAS1JTBlySmjLgktSUAZekpgy4JDVlwCWpKQMuSU0ZcElqyoBLUlMGXJKaMuCS1JQBl6SmDLgkNWXAJakpAy5JTRlwSWrKgEtSUwZckpoy4JLUlAGXpKYMuCQ1ZcAlqSkDLklNGXBJasqAS1JTBlySmhoU8CQvTHJzkoeTHEjyvWMNJkna2q6Bt38f8FdV9eYkJwInjzCTJGkGcwc8yQuA1wFvAaiqJ4EnxxlLkjTNkKdQXg6sA3+Q5L4k1yc5ZfOVkuxNspZkbX19fcDdSZIONyTgu4Bzgd+pqlcD/wlcu/lKVbW/qlaranVlZWXA3UmSDjck4AeBg1V19+T8zWwEXZK0A+YOeFX9E/CFJGdPLroA+NQoU0mSphr6LpS3Ax+evAPlc8Bbh48kSZrFoIBX1f3A6kizSJKOgUdiSlJTBlySmjLgktSUAZekpgy4JDVlwCWpKQMuSU0ZcElqyoBLUlMGXJKaMuCS1JQBl6SmDLgkNWXAJakpAy5JTRlwSWrKgEtSUwZckpoy4JLUlAGXpKYMuCQ1ZcAlqaldix5Ay2ffvkVPIGkW7sAlqSkDLklNGXBJasqAS1JTBlySmjLgktSUAZekpgy4JDU1OOBJTkhyX5I/G2MgSdJsxtiBvwM4MMKfI0k6BoMCnuQM4IeA68cZR5I0q6E78N8Afg746tGukGRvkrUka+vr6wPvTpL0NXMHPMkPA4eq6p6trldV+6tqtapWV1ZW5r07SdImQ3bgrwV+JMnngZuA85P84ShTSZKmmjvgVfWeqjqjqvYAlwEfraorRptMkrQl3wcuSU2N8gMdquoO4I4x/ixJ0mzcgUtSUwZckpoy4JLUlAGXpKYMuCQ1ZcAlqSkDLklNGXBJasqAS1JTBlySmjLgktSUAZekpgy4JDVlwCWpKQMuSU2N8nng0jLYt28xt5UWxR24JDVlwCWpKQMuSU0ZcElqyoBLUlMGXJKaMuCS1JQBl6SmDLgkNWXAJakpAy5JTRlwSWrKgEtSUwZckpoy4JLU1NwBT/LSJLcnOZDkoSTvGHMwSdLWhvxAh6eBd1fVvUmeD9yT5CNV9amRZpMkbWHuHXhVPV5V905O/ztwADh9rMEkSVsb5TnwJHuAVwN3H+Fre5OsJVlbX18f4+4kSYwQ8CSnAn8MvLOqvrL561W1v6pWq2p1ZWVl6N1JkiYGBTzJ89iI94er6pZxRpIkzWLIu1AC/D5woKp+bbyRJEmzGLIDfy1wJXB+kvsnvy4eaS5J0hRzv42wqv4WyIizSJKOgUdiSlJTQw7kkY4b+/Yt5rbSEO7AJakpd+DHKXeF0vHPHbgkNWXAJakpAy5JTRlwSWrKgEtSUwZckpoy4JLUlAGXpKYMuCQ1ZcAlqSkDLklNGXBJasoPs1pifiCVpK24A5ekptyBSwMNfaTkIy3Nyx24JDVlwCWpKQMuSU0ZcElqyoBLUlO+C2Wb+Q4DSdvFHbgkNeUOXGpsyCM8Hx325w5ckpoy4JLUVJunUHyoqOPVc/Hfp/+fx+EOXJKaGrQDT3Ih8D7gBOD6qnrvKFONzO/YkoZaxkcNc+/Ak5wA/BZwEXAOcHmSc8YaTJK0tSFPoXw38EhVfa6qngRuAi4ZZyxJ0jSpqvlumLwZuLCqfmpy/krge6rqbZuutxfYOzl7NvBpYDfwpXmHXiKuY/kcL2txHctnkWs5s6pWNl845DnwHOGyZ303qKr9wP5n3DBZq6rVAfe9FFzH8jle1uI6ls8yrmXIUygHgZcedv4M4LFh40iSZjUk4J8AXpHkrCQnApcBt44zliRpmrmfQqmqp5O8DfhrNt5GeENVPTTjzfdPv0oLrmP5HC9rcR3LZ+nWMveLmJKkxfJITElqyoBLUlPbFvAkFyb5dJJHklx7hK//epL7J78+k+TftmuWoWZYy8uS3J7kviQPJLl4EXNOM8M6zkxy22QNdyQ5YxFzTpPkhiSHkjx4lK8nyW9O1vlAknN3esZZzLCOVya5K8n/JLlmp+c7FjOs5ccnfxcPJPlYku/c6RlnMcM6Lpms4f4ka0m+b6dnfIaqGv0XGy9q/j3wcuBE4JPAOVtc/+1svAi6LfNs91rYeHHjpyenzwE+v+i551zHHwFXTU6fD3xo0XMfZS2vA84FHjzK1y8G/pKNYxXOA+5e9MxzruPFwHcBvwxcs+h5B67lNcBpk9MXNf47OZX/f+3wVcDDi5x3u3bgx3qY/eXAjds0y1CzrKWAF0xOfwPL+X74WdZxDnDb5PTtR/j6UqiqO4Evb3GVS4AP1oaPAy9M8pKdmW5209ZRVYeq6hPAUzs31XxmWMvHqupfJ2c/zsZxI0tnhnX8R03qDZzCEQ5e3EnbFfDTgS8cdv7g5LJnSXImcBbw0W2aZahZ1rIPuCLJQeAv2HhEsWxmWccngUsnp98EPD/Ji3ZgtrHN/O9PC3E1G4+QWkrypiQPA38O/OQiZ9mugM90mP3EZcDNVfW/2zTLULOs5XLg/VV1BhsP3z+UZNleIJ5lHdcAr09yH/B64IvA09s92DY4ln9/2kFJfoCNgP/8omeZV1X9SVW9Engj8EuLnGW7fiLPsRxmfxnwM9s0xxhmWcvVwIUAVXVXkpPY+OCbQzsy4WymrqOqHgN+FCDJqcClVfXEjk04Hj/mYQkleRVwPXBRVf3LoucZqqruTPKtSXZX1UI+5Gq7dokzHWaf5GzgNOCubZpjDLOs5VHgAoAk3w6cBKzv6JTTTV1Hkt2HPXJ4D3DDDs84lluBn5i8G+U84ImqenzRQz2XJXkZcAtwZVV9ZtHzzCvJtyXJ5PS5bLwhYGHfjLZlB15HOcw+yS8Ca1X1tXBcDtx02IsCS2fGtbwb+L0k72Ljofpblm1NM67j+4FfSVLAnSzpI6MkN7Ix6+7J6w7XAc8DqKrfZeN1iIuBR4D/At66mEm3Nm0dSb4ZWGPjBfKvJnknG+8c+sqCRj6qGf5OfgF4EfDbk/49XUv2yX4w0zouZWNz8BTw38CPLfL/uofSS1JTy/ZCmyRpRgZckpoy4JLUlAGXpKYMuCQ1ZcAlqSkDLklN/R/Am6NWpaBm4QAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "n1, bins1, patches1 = plt.hist(mean_price_ratio_univ, 20, density=True, facecolor='blue', alpha=0.5)\n", "(mu_unitown, sigma_unitown) = norm.fit(mean_price_ratio_univ)\n", "mu_unitown,sigma_unitown\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(0.9559364550248012, 0.07270502006329081)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD4CAYAAADFAawfAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAMPklEQVR4nO3df4zkdX3H8dfLA4JalcYbjeE8VxMkNU0VMlIriSlQzXFtNCbEQKqNhrhJjUSNqT/+aRf/8i+jTWzN9kqtrUJaBEOMv0iUEKtHnQMkcAcNUqon2hv8TU3EO1/9Y2Zx2Ztlvpub73zfN/N8JBt2b7679+aT22c++93vd8ZJBACo62ldDwAAeGqEGgCKI9QAUByhBoDiCDUAFHdGG1909+7dWVlZaeNLA8BCOnTo0KNJepMeayXUKysrGgwGbXxpAFhItv9nu8c49QEAxRFqACiOUANAcYQaAIoj1ABQHKEGgOIINQAUR6gBoDhCDQDFtXJnIpbA2lq3nw8sEXbUAFAcoQaA4gg1ABRHqAGgOEINAMURagAojlADQHGEGgCKmxpq2+fbvnvT289tv3sewwEAGtyZmOQBSa+QJNu7JH1f0s0tzwUAGNvpqY/LJH0nybYvwggAmK2dhvpKSddPesD2qu2B7cFwODz1yQAAknYQattnSXq9pH+f9HiS9ST9JP1erzer+QBg6e1kR325pDuT/G9bwwAATraTUF+lbU57AADa0yjUtp8h6bWSbmp3HADAVo1eOCDJLyU9t+VZAAATcGciABRHqAGgOEINAMURagAojlADQHGEGgCKI9QAUByhBoDiCDUAFEeoAaA4Qg0AxRFqACiOUANAcYQaAIoj1ABQHKEGgOIINQAUR6gBoDhCDQDFNX1x23Ns32j7fttHbP9R24MBAEYavbitpI9J+lKSK2yfJekZLc4EANhkaqhtP1vSayS9VZKSPC7p8XbHAgBsaHLq4yWShpL+yfZdtg/YfubWg2yv2h7YHgyHw5kPCgDLqkmoz5B0oaS/T3KBpP+T9IGtByVZT9JP0u/1ejMeEwCWV5NQH5V0NMkd449v1CjcAIA5mBrqJD+U9D3b54//6DJJh1udCgDwhKZXfVwj6dPjKz4ekvS29kYCAGzWKNRJ7pbUb3kWAMAE3JkIAMURagAojlADQHGEGgCKI9QAUByhBoDiCDUAFEeoAaA4Qg0AxRFqACiu6XN9ALO1ttbN5wKnIXbUAFAcoQaA4gg1ABRHqAGgOEINAMURagAojlADQHGNrqO2/bCkX0g6Iel4El6WCwDmZCc3vFyS5NHWJgEATMSpDwAormmoI+krtg/ZXm1zIADAkzU99XFxkkdsP0/SrbbvT3L75gPGAV+VpL179854TABYXo121EkeGf/3mKSbJV004Zj1JP0k/V6vN9spAWCJTQ217WfaftbG+5JeJ+netgcDAIw0OfXxfEk32944/jNJvtTqVACAJ0wNdZKHJL18DrMAACbg8jwAKI5QA0BxhBoAiiPUAFAcoQaA4gg1ABRHqAGgOEINAMURagAojlADQHGEGgCKI9QAUByhBoDiCDUAFEeoAaA4Qg0AxRFqACiOUANAcYQaAIprHGrbu2zfZfvzbQ4EAHiyneyo3yXpSFuDAAAmaxRq23sk/amkA+2OAwDYqumO+qOS3ifpN9sdYHvV9sD2YDgczmQ4AECDUNv+M0nHkhx6quOSrCfpJ+n3er2ZDQgAy67JjvpiSa+3/bCkGyRdavtfW50KAPCEqaFO8sEke5KsSLpS0leTvLn1yQAAkriOGgDKO2MnBye5TdJtrUwCAJiIHTUAFEeoAaA4Qg0AxRFqACiOUANAcYQaAIoj1ABQHKEGgOIINQAUR6gBoDhCDQDFEWoAKI5QA0BxO3r2PCyYtbWuJwDQADtqACiOUANAcYQaAIoj1ABQHKEGgOKmhtr22bb/0/a3bd9n+9p5DAYAGGlyed6vJF2a5DHbZ0r6uu0vJjnY8mwAADUIdZJIemz84Znjt7Q5FADgtxqdo7a9y/bdko5JujXJHROOWbU9sD0YDoeznhMAllajUCc5keQVkvZIusj27084Zj1JP0m/1+vNek4AWFo7uuojyU8l3SZpXyvTAABO0uSqj57tc8bvP13Sn0i6v+3BAAAjTa76eIGkf7a9S6Ow/1uSz7c7FgBgQ5OrPu6RdMEcZgEATMCdiQBQHKEGgOIINQAUR6gBoDhCDQDFEWoAKI5QA0BxhBoAiiPUAFAcoQaA4gg1ABRHqAGgOEINAMURagAojlADQHGEGgCKI9QAUByhBoDiCDUAFNfkVchfaPtrto/Yvs/2u+YxGABgpMmrkB+X9N4kd9p+lqRDtm9Ncrjl2QAAarCjTvKDJHeO3/+FpCOSzm17MADAyI7OUdtekXSBpDsmPLZqe2B7MBwOZzMdAKB5qG3/jqTPSnp3kp9vfTzJepJ+kn6v15vljACw1BqF2vaZGkX600luanckAMBmU3+ZaNuS/lHSkSQfaX8kYIq1tW4+F+hIkx31xZLeIulS23eP3/a3PBcAYGzqjjrJ1yV5DrMAACbgzkQAKI5QA0BxhBoAiiPUAFAcoQaA4gg1ABRHqAGgOEINAMURagAojlADQHGEGgCKI9QAUByhBoDiCDUAFEeoAaA4Qg0AxU194QAUx0tLAQuPHTUAFEeoAaC4qaG2fZ3tY7bvncdAAIAna7Kj/qSkfS3PAQDYxtRQJ7ld0o/nMAsAYIKZnaO2vWp7YHswHA5n9WUBYOnNLNRJ1pP0k/R7vd6sviwALD2u+gCA4gg1ABTX5PK86yV9U9L5to/avrr9sQAAG6beQp7kqnkMAgCYjFMfAFAcT8qE5XIqT2LFE2ChI+yoAaA4Qg0AxRFqACiOUANAcYQaAIrjqo8KuJoAwFNgRw0AxRFqACiOUANAcYQaAIoj1ABQHFd9AE3xPCHoCDtqACiOUANAcZz6mAV+rAXQInbUAFAcO+oN7IoBFNUo1Lb3SfqYpF2SDiT5cKtTAYvmVDcCbCSW2tRQ294l6eOSXivpqKRv2b4lyeG2h9sx/jEDWEBNdtQXSXowyUOSZPsGSW+Q1E6oiS1wstPx++J0nLmoJqE+V9L3Nn18VNIfbj3I9qqk1fGHj9l+4BTm2i3p0VP4/EXAGoywDqfrGlx77Sy/2um5Bjvzou0eaBJqT/iznPQHybqk9R0Mtf1faA+S9GfxtU5XrMEI68AaSKxBk8vzjkp64aaP90h6pJ1xAABbNQn1tySdZ/vFts+SdKWkW9odCwCwYeqpjyTHbb9T0pc1ujzvuiT3tTzXTE6hnOZYgxHWgTWQlnwNnJx0uhkAUAi3kANAcYQaAIrrLNS299l+wPaDtj+wzTFvsn3Y9n22PzPvGedh2jrY3mv7a7bvsn2P7f1dzNkm29fZPmb73m0et+2/Ha/RPbYvnPeMbWuwBn8+/n+/x/Y3bL983jPOw7R12HTcK22fsH3FvGbrVJK5v2n0S8nvSHqJpLMkfVvSy7Ycc56kuyT97vjj53Uxa4F1WJf0l+P3Xybp4a7nbmEdXiPpQkn3bvP4fklf1Oia/ldJuqPrmTtYg1dv+l64fBHXoMk6jI/ZJemrkr4g6YquZ57HW1c76iduS0/yuKSN29I3e7ukjyf5iSQlOTbnGeehyTpE0rPH7z9HC3gNe5LbJf34KQ55g6RPZeSgpHNsv2A+083HtDVI8o2N7wVJBzW6n2HhNPi3IEnXSPqspEVswkRdhXrSbennbjnmpZJeavs/bB8cP4PfommyDmuS3mz7qEY7iGvmM1opTdZpmVyt0U8YS8f2uZLeKOkTXc8yT12Fuslt6WdodPrjjyVdJemA7XNanmvemqzDVZI+mWSPRqcA/sX2sv0SuNHTGCwD25doFOr3dz1LRz4q6f1JTnQ9yDx19cIBTW5LPyrpYJJfS/rv8ZM8nafRnZKLosk6XC1pnyQl+abtszV6gpql+bFPPI2BJMn2H0g6IOnyJD/qep6O9CXdYFsafR/st308yee6HatdXe3MmtyW/jlJl0iS7d0anQp5aK5Ttq/JOnxX0mWSZPv3JJ0taTjXKbt3i6S/GF/98SpJP0vyg66HmifbeyXdJOktSf6r63m6kuTFSVaSrEi6UdI7Fj3SUkc76mxzW7rtD0kaJLll/NjrbB+WdELSXy3aLqLhOrxX0j/Yfo9GP+6/NeNffS8K29drdIpr9/hc/N9IOlOSknxCo3Pz+yU9KOmXkt7WzaTtabAGfy3puZL+brybPJ4FfDa5BuuwlLiFHACKW7ZfSgHAaYdQA0BxhBoAiiPUAFAcoQaA4gg1ABRHqAGguP8HQ1Cg3SiYQkwAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "n2, bins2, patches2 = plt.hist(mean_price_ratio_no_univ, 20, density=True, facecolor='red', alpha=0.5)\n", "(mu_non_unitown, sigma_non_unitown) = norm.fit(mean_price_ratio_no_univ)\n", "mu_non_unitown,sigma_non_unitown" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Run t-test (Welch's t-test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The variance of both samples are different hence use Welch's t-test (https://en.wikipedia.org/wiki/Student%27s_t-test#Independent_two-sample_t-test)\n", "\n", "" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(True, 0.003689592222498831, 'university town')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def run_ttest():\n", "\n", " statistic, pval = ttest_ind(\n", " mean_price_ratio_univ, \n", " mean_price_ratio_no_univ, equal_var=False)\n", " different = pval < 0.01\n", " better = \"university town\" if mean_price_ratio_univ.mean() > mean_price_ratio_no_univ.mean() else \"non-university town\"\n", " return (different, pval, better)\n", "run_ttest()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conclusion: The university town housing prices are less affected by recession in 2008." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Few points for further exploration:\n", "### Is it true for all recession period?\n", "### Is it true for all kinds of housing (condo, apartements, houses etc.)?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }