{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## Data Set Construction\n",
"\n",
"**Functions**\n",
"\n",
"`pd.read_csv`, `pd.read_excel`, `np.diff` or `DataFrame.diff`, `DataFrame.resample`\n",
"\n",
"### Exercise 1\n",
"\n",
"1. Download all available daily data for the S&P 500 and the Hang Seng Index from Yahoo! Finance. \n",
"2. Import both data sets into Python. The final dataset should have a `DateTimeIndex`, and the date\n",
" column should not be part of the `DataFrame`.\n",
"3. Construct weekly price series from each, using Tuesday prices (less likely to be a holiday).\n",
"4. Construct monthly price series from each using last day in the month.\n",
"5. Save the data to the HDF file \"equity-indices.h5\".\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Open | \n",
" High | \n",
" Low | \n",
" Close | \n",
" Adj Close | \n",
" Volume | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-09-17 | \n",
" 2995.669922 | \n",
" 3006.209961 | \n",
" 2993.729980 | \n",
" 3005.699951 | \n",
" 3005.699951 | \n",
" 3671840000 | \n",
"
\n",
" \n",
" 2019-09-18 | \n",
" 3001.500000 | \n",
" 3007.830078 | \n",
" 2978.570068 | \n",
" 3006.729980 | \n",
" 3006.729980 | \n",
" 3435540000 | \n",
"
\n",
" \n",
" 2019-09-19 | \n",
" 3010.360107 | \n",
" 3021.989990 | \n",
" 3003.159912 | \n",
" 3006.790039 | \n",
" 3006.790039 | \n",
" 3251290000 | \n",
"
\n",
" \n",
" 2019-09-20 | \n",
" 3008.419922 | \n",
" 3016.370117 | \n",
" 2984.679932 | \n",
" 2992.070068 | \n",
" 2992.070068 | \n",
" 6094740000 | \n",
"
\n",
" \n",
" 2019-09-23 | \n",
" 2983.500000 | \n",
" 2999.149902 | \n",
" 2982.229980 | \n",
" 2991.780029 | \n",
" 2991.780029 | \n",
" 3186590000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Open High Low Close Adj Close \\\n",
"Date \n",
"2019-09-17 2995.669922 3006.209961 2993.729980 3005.699951 3005.699951 \n",
"2019-09-18 3001.500000 3007.830078 2978.570068 3006.729980 3006.729980 \n",
"2019-09-19 3010.360107 3021.989990 3003.159912 3006.790039 3006.790039 \n",
"2019-09-20 3008.419922 3016.370117 2984.679932 2992.070068 2992.070068 \n",
"2019-09-23 2983.500000 2999.149902 2982.229980 2991.780029 2991.780029 \n",
"\n",
" Volume \n",
"Date \n",
"2019-09-17 3671840000 \n",
"2019-09-18 3435540000 \n",
"2019-09-19 3251290000 \n",
"2019-09-20 6094740000 \n",
"2019-09-23 3186590000 "
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"\n",
"sp500 = pd.read_csv(\"data/GSPC.csv\", parse_dates=True, index_col=\"Date\")\n",
"hsi = pd.read_csv(\"data/HSI.csv\", parse_dates=True, index_col=\"Date\")\n",
"\n",
"weekly_sp500 = sp500.resample(\"W-TUE\").last()\n",
"weekly_hsi = hsi.resample(\"W-TUE\").last()\n",
"\n",
"monthly_sp500 = sp500.resample(\"M\").last()\n",
"monthly_hsi = hsi.resample(\"M\").last()\n",
"\n",
"h5file = pd.HDFStore(\"data/equity-indices.h5\", mode=\"w\")\n",
"h5file.append(\"sp500\", sp500)\n",
"h5file.append(\"weekly_sp500\", weekly_sp500)\n",
"h5file.append(\"monthly_sp500\", monthly_sp500)\n",
"h5file.append(\"hsi\", sp500)\n",
"h5file.append(\"weekly_hsi\", weekly_hsi)\n",
"h5file.append(\"monthly_hsi\", monthly_hsi)\n",
"h5file.close()\n",
"\n",
"sp500.tail()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Open | \n",
" High | \n",
" Low | \n",
" Close | \n",
" Adj Close | \n",
" Volume | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-08-27 | \n",
" 2893.139893 | \n",
" 2898.790039 | \n",
" 2860.590088 | \n",
" 2869.159912 | \n",
" 2869.159912 | \n",
" 3533630000 | \n",
"
\n",
" \n",
" 2019-09-03 | \n",
" 2909.010010 | \n",
" 2914.389893 | \n",
" 2891.850098 | \n",
" 2906.270020 | \n",
" 2906.270020 | \n",
" 3426790000 | \n",
"
\n",
" \n",
" 2019-09-10 | \n",
" 2971.010010 | \n",
" 2979.389893 | \n",
" 2957.010010 | \n",
" 2979.389893 | \n",
" 2979.389893 | \n",
" 4390770000 | \n",
"
\n",
" \n",
" 2019-09-17 | \n",
" 2995.669922 | \n",
" 3006.209961 | \n",
" 2993.729980 | \n",
" 3005.699951 | \n",
" 3005.699951 | \n",
" 3671840000 | \n",
"
\n",
" \n",
" 2019-09-24 | \n",
" 2983.500000 | \n",
" 2999.149902 | \n",
" 2982.229980 | \n",
" 2991.780029 | \n",
" 2991.780029 | \n",
" 3186590000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Open High Low Close Adj Close \\\n",
"Date \n",
"2019-08-27 2893.139893 2898.790039 2860.590088 2869.159912 2869.159912 \n",
"2019-09-03 2909.010010 2914.389893 2891.850098 2906.270020 2906.270020 \n",
"2019-09-10 2971.010010 2979.389893 2957.010010 2979.389893 2979.389893 \n",
"2019-09-17 2995.669922 3006.209961 2993.729980 3005.699951 3005.699951 \n",
"2019-09-24 2983.500000 2999.149902 2982.229980 2991.780029 2991.780029 \n",
"\n",
" Volume \n",
"Date \n",
"2019-08-27 3533630000 \n",
"2019-09-03 3426790000 \n",
"2019-09-10 4390770000 \n",
"2019-09-17 3671840000 \n",
"2019-09-24 3186590000 "
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weekly_sp500.tail()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Open | \n",
" High | \n",
" Low | \n",
" Close | \n",
" Adj Close | \n",
" Volume | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-05-31 | \n",
" 2766.149902 | \n",
" 2768.979980 | \n",
" 2750.520020 | \n",
" 2752.060059 | \n",
" 2752.060059 | \n",
" 3981020000 | \n",
"
\n",
" \n",
" 2019-06-30 | \n",
" 2932.939941 | \n",
" 2943.979980 | \n",
" 2929.050049 | \n",
" 2941.760010 | \n",
" 2941.760010 | \n",
" 5420700000 | \n",
"
\n",
" \n",
" 2019-07-31 | \n",
" 3016.219971 | \n",
" 3017.399902 | \n",
" 2958.080078 | \n",
" 2980.379883 | \n",
" 2980.379883 | \n",
" 4623430000 | \n",
"
\n",
" \n",
" 2019-08-31 | \n",
" 2937.090088 | \n",
" 2940.429932 | \n",
" 2913.320068 | \n",
" 2926.459961 | \n",
" 2926.459961 | \n",
" 3008450000 | \n",
"
\n",
" \n",
" 2019-09-30 | \n",
" 2983.500000 | \n",
" 2999.149902 | \n",
" 2982.229980 | \n",
" 2991.780029 | \n",
" 2991.780029 | \n",
" 3186590000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Open High Low Close Adj Close \\\n",
"Date \n",
"2019-05-31 2766.149902 2768.979980 2750.520020 2752.060059 2752.060059 \n",
"2019-06-30 2932.939941 2943.979980 2929.050049 2941.760010 2941.760010 \n",
"2019-07-31 3016.219971 3017.399902 2958.080078 2980.379883 2980.379883 \n",
"2019-08-31 2937.090088 2940.429932 2913.320068 2926.459961 2926.459961 \n",
"2019-09-30 2983.500000 2999.149902 2982.229980 2991.780029 2991.780029 \n",
"\n",
" Volume \n",
"Date \n",
"2019-05-31 3981020000 \n",
"2019-06-30 5420700000 \n",
"2019-07-31 4623430000 \n",
"2019-08-31 3008450000 \n",
"2019-09-30 3186590000 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"monthly_sp500.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 2\n",
"\n",
"Write a function that will correctly aggregate to weekly or monthly respecting the\n",
"aggregation rules\n",
"\n",
"* High: `max`\n",
"* Low: `min`\n",
"* Volume: `sum`\n",
"\n",
"The signature should be:\n",
"\n",
"```python\n",
"def yahoo_agg(data, freq):\n",
" \n",
" return resampled_data\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" High | \n",
" Low | \n",
" Volume | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-05-31 | \n",
" 2768.979980 | \n",
" 2750.520020 | \n",
" 3981020000 | \n",
"
\n",
" \n",
" 2019-06-30 | \n",
" 2943.979980 | \n",
" 2929.050049 | \n",
" 5420700000 | \n",
"
\n",
" \n",
" 2019-07-31 | \n",
" 3017.399902 | \n",
" 2958.080078 | \n",
" 4623430000 | \n",
"
\n",
" \n",
" 2019-08-31 | \n",
" 2940.429932 | \n",
" 2913.320068 | \n",
" 3008450000 | \n",
"
\n",
" \n",
" 2019-09-30 | \n",
" 2999.149902 | \n",
" 2982.229980 | \n",
" 3186590000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" High Low Volume\n",
"Date \n",
"2019-05-31 2768.979980 2750.520020 3981020000\n",
"2019-06-30 2943.979980 2929.050049 5420700000\n",
"2019-07-31 3017.399902 2958.080078 4623430000\n",
"2019-08-31 2940.429932 2913.320068 3008450000\n",
"2019-09-30 2999.149902 2982.229980 3186590000"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def yahoo_agg(data, freq):\n",
" resampler = data.resample(freq)\n",
"\n",
" high = resampler.High.max()\n",
" low = resampler.Low.min()\n",
" vol = resampler.Volume.sum()\n",
" # Start with last for all columns\n",
" resampled_data = resampler.last()\n",
" # Insert columns that use a different rule\n",
" resampled_data[\"High\"] = high\n",
" resampled_data[\"Low\"] = low\n",
" resampled_data[\"Volume\"] = vol\n",
"\n",
" return resampled_data\n",
"\n",
"\n",
"better_monthly_sp500 = yahoo_agg(sp500, \"M\")\n",
"\n",
"monthly_sp500[[\"High\", \"Low\", \"Volume\"]].tail()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" High | \n",
" Low | \n",
" Volume | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-05-31 | \n",
" 2954.129883 | \n",
" 2750.520020 | \n",
" 76860120000 | \n",
"
\n",
" \n",
" 2019-06-30 | \n",
" 2964.149902 | \n",
" 2728.810059 | \n",
" 70881390000 | \n",
"
\n",
" \n",
" 2019-07-31 | \n",
" 3027.979980 | \n",
" 2952.219971 | \n",
" 70349470000 | \n",
"
\n",
" \n",
" 2019-08-31 | \n",
" 3013.590088 | \n",
" 2822.120117 | \n",
" 79599440000 | \n",
"
\n",
" \n",
" 2019-09-30 | \n",
" 3021.989990 | \n",
" 2891.850098 | \n",
" 57236800000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" High Low Volume\n",
"Date \n",
"2019-05-31 2954.129883 2750.520020 76860120000\n",
"2019-06-30 2964.149902 2728.810059 70881390000\n",
"2019-07-31 3027.979980 2952.219971 70349470000\n",
"2019-08-31 3013.590088 2822.120117 79599440000\n",
"2019-09-30 3021.989990 2891.850098 57236800000"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"better_monthly_sp500[[\"High\", \"Low\", \"Volume\"]].tail()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 3\n",
"\n",
"1. Import the Fama-French benchmark portfolios as well as the 25 sorted portfolios at both the\n",
" monthly and daily horizon from [Ken French\"s Data Library](http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html).\n",
" **Note** It is much easier to clean to data file before importing than to find the precise\n",
" command that will load the unmodified data.\n",
"2. Import daily FX rate data for USD against AUD, Euro, JPY and GBP from the [Federal Reserve Economic Database (FRED)](http://research.stlouisfed.org/fred2/categories/94). Use Excel (xls) rather than csv files.\n",
"3. Save the data to the HDF files \"fama-french.h5\" and \"fx.h5\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" DEXJPUS DEXUSAL DEXUSEU DEXUSUK\n",
"observation_date \n",
"2019-09-16 107.94 0.6865 1.1006 1.2423\n",
"2019-09-17 108.12 0.6858 1.1063 1.2493\n",
"2019-09-18 108.17 0.6844 1.1061 1.2478\n",
"2019-09-19 108.00 0.6795 1.1046 1.2481\n",
"2019-09-20 107.95 0.6770 1.1004 1.2489\n"
]
}
],
"source": [
"yen_dollar = pd.read_excel(\n",
" \"data/DEXJPUS.xls\", index_col=\"observation_date\", skiprows=10\n",
")\n",
"dollar_aud = pd.read_excel(\n",
" \"data/DEXUSAL.xls\", index_col=\"observation_date\", skiprows=10\n",
")\n",
"dollar_euro = pd.read_excel(\n",
" \"data/DEXUSEU.xls\", index_col=\"observation_date\", skiprows=10\n",
")\n",
"dollar_pound = pd.read_excel(\n",
" \"data/DEXUSUK.xls\", index_col=\"observation_date\", skiprows=10\n",
")\n",
"\n",
"fx = pd.concat([yen_dollar, dollar_aud, dollar_euro, dollar_pound], axis=1)\n",
"print(fx.tail())\n",
"fx.to_hdf(\"data/fx.h5\", \"fx\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Mkt-RF | \n",
" SMB | \n",
" HML | \n",
" RF | \n",
" SMALL LoBM | \n",
" ME1 BM2 | \n",
" ME1 BM3 | \n",
" ME1 BM4 | \n",
" SMALL HiBM | \n",
" ME2 BM1 | \n",
" ... | \n",
" ME4 BM1 | \n",
" ME4 BM2 | \n",
" ME4 BM3 | \n",
" ME4 BM4 | \n",
" ME4 BM5 | \n",
" BIG LoBM | \n",
" ME5 BM2 | \n",
" ME5 BM3 | \n",
" ME5 BM4 | \n",
" BIG HiBM | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-03-01 | \n",
" 1.10 | \n",
" -3.13 | \n",
" -4.07 | \n",
" 0.19 | \n",
" 0.8891 | \n",
" -0.4743 | \n",
" -2.5100 | \n",
" -4.9485 | \n",
" -3.4235 | \n",
" -0.2319 | \n",
" ... | \n",
" 1.8237 | \n",
" 1.0998 | \n",
" -0.9002 | \n",
" -0.5395 | \n",
" -2.8702 | \n",
" 3.3344 | \n",
" 1.9283 | \n",
" -0.1602 | \n",
" -0.5929 | \n",
" -3.0620 | \n",
"
\n",
" \n",
" 2019-04-01 | \n",
" 3.96 | \n",
" -1.68 | \n",
" 1.93 | \n",
" 0.21 | \n",
" 0.6068 | \n",
" 1.6518 | \n",
" 2.1840 | \n",
" 3.4912 | \n",
" 2.0865 | \n",
" 2.8361 | \n",
" ... | \n",
" 3.1410 | \n",
" 4.2046 | \n",
" 4.3501 | \n",
" 5.0762 | \n",
" 4.3419 | \n",
" 4.6039 | \n",
" 2.9807 | \n",
" 3.6429 | \n",
" 5.1341 | \n",
" 9.3863 | \n",
"
\n",
" \n",
" 2019-05-01 | \n",
" -6.94 | \n",
" -1.20 | \n",
" -2.39 | \n",
" 0.21 | \n",
" -5.5586 | \n",
" -5.3126 | \n",
" -7.7829 | \n",
" -8.8025 | \n",
" -7.9767 | \n",
" -8.9190 | \n",
" ... | \n",
" -5.7334 | \n",
" -7.1667 | \n",
" -7.7153 | \n",
" -9.1637 | \n",
" -7.9580 | \n",
" -6.9663 | \n",
" -5.7641 | \n",
" -5.0654 | \n",
" -8.1314 | \n",
" -8.7586 | \n",
"
\n",
" \n",
" 2019-06-01 | \n",
" 6.93 | \n",
" 0.33 | \n",
" -1.08 | \n",
" 0.18 | \n",
" 7.6353 | \n",
" 6.2398 | \n",
" 4.7677 | \n",
" 6.7187 | \n",
" 4.7008 | \n",
" 8.9242 | \n",
" ... | \n",
" 7.2875 | \n",
" 9.4043 | \n",
" 9.5103 | \n",
" 7.5115 | \n",
" 6.9246 | \n",
" 7.7424 | \n",
" 5.8872 | \n",
" 5.9640 | \n",
" 6.6670 | \n",
" 7.9347 | \n",
"
\n",
" \n",
" 2019-07-01 | \n",
" 1.19 | \n",
" -2.06 | \n",
" 0.12 | \n",
" 0.19 | \n",
" -3.8504 | \n",
" -1.1600 | \n",
" -0.8926 | \n",
" -1.6496 | \n",
" -2.5149 | \n",
" -3.1247 | \n",
" ... | \n",
" 1.6768 | \n",
" 1.0410 | \n",
" 1.9332 | \n",
" -0.6625 | \n",
" 0.9940 | \n",
" 1.6565 | \n",
" 1.9146 | \n",
" 1.5096 | \n",
" 0.9860 | \n",
" 2.1159 | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 29 columns
\n",
"
"
],
"text/plain": [
" Mkt-RF SMB HML RF SMALL LoBM ME1 BM2 ME1 BM3 ME1 BM4 \\\n",
"2019-03-01 1.10 -3.13 -4.07 0.19 0.8891 -0.4743 -2.5100 -4.9485 \n",
"2019-04-01 3.96 -1.68 1.93 0.21 0.6068 1.6518 2.1840 3.4912 \n",
"2019-05-01 -6.94 -1.20 -2.39 0.21 -5.5586 -5.3126 -7.7829 -8.8025 \n",
"2019-06-01 6.93 0.33 -1.08 0.18 7.6353 6.2398 4.7677 6.7187 \n",
"2019-07-01 1.19 -2.06 0.12 0.19 -3.8504 -1.1600 -0.8926 -1.6496 \n",
"\n",
" SMALL HiBM ME2 BM1 ... ME4 BM1 ME4 BM2 ME4 BM3 ME4 BM4 \\\n",
"2019-03-01 -3.4235 -0.2319 ... 1.8237 1.0998 -0.9002 -0.5395 \n",
"2019-04-01 2.0865 2.8361 ... 3.1410 4.2046 4.3501 5.0762 \n",
"2019-05-01 -7.9767 -8.9190 ... -5.7334 -7.1667 -7.7153 -9.1637 \n",
"2019-06-01 4.7008 8.9242 ... 7.2875 9.4043 9.5103 7.5115 \n",
"2019-07-01 -2.5149 -3.1247 ... 1.6768 1.0410 1.9332 -0.6625 \n",
"\n",
" ME4 BM5 BIG LoBM ME5 BM2 ME5 BM3 ME5 BM4 BIG HiBM \n",
"2019-03-01 -2.8702 3.3344 1.9283 -0.1602 -0.5929 -3.0620 \n",
"2019-04-01 4.3419 4.6039 2.9807 3.6429 5.1341 9.3863 \n",
"2019-05-01 -7.9580 -6.9663 -5.7641 -5.0654 -8.1314 -8.7586 \n",
"2019-06-01 6.9246 7.7424 5.8872 5.9640 6.6670 7.9347 \n",
"2019-07-01 0.9940 1.6565 1.9146 1.5096 0.9860 2.1159 \n",
"\n",
"[5 rows x 29 columns]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# These files have all been cleaned to have only the data and headers\n",
"ff_5x5 = pd.read_csv(\"data/25_Portfolios_5x5.CSV\", index_col=0)\n",
"ff_factors = pd.read_csv(\"data/F-F_Research_Data_Factors.CSV\", index_col=0)\n",
"ff = pd.concat([ff_factors, ff_5x5], axis=1)\n",
"\n",
"dates = []\n",
"for value in ff.index:\n",
" # Values are YYYYMM\n",
" year = value // 100\n",
" month = value % 100\n",
" dates.append(pd.Timestamp(year=year, month=month, day=1))\n",
"ff.index = dates\n",
"ff.tail()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Mkt-RF | \n",
" SMB | \n",
" HML | \n",
" RF | \n",
" SMALL LoBM | \n",
" ME1 BM2 | \n",
" ME1 BM3 | \n",
" ME1 BM4 | \n",
" SMALL HiBM | \n",
" ME2 BM1 | \n",
" ... | \n",
" ME4 BM1 | \n",
" ME4 BM2 | \n",
" ME4 BM3 | \n",
" ME4 BM4 | \n",
" ME4 BM5 | \n",
" BIG LoBM | \n",
" ME5 BM2 | \n",
" ME5 BM3 | \n",
" ME5 BM4 | \n",
" BIG HiBM | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-03-31 | \n",
" 1.10 | \n",
" -3.13 | \n",
" -4.07 | \n",
" 0.19 | \n",
" 0.8891 | \n",
" -0.4743 | \n",
" -2.5100 | \n",
" -4.9485 | \n",
" -3.4235 | \n",
" -0.2319 | \n",
" ... | \n",
" 1.8237 | \n",
" 1.0998 | \n",
" -0.9002 | \n",
" -0.5395 | \n",
" -2.8702 | \n",
" 3.3344 | \n",
" 1.9283 | \n",
" -0.1602 | \n",
" -0.5929 | \n",
" -3.0620 | \n",
"
\n",
" \n",
" 2019-04-30 | \n",
" 3.96 | \n",
" -1.68 | \n",
" 1.93 | \n",
" 0.21 | \n",
" 0.6068 | \n",
" 1.6518 | \n",
" 2.1840 | \n",
" 3.4912 | \n",
" 2.0865 | \n",
" 2.8361 | \n",
" ... | \n",
" 3.1410 | \n",
" 4.2046 | \n",
" 4.3501 | \n",
" 5.0762 | \n",
" 4.3419 | \n",
" 4.6039 | \n",
" 2.9807 | \n",
" 3.6429 | \n",
" 5.1341 | \n",
" 9.3863 | \n",
"
\n",
" \n",
" 2019-05-31 | \n",
" -6.94 | \n",
" -1.20 | \n",
" -2.39 | \n",
" 0.21 | \n",
" -5.5586 | \n",
" -5.3126 | \n",
" -7.7829 | \n",
" -8.8025 | \n",
" -7.9767 | \n",
" -8.9190 | \n",
" ... | \n",
" -5.7334 | \n",
" -7.1667 | \n",
" -7.7153 | \n",
" -9.1637 | \n",
" -7.9580 | \n",
" -6.9663 | \n",
" -5.7641 | \n",
" -5.0654 | \n",
" -8.1314 | \n",
" -8.7586 | \n",
"
\n",
" \n",
" 2019-06-30 | \n",
" 6.93 | \n",
" 0.33 | \n",
" -1.08 | \n",
" 0.18 | \n",
" 7.6353 | \n",
" 6.2398 | \n",
" 4.7677 | \n",
" 6.7187 | \n",
" 4.7008 | \n",
" 8.9242 | \n",
" ... | \n",
" 7.2875 | \n",
" 9.4043 | \n",
" 9.5103 | \n",
" 7.5115 | \n",
" 6.9246 | \n",
" 7.7424 | \n",
" 5.8872 | \n",
" 5.9640 | \n",
" 6.6670 | \n",
" 7.9347 | \n",
"
\n",
" \n",
" 2019-07-31 | \n",
" 1.19 | \n",
" -2.06 | \n",
" 0.12 | \n",
" 0.19 | \n",
" -3.8504 | \n",
" -1.1600 | \n",
" -0.8926 | \n",
" -1.6496 | \n",
" -2.5149 | \n",
" -3.1247 | \n",
" ... | \n",
" 1.6768 | \n",
" 1.0410 | \n",
" 1.9332 | \n",
" -0.6625 | \n",
" 0.9940 | \n",
" 1.6565 | \n",
" 1.9146 | \n",
" 1.5096 | \n",
" 0.9860 | \n",
" 2.1159 | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 29 columns
\n",
"
"
],
"text/plain": [
" Mkt-RF SMB HML RF SMALL LoBM ME1 BM2 ME1 BM3 ME1 BM4 \\\n",
"2019-03-31 1.10 -3.13 -4.07 0.19 0.8891 -0.4743 -2.5100 -4.9485 \n",
"2019-04-30 3.96 -1.68 1.93 0.21 0.6068 1.6518 2.1840 3.4912 \n",
"2019-05-31 -6.94 -1.20 -2.39 0.21 -5.5586 -5.3126 -7.7829 -8.8025 \n",
"2019-06-30 6.93 0.33 -1.08 0.18 7.6353 6.2398 4.7677 6.7187 \n",
"2019-07-31 1.19 -2.06 0.12 0.19 -3.8504 -1.1600 -0.8926 -1.6496 \n",
"\n",
" SMALL HiBM ME2 BM1 ... ME4 BM1 ME4 BM2 ME4 BM3 ME4 BM4 \\\n",
"2019-03-31 -3.4235 -0.2319 ... 1.8237 1.0998 -0.9002 -0.5395 \n",
"2019-04-30 2.0865 2.8361 ... 3.1410 4.2046 4.3501 5.0762 \n",
"2019-05-31 -7.9767 -8.9190 ... -5.7334 -7.1667 -7.7153 -9.1637 \n",
"2019-06-30 4.7008 8.9242 ... 7.2875 9.4043 9.5103 7.5115 \n",
"2019-07-31 -2.5149 -3.1247 ... 1.6768 1.0410 1.9332 -0.6625 \n",
"\n",
" ME4 BM5 BIG LoBM ME5 BM2 ME5 BM3 ME5 BM4 BIG HiBM \n",
"2019-03-31 -2.8702 3.3344 1.9283 -0.1602 -0.5929 -3.0620 \n",
"2019-04-30 4.3419 4.6039 2.9807 3.6429 5.1341 9.3863 \n",
"2019-05-31 -7.9580 -6.9663 -5.7641 -5.0654 -8.1314 -8.7586 \n",
"2019-06-30 6.9246 7.7424 5.8872 5.9640 6.6670 7.9347 \n",
"2019-07-31 0.9940 1.6565 1.9146 1.5096 0.9860 2.1159 \n",
"\n",
"[5 rows x 29 columns]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# This is a \"trick\" to get the index to have the last day in the month.\n",
"ff = ff.resample(\"M\").last()\n",
"\n",
"ff.to_hdf(\"data/ff.h5\", \"ff\")\n",
"\n",
"ff.tail()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DatetimeIndex(['1926-07-31', '1926-08-31', '1926-09-30', '1926-10-31',\n",
" '1926-11-30', '1926-12-31', '1927-01-31', '1927-02-28',\n",
" '1927-03-31', '1927-04-30',\n",
" ...\n",
" '2018-10-31', '2018-11-30', '2018-12-31', '2019-01-31',\n",
" '2019-02-28', '2019-03-31', '2019-04-30', '2019-05-31',\n",
" '2019-06-30', '2019-07-31'],\n",
" dtype='datetime64[ns]', length=1117, freq='M')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ff.index"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Mkt-RF | \n",
" SMB | \n",
" HML | \n",
" RF | \n",
" SMALL LoBM | \n",
" ME1 BM2 | \n",
" ME1 BM3 | \n",
" ME1 BM4 | \n",
" SMALL HiBM | \n",
" ME2 BM1 | \n",
" ... | \n",
" ME4 BM1 | \n",
" ME4 BM2 | \n",
" ME4 BM3 | \n",
" ME4 BM4 | \n",
" ME4 BM5 | \n",
" BIG LoBM | \n",
" ME5 BM2 | \n",
" ME5 BM3 | \n",
" ME5 BM4 | \n",
" BIG HiBM | \n",
"
\n",
" \n",
" \n",
" \n",
" 2019-07-25 | \n",
" -0.63 | \n",
" -0.69 | \n",
" -0.21 | \n",
" 0.009 | \n",
" -1.89 | \n",
" -1.59 | \n",
" -1.02 | \n",
" -1.11 | \n",
" -1.59 | \n",
" -1.90 | \n",
" ... | \n",
" -0.95 | \n",
" -0.72 | \n",
" -0.74 | \n",
" -0.34 | \n",
" -1.23 | \n",
" -0.62 | \n",
" -0.27 | \n",
" -0.83 | \n",
" -0.48 | \n",
" -1.68 | \n",
"
\n",
" \n",
" 2019-07-26 | \n",
" 0.82 | \n",
" 0.39 | \n",
" -0.21 | \n",
" 0.009 | \n",
" 1.53 | \n",
" 1.48 | \n",
" 1.29 | \n",
" 1.18 | \n",
" 0.68 | \n",
" 1.37 | \n",
" ... | \n",
" 0.84 | \n",
" 0.65 | \n",
" 0.96 | \n",
" 0.17 | \n",
" 0.27 | \n",
" 0.53 | \n",
" 1.54 | \n",
" 0.61 | \n",
" 0.62 | \n",
" 1.37 | \n",
"
\n",
" \n",
" 2019-07-29 | \n",
" -0.32 | \n",
" -0.32 | \n",
" -0.47 | \n",
" 0.009 | \n",
" -0.34 | \n",
" 0.16 | \n",
" -0.89 | \n",
" -0.62 | \n",
" -0.99 | \n",
" -0.82 | \n",
" ... | \n",
" -0.93 | \n",
" -0.36 | \n",
" -0.24 | \n",
" -0.86 | \n",
" -0.85 | \n",
" -0.05 | \n",
" -0.34 | \n",
" -0.32 | \n",
" -0.33 | \n",
" -0.91 | \n",
"
\n",
" \n",
" 2019-07-30 | \n",
" -0.17 | \n",
" 1.31 | \n",
" 0.86 | \n",
" 0.009 | \n",
" 1.04 | \n",
" 1.49 | \n",
" 0.84 | \n",
" 1.21 | \n",
" 2.13 | \n",
" 0.86 | \n",
" ... | \n",
" -0.22 | \n",
" -0.18 | \n",
" 0.41 | \n",
" 0.95 | \n",
" 1.25 | \n",
" -0.27 | \n",
" -0.70 | \n",
" -0.10 | \n",
" 0.16 | \n",
" -0.48 | \n",
"
\n",
" \n",
" 2019-07-31 | \n",
" -1.09 | \n",
" 0.13 | \n",
" 0.56 | \n",
" 0.009 | \n",
" -0.90 | \n",
" -1.76 | \n",
" -0.43 | \n",
" -0.73 | \n",
" -0.43 | \n",
" -1.44 | \n",
" ... | \n",
" -0.98 | \n",
" -1.30 | \n",
" -0.68 | \n",
" -0.65 | \n",
" -1.32 | \n",
" -1.24 | \n",
" -1.23 | \n",
" -1.24 | \n",
" -0.65 | \n",
" -0.64 | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 29 columns
\n",
"
"
],
"text/plain": [
" Mkt-RF SMB HML RF SMALL LoBM ME1 BM2 ME1 BM3 ME1 BM4 \\\n",
"2019-07-25 -0.63 -0.69 -0.21 0.009 -1.89 -1.59 -1.02 -1.11 \n",
"2019-07-26 0.82 0.39 -0.21 0.009 1.53 1.48 1.29 1.18 \n",
"2019-07-29 -0.32 -0.32 -0.47 0.009 -0.34 0.16 -0.89 -0.62 \n",
"2019-07-30 -0.17 1.31 0.86 0.009 1.04 1.49 0.84 1.21 \n",
"2019-07-31 -1.09 0.13 0.56 0.009 -0.90 -1.76 -0.43 -0.73 \n",
"\n",
" SMALL HiBM ME2 BM1 ... ME4 BM1 ME4 BM2 ME4 BM3 ME4 BM4 \\\n",
"2019-07-25 -1.59 -1.90 ... -0.95 -0.72 -0.74 -0.34 \n",
"2019-07-26 0.68 1.37 ... 0.84 0.65 0.96 0.17 \n",
"2019-07-29 -0.99 -0.82 ... -0.93 -0.36 -0.24 -0.86 \n",
"2019-07-30 2.13 0.86 ... -0.22 -0.18 0.41 0.95 \n",
"2019-07-31 -0.43 -1.44 ... -0.98 -1.30 -0.68 -0.65 \n",
"\n",
" ME4 BM5 BIG LoBM ME5 BM2 ME5 BM3 ME5 BM4 BIG HiBM \n",
"2019-07-25 -1.23 -0.62 -0.27 -0.83 -0.48 -1.68 \n",
"2019-07-26 0.27 0.53 1.54 0.61 0.62 1.37 \n",
"2019-07-29 -0.85 -0.05 -0.34 -0.32 -0.33 -0.91 \n",
"2019-07-30 1.25 -0.27 -0.70 -0.10 0.16 -0.48 \n",
"2019-07-31 -1.32 -1.24 -1.23 -1.24 -0.65 -0.64 \n",
"\n",
"[5 rows x 29 columns]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# These files have all been cleaned to have only the data and headers\n",
"ff_5x5_daily = pd.read_csv(\"data/25_Portfolios_5x5_daily.CSV\", index_col=0)\n",
"ff_factors_daily = pd.read_csv(\"data/F-F_Research_Data_Factors_daily.CSV\", index_col=0)\n",
"ff_daily = pd.concat([ff_factors_daily, ff_5x5_daily], axis=1)\n",
"\n",
"\n",
"dates = []\n",
"for value in ff_daily.index:\n",
" # Values are YYYYMMDD\n",
" year = value // 10000\n",
" month = (value // 100) % 100\n",
" day = value % 100\n",
" dates.append(pd.Timestamp(year=year, month=month, day=day))\n",
"ff_daily.index = dates\n",
"ff_daily.to_hdf(\"data/ff_daily.h5\", \"ff_daily\")\n",
"\n",
"ff_daily.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Exercise 3 (Alternative method)\n",
"\n",
"1. Install and use `pandas-datareader` to repeat the previous exercise.\n",
"\n",
"#### Preliminary Step\n",
"\n",
"You must first install the module using \n",
"\n",
"```\n",
"pip install pandas-datareader\n",
"``` \n",
"\n",
"from the command line. Then you can run this code. **Note**: Running this code requires access\n",
"to the internet."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" DEXJPUS | \n",
" DEXUSAL | \n",
" DEXUSEU | \n",
" DEXUSUK | \n",
"
\n",
" \n",
" DATE | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2024-02-19 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 2024-02-20 | \n",
" 149.87 | \n",
" 0.6560 | \n",
" 1.0814 | \n",
" 1.2645 | \n",
"
\n",
" \n",
" 2024-02-21 | \n",
" 150.29 | \n",
" 0.6546 | \n",
" 1.0818 | \n",
" 1.2624 | \n",
"
\n",
" \n",
" 2024-02-22 | \n",
" 150.62 | \n",
" 0.6547 | \n",
" 1.0815 | \n",
" 1.2649 | \n",
"
\n",
" \n",
" 2024-02-23 | \n",
" 150.36 | \n",
" 0.6566 | \n",
" 1.0828 | \n",
" 1.2678 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" DEXJPUS DEXUSAL DEXUSEU DEXUSUK\n",
"DATE \n",
"2024-02-19 NaN NaN NaN NaN\n",
"2024-02-20 149.87 0.6560 1.0814 1.2645\n",
"2024-02-21 150.29 0.6546 1.0818 1.2624\n",
"2024-02-22 150.62 0.6547 1.0815 1.2649\n",
"2024-02-23 150.36 0.6566 1.0828 1.2678"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas_datareader as pdr\n",
"\n",
"# Conservative start date to get all data\n",
"yen_dollar = pdr.get_data_fred(\"DEXJPUS\", start=\"1950\")\n",
"dollar_aud = pdr.get_data_fred(\"DEXUSAL\", start=\"1950\")\n",
"dollar_euro = pdr.get_data_fred(\"DEXUSEU\", start=\"1950\")\n",
"dollar_pound = pdr.get_data_fred(\"DEXUSUK\", start=\"1950\")\n",
"fx = pd.concat([yen_dollar, dollar_aud, dollar_euro, dollar_pound], axis=1)\n",
"fx.to_hdf(\"data/fx-pdr.h5\", \"fx\")\n",
"fx.tail()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:1: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_factors = pdr.get_data_famafrench(\"F-F_Research_Data_Factors\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:1: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_factors = pdr.get_data_famafrench(\"F-F_Research_Data_Factors\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"C:\\Users\\kevin\\AppData\\Local\\Temp\\ipykernel_6172\\127267269.py:2: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.\n",
" ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Mkt-RF | \n",
" SMB | \n",
" HML | \n",
" RF | \n",
" SMALL LoBM | \n",
" ME1 BM2 | \n",
" ME1 BM3 | \n",
" ME1 BM4 | \n",
" SMALL HiBM | \n",
" ME2 BM1 | \n",
" ... | \n",
" ME4 BM1 | \n",
" ME4 BM2 | \n",
" ME4 BM3 | \n",
" ME4 BM4 | \n",
" ME4 BM5 | \n",
" BIG LoBM | \n",
" ME5 BM2 | \n",
" ME5 BM3 | \n",
" ME5 BM4 | \n",
" BIG HiBM | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2023-08 | \n",
" -2.39 | \n",
" -3.16 | \n",
" -1.06 | \n",
" 0.45 | \n",
" -12.2376 | \n",
" -7.6627 | \n",
" -10.5103 | \n",
" -5.6405 | \n",
" -7.3431 | \n",
" -6.9345 | \n",
" ... | \n",
" -2.4000 | \n",
" -2.3534 | \n",
" -2.8719 | \n",
" -3.8686 | \n",
" -2.1140 | \n",
" -0.8338 | \n",
" -0.9250 | \n",
" -2.6918 | \n",
" -2.9058 | \n",
" -7.3853 | \n",
"
\n",
" \n",
" 2023-09 | \n",
" -5.24 | \n",
" -2.51 | \n",
" 1.52 | \n",
" 0.43 | \n",
" -8.2275 | \n",
" -7.6201 | \n",
" -6.5856 | \n",
" -5.8185 | \n",
" -6.0216 | \n",
" -8.8904 | \n",
" ... | \n",
" -5.6917 | \n",
" -4.6306 | \n",
" -6.5792 | \n",
" -5.0335 | \n",
" -3.4990 | \n",
" -5.5603 | \n",
" -4.3422 | \n",
" -1.4851 | \n",
" -2.1397 | \n",
" -2.2370 | \n",
"
\n",
" \n",
" 2023-10 | \n",
" -3.19 | \n",
" -3.87 | \n",
" 0.19 | \n",
" 0.47 | \n",
" -10.2240 | \n",
" -8.9443 | \n",
" -7.9196 | \n",
" -6.3774 | \n",
" -7.6783 | \n",
" -10.2040 | \n",
" ... | \n",
" -5.9518 | \n",
" -5.8296 | \n",
" -7.1202 | \n",
" -5.7789 | \n",
" -6.2824 | \n",
" -0.9824 | \n",
" -4.9262 | \n",
" -2.5643 | \n",
" -2.6864 | \n",
" -4.0377 | \n",
"
\n",
" \n",
" 2023-11 | \n",
" 8.84 | \n",
" -0.02 | \n",
" 1.64 | \n",
" 0.44 | \n",
" 5.7861 | \n",
" 8.0737 | \n",
" 10.7319 | \n",
" 8.5040 | \n",
" 7.0945 | \n",
" 10.9883 | \n",
" ... | \n",
" 9.8743 | \n",
" 8.9790 | \n",
" 9.5279 | \n",
" 8.7238 | \n",
" 7.8476 | \n",
" 9.8946 | \n",
" 7.0176 | \n",
" 7.0935 | \n",
" 8.6614 | \n",
" 15.2510 | \n",
"
\n",
" \n",
" 2023-12 | \n",
" 4.85 | \n",
" 6.35 | \n",
" 4.94 | \n",
" 0.43 | \n",
" 15.3216 | \n",
" 16.0383 | \n",
" 14.8635 | \n",
" 16.4605 | \n",
" 14.9182 | \n",
" 12.0839 | \n",
" ... | \n",
" 8.6568 | \n",
" 7.7652 | \n",
" 9.1821 | \n",
" 9.0094 | \n",
" 11.7210 | \n",
" 3.7289 | \n",
" 4.7129 | \n",
" 5.0841 | \n",
" 4.6592 | \n",
" 11.6076 | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 29 columns
\n",
"
"
],
"text/plain": [
" Mkt-RF SMB HML RF SMALL LoBM ME1 BM2 ME1 BM3 ME1 BM4 \\\n",
"Date \n",
"2023-08 -2.39 -3.16 -1.06 0.45 -12.2376 -7.6627 -10.5103 -5.6405 \n",
"2023-09 -5.24 -2.51 1.52 0.43 -8.2275 -7.6201 -6.5856 -5.8185 \n",
"2023-10 -3.19 -3.87 0.19 0.47 -10.2240 -8.9443 -7.9196 -6.3774 \n",
"2023-11 8.84 -0.02 1.64 0.44 5.7861 8.0737 10.7319 8.5040 \n",
"2023-12 4.85 6.35 4.94 0.43 15.3216 16.0383 14.8635 16.4605 \n",
"\n",
" SMALL HiBM ME2 BM1 ... ME4 BM1 ME4 BM2 ME4 BM3 ME4 BM4 \\\n",
"Date ... \n",
"2023-08 -7.3431 -6.9345 ... -2.4000 -2.3534 -2.8719 -3.8686 \n",
"2023-09 -6.0216 -8.8904 ... -5.6917 -4.6306 -6.5792 -5.0335 \n",
"2023-10 -7.6783 -10.2040 ... -5.9518 -5.8296 -7.1202 -5.7789 \n",
"2023-11 7.0945 10.9883 ... 9.8743 8.9790 9.5279 8.7238 \n",
"2023-12 14.9182 12.0839 ... 8.6568 7.7652 9.1821 9.0094 \n",
"\n",
" ME4 BM5 BIG LoBM ME5 BM2 ME5 BM3 ME5 BM4 BIG HiBM \n",
"Date \n",
"2023-08 -2.1140 -0.8338 -0.9250 -2.6918 -2.9058 -7.3853 \n",
"2023-09 -3.4990 -5.5603 -4.3422 -1.4851 -2.1397 -2.2370 \n",
"2023-10 -6.2824 -0.9824 -4.9262 -2.5643 -2.6864 -4.0377 \n",
"2023-11 7.8476 9.8946 7.0176 7.0935 8.6614 15.2510 \n",
"2023-12 11.7210 3.7289 4.7129 5.0841 4.6592 11.6076 \n",
"\n",
"[5 rows x 29 columns]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ff_factors = pdr.get_data_famafrench(\"F-F_Research_Data_Factors\", start=\"1920\")\n",
"ff_5x5 = pdr.get_data_famafrench(\"25_Portfolios_5x5\", start=\"1920\")\n",
"# The function returns all of the tables in each file. We want the first, [0]\n",
"ff_factors = ff_factors[0]\n",
"ff_5x5 = ff_5x5[0]\n",
"ff = pd.concat([ff_factors, ff_5x5], axis=1)\n",
"ff.to_hdf(\"data/ff-pdr.h5\", \"ff\")\n",
"ff.tail()"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"### Exercise 4\n",
"Download data on 1 year and 10 year US government bond rates from FRED, and \n",
"construct the term premium as the different in yields on 10 year and 1 year\n",
"bonds. Combine the two yield series and the term premium into a `DataFrame`\n",
"and save it as HDF."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" TERM | \n",
" GS10 | \n",
" GS1 | \n",
"
\n",
" \n",
" DATE | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2023-09-01 | \n",
" -1.06 | \n",
" 4.38 | \n",
" 5.44 | \n",
"
\n",
" \n",
" 2023-10-01 | \n",
" -0.62 | \n",
" 4.80 | \n",
" 5.42 | \n",
"
\n",
" \n",
" 2023-11-01 | \n",
" -0.78 | \n",
" 4.50 | \n",
" 5.28 | \n",
"
\n",
" \n",
" 2023-12-01 | \n",
" -0.94 | \n",
" 4.02 | \n",
" 4.96 | \n",
"
\n",
" \n",
" 2024-01-01 | \n",
" -0.73 | \n",
" 4.06 | \n",
" 4.79 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" TERM GS10 GS1\n",
"DATE \n",
"2023-09-01 -1.06 4.38 5.44\n",
"2023-10-01 -0.62 4.80 5.42\n",
"2023-11-01 -0.78 4.50 5.28\n",
"2023-12-01 -0.94 4.02 4.96\n",
"2024-01-01 -0.73 4.06 4.79"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# No need to import here since pandas and pandas-datareader previously imported\n",
"\n",
"# Conservative start date to get all data\n",
"gs10 = pdr.get_data_fred(\"GS10\", start=\"1950\")\n",
"gs1 = pdr.get_data_fred(\"GS1\", start=\"1950\")\n",
"\n",
"term = gs10[\"GS10\"] - gs1[\"GS1\"]\n",
"term.name = \"TERM\"\n",
"combined = pd.DataFrame([term, gs10[\"GS10\"], gs1[\"GS1\"]]).T\n",
"combined.tail()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DatetimeIndex(['1953-04-01', '1953-05-01', '1953-06-01', '1953-07-01',\n",
" '1953-08-01', '1953-09-01', '1953-10-01', '1953-11-01',\n",
" '1953-12-01', '1954-01-01',\n",
" ...\n",
" '2023-04-01', '2023-05-01', '2023-06-01', '2023-07-01',\n",
" '2023-08-01', '2023-09-01', '2023-10-01', '2023-11-01',\n",
" '2023-12-01', '2024-01-01'],\n",
" dtype='datetime64[ns]', name='DATE', length=850, freq=None)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"combined.index"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DatetimeIndex(['1953-04-01', '1953-05-01', '1953-06-01', '1953-07-01',\n",
" '1953-08-01', '1953-09-01', '1953-10-01', '1953-11-01',\n",
" '1953-12-01', '1954-01-01',\n",
" ...\n",
" '2023-04-01', '2023-05-01', '2023-06-01', '2023-07-01',\n",
" '2023-08-01', '2023-09-01', '2023-10-01', '2023-11-01',\n",
" '2023-12-01', '2024-01-01'],\n",
" dtype='datetime64[ns]', name='DATE', length=850, freq='MS')"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Trick to ensure the index has the frequency MS, Month Start\n",
"combined = combined.resample(\"MS\").last()\n",
"combined.to_hdf(\"data/term-premium.h5\", \"term_premium\")\n",
"combined.index"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
"metadata": {
"collapsed": false
},
"source": []
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}