{
"cells": [
{
"cell_type": "markdown",
"id": "930433a2",
"metadata": {},
"source": [
"## Theory Versus Machine Learning for Exchange Rate Predictions"
]
},
{
"cell_type": "markdown",
"id": "6d74777f",
"metadata": {},
"source": [
"### Annabella Stoll-Dansereau, Silas Kwok, [Redacted]"
]
},
{
"cell_type": "markdown",
"id": "c1123762",
"metadata": {},
"source": [
"In this project, we look at long term exchange rates and predict them using the Purchasing Power Parity theory (PPP), the Real Exchange Rate theory (RER), a linear regression, a lasso regression and two neural networks. To compare the different models, we use their predictions to decide which currencies to buy and then compare the average rate of returns. We find that the PPP model performs the best and does give positive returns in some cases, the machine learning models do not perform well and all give negative returns."
]
},
{
"cell_type": "markdown",
"id": "27bb4ae8",
"metadata": {},
"source": [
"Importing all the necessary packages."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "37d43b92",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np \n",
"import seaborn as sns\n",
"from sklearn import linear_model\n",
"from sklearn.model_selection import cross_val_score\n",
"from sklearn import tree\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"from sklearn import neural_network\n",
"from sklearn import preprocessing, pipeline\n",
"from sklearn import metrics\n",
"import matplotlib.pyplot as plt\n",
"import altair as alt"
]
},
{
"cell_type": "markdown",
"id": "9287855b",
"metadata": {},
"source": [
"First we load and clean the data."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "73214efc",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"def data_comp(variables):\n",
"\n",
" '''\n",
" cleaning OECD data pass in the name of csv files you are using\n",
"\n",
" '''\n",
" for n, var in enumerate(variables):\n",
"\n",
" #read in file OECD data\n",
"\n",
" df = pd.read_csv(f\"{var}.csv\")\n",
"\n",
" #drop redundent columns and transform it into Year, Country, Variable\n",
"\n",
" df.columns = df.iloc[2]\n",
" df.drop([0,1,2], inplace=True)\n",
" df.drop(df.columns[1:44], axis=1, inplace=True)\n",
" df.set_index('Country Name', inplace = True, drop = True)\n",
" df = df.T\n",
" df.index = [round(idx) for idx in df.index]\n",
" df.index = pd.to_datetime(df.index, format='%Y')\n",
" df = df.reset_index()\n",
" df = pd.melt(df, id_vars=['index'], value_vars=df.columns[1:], var_name='Country Name', value_name='Value')\n",
"\n",
" df.set_index(['index', 'Country Name'], inplace=True)\n",
" df.index.names = [\"Year\", \"Country Name\"]\n",
" df.rename(columns={'Value': var}, inplace=True)\n",
"\n",
" #making the dataframe and combining over all variables\n",
" if n == 0: \n",
" full_df = df\n",
" else: \n",
" full_df = pd.concat([full_df, df], axis =1)\n",
" return(full_df)\n",
"\n",
"df_ML = data_comp(['GDP', 'Govt_debt_gdp', 'GDP growth rate','GNI per capita',\n",
" 'Inflation', 'Lending rate', 'Net exports', 'Total_reserve', 'Forex rate'])\n",
"\n",
"#writting to a csv\n",
"#full_df.to_csv('EX_rate_table.csv', index=True)\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "032293f8",
"metadata": {},
"source": [
"## PPP and RER"
]
},
{
"cell_type": "markdown",
"id": "830c9edb",
"metadata": {},
"source": [
"Here we construct the PPP and RER models. Based on the idea that in the long run we will converge back to an average real exchange rate value. While it doesnt always hold mean reversion seems to be the most consistent (and easiest!) way to predict exchange rates. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "469a4a40",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" RER Forex rate PPP\n",
"Year Country Name \n",
"2000-01-01 Armenia 74.010935 539.525833 0.234525\n",
"2001-01-01 Armenia 73.034791 555.078258 0.231919\n",
"2002-01-01 Armenia 73.665189 573.353333 0.226287\n",
"2003-01-01 Armenia 67.937520 578.762954 0.229938\n",
"2004-01-01 Armenia 71.634581 533.450833 0.258205\n",
"... ... ... ...\n",
"2017-01-01 Zambia 94.221086 9.517500 0.440513\n",
"2018-01-01 Zambia 88.528850 10.458333 0.420550\n",
"2019-01-01 Zambia 79.795053 12.890000 0.360806\n",
"2020-01-01 Zambia 66.438740 18.344093 0.284940\n",
"2021-01-01 Zambia 69.764269 20.018487 0.319845\n",
"\n",
"[1766 rows x 3 columns]\n"
]
}
],
"source": [
"df_PPP_model= data_comp(['RER', 'Forex rate', 'PPP'])\n",
"df_PPP = pd.read_csv(f\"PPP_US.csv\")\n",
"\n",
"df_PPP_model = df_PPP_model[df_PPP_model.notna().all(axis=1)]\n",
"print(df_PPP_model)\n",
"\n",
"#making the contructed RER which I am defining from 456 notes as RER = (forgien exchnage rate * P_in country)/ P_in US\n",
"#here P_in country/P in US == PPP\n",
"\n",
"df_PPP_model['con_RER'] = (df_PPP_model['Forex rate'] * df_PPP_model['PPP'])\n",
"\n",
"#Now we have to see how this constructed RER compares with the reported RER and NomER (Forex rate)\n",
"#theretically in the long run these would be zero since the NOM*PPP == RER_normed\n",
"#Then I divided by 100 to get the percetage if it is negative that implies currency will appriciate\n",
"#if is positive it will depreiciate\n",
"\n",
"#Clearly a lot of issues and gross assumptions\n",
"# 1. this only holds for inflation bearing countries which might be why issues arise in non advanced countries\n",
"# 2. not clear if mean reverting in long long \n",
"# 3. not clear if assumptions on PPP hold in long run\n",
"# 4. clealy issues in this not taking into consideration risks i.e. a forgein exchange might appear undervalued but that is because its risky\n",
"# 5. very US centric by norming to the US each period as the US curency itself might have issues\n",
"\n",
"df =df_PPP_model.reset_index()\n",
"#df_PPP_model.to_csv('testing1.csv', index=True)\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "5cd96ec2",
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
\n",
"
PPP_for_ER
\n",
"
\n",
"
\n",
"
Year
\n",
"
Country Name
\n",
"
\n",
"
\n",
" \n",
" \n",
"
\n",
"
1960-01-01
\n",
"
Australia
\n",
"
0.700946
\n",
"
\n",
"
\n",
"
1961-01-01
\n",
"
Australia
\n",
"
0.701324
\n",
"
\n",
"
\n",
"
1962-01-01
\n",
"
Australia
\n",
"
0.687990
\n",
"
\n",
"
\n",
"
1963-01-01
\n",
"
Australia
\n",
"
0.698321
\n",
"
\n",
"
\n",
"
1964-01-01
\n",
"
Australia
\n",
"
0.708346
\n",
"
\n",
"
\n",
"
...
\n",
"
...
\n",
"
...
\n",
"
\n",
"
\n",
"
2017-01-01
\n",
"
Cameroon
\n",
"
232.801392
\n",
"
\n",
"
\n",
"
2018-01-01
\n",
"
Cameroon
\n",
"
231.682690
\n",
"
\n",
"
\n",
"
2019-01-01
\n",
"
Cameroon
\n",
"
230.273904
\n",
"
\n",
"
\n",
"
2020-01-01
\n",
"
Cameroon
\n",
"
229.137013
\n",
"
\n",
"
\n",
"
2021-01-01
\n",
"
Cameroon
\n",
"
227.378726
\n",
"
\n",
" \n",
"
\n",
"
1765 rows × 1 columns
\n",
"
"
],
"text/plain": [
" PPP_for_ER\n",
"Year Country Name \n",
"1960-01-01 Australia 0.700946\n",
"1961-01-01 Australia 0.701324\n",
"1962-01-01 Australia 0.687990\n",
"1963-01-01 Australia 0.698321\n",
"1964-01-01 Australia 0.708346\n",
"... ...\n",
"2017-01-01 Cameroon 232.801392\n",
"2018-01-01 Cameroon 231.682690\n",
"2019-01-01 Cameroon 230.273904\n",
"2020-01-01 Cameroon 229.137013\n",
"2021-01-01 Cameroon 227.378726\n",
"\n",
"[1765 rows x 1 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df= pd.read_csv(f\"PPP_US.csv\")\n",
"\n",
"country_mapping = {\n",
"\n",
" 'ARM': 'Armenia', 'ATG': 'Antigua and Barbuda', 'AUS': 'Australia', 'BDI': 'Burundi', 'BGR': 'Bulgaria', \n",
" 'BHR': 'Bahrain', 'BHS': 'Bahamas, The', 'BLZ': 'Belize', 'BOL': 'Bolivia', 'BRA': 'Brazil', \n",
" 'CAF': 'Central African Republic', 'CAN': 'Canada', 'CHE': 'Switzerland', 'CHL': 'Chile', 'CHN': 'China', \n",
" 'CIV': \"Cote d'Ivoire\", 'CMR': 'Cameroon', 'COD': 'Congo, Dem. Rep.', 'COL': 'Colombia', 'CRI': 'Costa Rica', \n",
" 'CYP': 'Cyprus', 'CZE': 'Czechia', 'DMA': 'Dominica', 'DNK': 'Denmark', 'DOM': 'Dominican Republic', \n",
" 'DZA': 'Algeria', 'FJI': 'Fiji', 'GAB': 'Gabon', 'GBR': 'United Kingdom', 'GEO': 'Georgia', 'GHA': 'Ghana', \n",
" 'GMB': 'Gambia, The', 'GNQ': 'Equatorial Guinea', 'GRC': 'Greece', 'GRD': 'Grenada', 'GUY': 'Guyana', \n",
" 'HKG': 'Hong Kong SAR, China', 'HRV': 'Croatia', 'HUN': 'Hungary', 'IRN': 'Iran, Islamic Rep.', 'ISL': 'Iceland', \n",
" 'ISR': 'Israel', 'JPN': 'Japan', 'KNA': 'St. Kitts and Nevis', 'KOR': 'Korea, Rep.', 'LCA': 'St. Lucia', \n",
" 'LSO': 'Lesotho', 'LVA': 'Latvia', 'MAR': 'Morocco', 'MDA': 'Moldova', 'MEX': 'Mexico', 'MKD': 'North Macedonia', \n",
" 'MLT': 'Malta', 'MWI': 'Malawi', 'MYS': 'Malaysia', 'NGA': 'Nigeria', 'NIC': 'Nicaragua', 'NOR': 'Norway', \n",
" 'NZL': 'New Zealand', 'PAK': 'Pakistan', 'PHL': 'Philippines', 'PNG': 'Papua New Guinea', 'POL': 'Poland', \n",
" 'PRY': 'Paraguay', 'ROU': 'Romania', 'RUS': 'Russian Federation', 'SAU': 'Saudi Arabia', 'SGP': 'Singapore', \n",
" 'SLB': 'Solomon Islands', 'SLE': 'Sierra Leone', 'SVK': 'Slovak Republic', 'SWE': 'Sweden', 'TGO': 'Togo', \n",
" 'TTO': 'Trinidad and Tobago', 'TUN': 'Tunisia', 'UGA': 'Uganda', 'UKR': 'Ukraine', 'URY': 'Uruguay', 'USA': 'United States',\n",
" 'VCT': 'St. Vincent and the Grenadines', 'VEN': 'Venezuela, RB', 'WSM': 'Samoa', 'ZAF': 'South Africa',\n",
" 'ZMB': 'Zambia'}\n",
"\n",
"def apply_country_mapping(location):\n",
" if location in country_mapping:\n",
" return country_mapping[location]\n",
" else:\n",
" return None\n",
"\n",
"df['LOCATION'] = df['LOCATION'].apply(apply_country_mapping)\n",
"\n",
"# Extract only the necessary columns\n",
"df = df[['LOCATION', 'TIME', 'Value']]\n",
"df['TIME'] = pd.to_datetime(df['TIME'], format='%Y')\n",
"\n",
"df = df.rename(columns={'Value': 'PPP_for_ER'})\n",
"df = df.dropna()\n",
"df.set_index(['TIME', 'LOCATION'], inplace=True)\n",
"df.index.set_names(['Year', 'Country Name'], inplace=True)\n",
"df\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "61aaff23",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"float64\n"
]
}
],
"source": [
"df\n",
"df_merged = df.merge(df_PPP_model, left_index=True, right_index=True, how='inner')\n",
"\n",
"#theretically in the long run these would be zero since the NOM*PPP == RER_normed\n",
"#so now we will calculate the difference between the constructed Real exchange rate where RER = PPP*NER\n",
"#versus the calculated PPP\n",
"\n",
"#if the PPP*nom is higher than the PPP based on the exchange rate the currency would be overvalued\n",
"#id PPP *nom is lower than PPP the currency would be undervalued\n",
"\n",
"#a lot of issues because I dont know the methodology of how the real excahnge rate of PPP was calculated for the country which I grabbed\n",
"#from OCED but still shows what we want to buy or sell depending on the year\n",
"\n",
"df_merged['Final_diff'] = df_merged['con_RER'] - df_merged['PPP_for_ER']\n",
"#filtering out unstable ones\n",
"df_merged = df_merged[df_merged['Final_diff']>-0.1]\n",
"df_merged = df_merged[df_merged['Final_diff']<0.1]\n",
"\n",
"print(df_merged['Final_diff'].dtype)\n",
"df_merged.sort_index(inplace=True)\n",
"\n",
"# Select all rows with years before 2015 and including 2015\n",
"df_year_before_2015 = df_merged.loc[(slice(None, '2015-01-01'), slice(None)), :]\n",
"\n",
"\n",
"\n",
"\n",
"df_merged['diff_in_ave_RER'] = df_merged['RER'].div(df_year_before_2015.groupby('Country Name').mean()['RER'])\n",
"\n",
"df_years={}\n",
"for yr in range(2000, 2022):\n",
" name = f'df_{yr}'\n",
" dt_year= pd.to_datetime(f'{yr}-01-01')\n",
" df_year = df_merged.loc[(dt_year, slice(None))]\n",
" df_years[yr] = df_year\n",
"\n",
"\n",
"#ones below 1 short if the 'diff in ave RER' agrees\n"
]
},
{
"cell_type": "markdown",
"id": "1899ed89",
"metadata": {},
"source": [
"Below we calculate the returns from the RER model."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "821afc2e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Average Returns for 1 year holding period: 0.01100000000000001\n",
"Average Returns for 2 year holding period: 0.17499999999999996\n",
"Average Returns for 3 year holding period: -0.7763333333333334\n",
"Average Returns for 4 year holding period: -2.1503333333333337\n"
]
}
],
"source": [
"# Define the holdings() function\n",
"\n",
"\n",
"def holdings(row):\n",
" if row['diff_in_ave_RER'] > 1:\n",
" loan_amount = 1000 * row['Forex rate']\n",
" row['usd_amount'] = loan_amount/row['Forex rate']\n",
" row['home_currency_amount'] = -loan_amount\n",
" else:\n",
" loan_amount = 1000\n",
" row['usd_amount'] = -loan_amount\n",
" row['home_currency_amount'] = loan_amount * row['Forex rate']\n",
" return row\n",
"\n",
"yr_past = 2015\n",
"yr_future = 2019\n",
"\n",
"# Apply the function to the 'Amount' column in the DataFrame\n",
"#df_mean_RER_investment = df_years[yr_past].apply(holdings, axis=1)\n",
"#df_mean_RER_investment[f'{yr_future}_ER'] = df_years[yr_future]['Forex rate']\n",
"##df_mean_RER_investment['new_usd_amount'] = (1/df_mean_RER_investment[f'{yr_future}_ER'])*df_mean_RER_investment['home_currency_amount'] \n",
"#df_mean_RER_investment['return_USD'] = df_mean_RER_investment['new_usd_amount'] + df_mean_RER_investment['usd_amount']\n",
"returns = {}\n",
"\n",
"for yr_past in range(2015,2018):\n",
" list = []\n",
" for yr in range(1,5):\n",
" yr_future = yr_past+yr\n",
" df_mean_RER_investment = df_years[yr_past].apply(holdings, axis=1)\n",
" df_mean_RER_investment[f'{yr_future}_ER'] = df_years[yr_future]['Forex rate']\n",
" df_mean_RER_investment['new_usd_amount'] = (1/df_mean_RER_investment[f'{yr_future}_ER'])*df_mean_RER_investment['home_currency_amount'] \n",
" df_mean_RER_investment['return_USD'] = df_mean_RER_investment['new_usd_amount'] + df_mean_RER_investment['usd_amount']\n",
" total_investment = df_mean_RER_investment['return_USD'].count() *1000\n",
" gains = df_mean_RER_investment['return_USD'].sum()\n",
" return_to = (((gains+total_investment)/total_investment-1)*100).round(3)\n",
" list.append(return_to)\n",
" returns[yr_past] = list \n",
"\n",
"for i in range(0,4):\n",
"\n",
" elements = [v[i] for v in returns.values()]\n",
" average = sum(elements) / len(elements)\n",
"\n",
" print(f\"Average Returns for {i+1} year holding period:\", average)\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "9f240e38",
"metadata": {},
"source": [
"As we can see above the average returns for the RER model are either close to 0 or negative. This indicates this rule isnt a good investment strategy. \n",
"\n",
"Now we calculate the return for the PPP model."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "eb10cb46",
"metadata": {},
"outputs": [],
"source": [
"# Define the holdings() function\n",
"\n",
"\n",
"def holdings(row):\n",
" if row['Final_diff'] > 0:\n",
" loan_amount = 1000 * row['Forex rate']\n",
" row['usd_amount'] = loan_amount/row['Forex rate']\n",
" row['home_currency_amount'] = -loan_amount\n",
" else:\n",
" loan_amount = 1000\n",
" row['usd_amount'] = -loan_amount\n",
" row['home_currency_amount'] = loan_amount * row['Forex rate']\n",
" return row\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "0f926f90",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{2009: [2.385, 5.431, 3.62, 2.171], 2010: [1.885, 1.378, 1.67, 2.385], 2011: [2.043, 2.201, 3.096, 7.39], 2012: [0.541, 1.369, 4.757, 4.861], 2013: [-0.107, -0.924, -0.556, -0.7], 2014: [-3.203, -2.152, -2.732, -3.082], 2015: [0.017, -0.53, 0.256, 1.198], 2016: [-1.394, -1.955, -0.692, 0.744], 2017: [-0.227, -0.12, 1.197, 1.476]}\n",
"Average Returns for 1 year holding period: 0.21555555555555553\n",
"Average Returns for 2 year holding period: 0.5219999999999999\n",
"Average Returns for 3 year holding period: 1.1795555555555555\n",
"Average Returns for 4 year holding period: 1.8269999999999997\n"
]
}
],
"source": [
"returns = {}\n",
"\n",
"for yr_past in range(2009,2018):\n",
" list = []\n",
" for yr in range(1,5):\n",
" yr_future = yr_past+yr\n",
" df_mean_PPP_diff_investment = df_years[yr_past].apply(holdings, axis=1)\n",
" df_mean_PPP_diff_investment[f'{yr_future}_ER'] = df_years[yr_future]['Forex rate']\n",
" df_mean_PPP_diff_investment['new_usd_amount'] = (1/df_mean_PPP_diff_investment[f'{yr_future}_ER'])*df_mean_PPP_diff_investment['home_currency_amount'] \n",
" df_mean_PPP_diff_investment['return_USD'] = df_mean_PPP_diff_investment['new_usd_amount'] + df_mean_PPP_diff_investment['usd_amount']\n",
" total_investment = df_mean_PPP_diff_investment['return_USD'].count() *1000\n",
" gains = df_mean_PPP_diff_investment['return_USD'].sum()\n",
" return_to = (((gains+total_investment)/total_investment-1)*100).round(3)\n",
" list.append(return_to)\n",
" returns[yr_past] = list\n",
"print(returns) \n",
"\n",
"for i in range(0,4):\n",
"\n",
" elements = [v[i] for v in returns.values()]\n",
" average = sum(elements) / len(elements)\n",
"\n",
" print(f\"Average Returns for {i+1} year holding period:\", average)"
]
},
{
"cell_type": "markdown",
"id": "282e985c",
"metadata": {},
"source": [
"As we can see above the returns for the PPP model are better and are all positive. So if we systematically invested based on this PPP rule we could get a positive return (although given the risk a four year return of 1.8% is pretty terrible). "
]
},
{
"cell_type": "markdown",
"id": "64677c86",
"metadata": {},
"source": [
"## Data for Machine Learning"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "21ed15c9",
"metadata": {},
"outputs": [],
"source": [
"forex_data = pd.read_csv(\"EX_rate_table.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "fcf4db4b",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Year
\n",
"
Country Name
\n",
"
GDP
\n",
"
Govt_debt_gdp
\n",
"
GDP growth rate
\n",
"
GNI per capita
\n",
"
Inflation
\n",
"
Lending rate
\n",
"
Net exports
\n",
"
Total_reserve
\n",
"
Forex rate
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
2000-01-01
\n",
"
Aruba
\n",
"
1.873184e+09
\n",
"
NaN
\n",
"
7.616589
\n",
"
20390.0
\n",
"
4.044021
\n",
"
12.066667
\n",
"
3.097765e+08
\n",
"
235455000.0
\n",
"
1.79
\n",
"
\n",
"
\n",
"
1
\n",
"
2001-01-01
\n",
"
Aruba
\n",
"
1.896648e+09
\n",
"
NaN
\n",
"
4.195963
\n",
"
20510.0
\n",
"
2.883604
\n",
"
12.600000
\n",
"
4.304581e+08
\n",
"
321360614.5
\n",
"
1.79
\n",
"
\n",
"
\n",
"
2
\n",
"
2002-01-01
\n",
"
Aruba
\n",
"
1.962011e+09
\n",
"
NaN
\n",
"
-0.957771
\n",
"
19290.0
\n",
"
3.315247
\n",
"
11.300000
\n",
"
-1.377374e+08
\n",
"
374002933.0
\n",
"
1.79
\n",
"
\n",
"
\n",
"
3
\n",
"
2003-01-01
\n",
"
Aruba
\n",
"
2.044134e+09
\n",
"
NaN
\n",
"
1.120879
\n",
"
21120.0
\n",
"
3.656365
\n",
"
10.700000
\n",
"
-2.437989e+07
\n",
"
336945700.0
\n",
"
1.79
\n",
"
\n",
"
\n",
"
4
\n",
"
2004-01-01
\n",
"
Aruba
\n",
"
2.254749e+09
\n",
"
NaN
\n",
"
7.281026
\n",
"
24010.0
\n",
"
2.529129
\n",
"
9.600000
\n",
"
4.535251e+08
\n",
"
339224000.0
\n",
"
1.79
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Year Country Name GDP Govt_debt_gdp GDP growth rate \\\n",
"0 2000-01-01 Aruba 1.873184e+09 NaN 7.616589 \n",
"1 2001-01-01 Aruba 1.896648e+09 NaN 4.195963 \n",
"2 2002-01-01 Aruba 1.962011e+09 NaN -0.957771 \n",
"3 2003-01-01 Aruba 2.044134e+09 NaN 1.120879 \n",
"4 2004-01-01 Aruba 2.254749e+09 NaN 7.281026 \n",
"\n",
" GNI per capita Inflation Lending rate Net exports Total_reserve \\\n",
"0 20390.0 4.044021 12.066667 3.097765e+08 235455000.0 \n",
"1 20510.0 2.883604 12.600000 4.304581e+08 321360614.5 \n",
"2 19290.0 3.315247 11.300000 -1.377374e+08 374002933.0 \n",
"3 21120.0 3.656365 10.700000 -2.437989e+07 336945700.0 \n",
"4 24010.0 2.529129 9.600000 4.535251e+08 339224000.0 \n",
"\n",
" Forex rate \n",
"0 1.79 \n",
"1 1.79 \n",
"2 1.79 \n",
"3 1.79 \n",
"4 1.79 "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"forex_data.head()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "cf4d413f",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Year 5852\n",
"Country Name 5852\n",
"GDP 5609\n",
"Govt_debt_gdp 1378\n",
"GDP growth rate 5529\n",
"GNI per capita 5285\n",
"Inflation 4992\n",
"Lending rate 2806\n",
"Net exports 3884\n",
"Total_reserve 3801\n",
"Forex rate 3954\n",
"dtype: int64"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"forex_data.count()"
]
},
{
"cell_type": "markdown",
"id": "d9f20618",
"metadata": {},
"source": [
"As we can see above, we lose about a two-thirds of our dataset if we include government debt to gdp ratio. This is why we'll drop this variable."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "beead410",
"metadata": {},
"outputs": [],
"source": [
"forex_data_2 = forex_data.drop([\"Govt_debt_gdp\"], axis = 1)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "d516e4fa",
"metadata": {},
"outputs": [],
"source": [
"forex_data_2.dropna(inplace = True)"
]
},
{
"cell_type": "markdown",
"id": "415a9ad4",
"metadata": {},
"source": [
"Since in the dataset the Foreign Exchange Rate is in Local Currency Units (LCU) per US\\$, we take the reciprocal to convert it to US\\$ per LCU.\n",
"\n",
"It is natural for a country with a large GDP to have a high absolute value of net exports or total reserves, therefore we take these variables as percentages of GDP in order to control for the different sizes.\n",
"\n",
"We divide GDP by a million so that scaling is less of an issue in our model."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "d9d17ee5",
"metadata": {},
"outputs": [],
"source": [
"#Converted Net exports and total reserves as percentages of gdp to normalise across countries.\n",
"forex_data_2[\"Forex rate\"] = 1/forex_data_2[\"Forex rate\"]\n",
"forex_data_2[\"Net Exports to GDP\"] = forex_data_2[\"Net exports\"]/forex_data_2[\"GDP\"] * 100\n",
"forex_data_2[\"Total Reserve to GDP\"] = forex_data_2[\"Total_reserve\"]/forex_data_2[\"GDP\"] * 100\n",
"forex_data_2[\"GDP per mil\"] = forex_data_2[\"GDP\"]/1_000_000\n",
"forex_data_2 = forex_data_2.drop([\"Net exports\", \"Total_reserve\", \"GDP\"], axis = 1)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "126c9371",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
LinearRegression()
"
],
"text/plain": [
"LinearRegression()"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Linear regression\n",
"lr_mod = linear_model.LinearRegression()\n",
"lr_mod.fit(forex_data_3_train_X, forex_data_3_train_y)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "9a98a571",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The mean squred error in the training dataset is 0.47435291908321897\n",
"The mean squred error in the testing dataset is 0.3015103066706014\n"
]
}
],
"source": [
"mse_lr_train = metrics.mean_squared_error(forex_data_3_train_y, lr_mod.predict(forex_data_3_train_X))\n",
"mse_lr_test = metrics.mean_squared_error(forex_data_3_test_y, lr_mod.predict(forex_data_3_test_X))\n",
"print(f\"The mean squred error in the training dataset is {mse_lr_train}\")\n",
"print(f\"The mean squred error in the testing dataset is {mse_lr_test}\")"
]
},
{
"cell_type": "markdown",
"id": "62ce0594",
"metadata": {},
"source": [
"The mean squred errors for our linear regression are given above. As we can see the mean squared error is higher for our training dataset than our testing dataset."
]
},
{
"cell_type": "markdown",
"id": "743565fa",
"metadata": {},
"source": [
"Now we fit our lasso regression, we use a 5 fold cross validation in order to choose the optimal $\\alpha$ for our lasso regression."
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "fd14046b",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
LassoCV(cv=5)
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
LassoCV(cv=5)
"
],
"text/plain": [
"LassoCV(cv=5)"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#k=5 cv lasso\n",
"lasso_mod = linear_model.LassoCV(cv=5)\n",
"lasso_mod.fit(forex_data_3_train_X, forex_data_3_train_y.values.ravel())"
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "b4972346",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The mean squred error in the training dataset is 0.49614895650008806\n",
"The mean squred error in the testing dataset is 0.23705686900933912\n"
]
}
],
"source": [
"mse_lasso_train = metrics.mean_squared_error(forex_data_3_train_y, lasso_mod.predict(forex_data_3_train_X))\n",
"mse_lasso_test = metrics.mean_squared_error(forex_data_3_test_y, lasso_mod.predict(forex_data_3_test_X))\n",
"print(f\"The mean squred error in the training dataset is {mse_lasso_train}\")\n",
"print(f\"The mean squred error in the testing dataset is {mse_lasso_test}\")"
]
},
{
"cell_type": "markdown",
"id": "37ab3d67",
"metadata": {},
"source": [
"As we can see above the mean squared error for our training dataset is higher than that of our testing dataset."
]
},
{
"cell_type": "markdown",
"id": "248ab402",
"metadata": {},
"source": [
"Now we fit our first neural network with 3 hidden layers."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "5a88de45",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
"
],
"text/plain": [
"MLPRegressor(hidden_layer_sizes=(50, 35, 20), max_iter=10000)"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#The first neural network, 3 hidden layers. \n",
"nn_mod_1 = neural_network.MLPRegressor((50, 35, 20), max_iter = 10000)\n",
"nn_mod_1.fit(forex_data_3_train_X, forex_data_3_train_y.values.ravel())"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "6a162c1b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The mean squred error in the training dataset is 0.6253439030142882\n",
"The mean squred error in the testing dataset is 0.5258991107828873\n"
]
}
],
"source": [
"mse_nn_1_train = metrics.mean_squared_error(forex_data_3_train_y, nn_mod_1.predict(forex_data_3_train_X))\n",
"mse_nn_1_test = metrics.mean_squared_error(forex_data_3_test_y, nn_mod_1.predict(forex_data_3_test_X))\n",
"print(f\"The mean squred error in the training dataset is {mse_nn_1_train}\")\n",
"print(f\"The mean squred error in the testing dataset is {mse_nn_1_test}\")"
]
},
{
"cell_type": "markdown",
"id": "ea976372",
"metadata": {},
"source": [
"As we can see above the mean squared error is roughly the same for both our training and testing datasets, however this model does suffer from input scaling. This is why below we run a scaled version of our neural network."
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "64ec9e7a",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
"
],
"text/plain": [
"Pipeline(steps=[('standardscaler', StandardScaler()),\n",
" ('mlpregressor',\n",
" MLPRegressor(hidden_layer_sizes=(50, 35, 20),\n",
" max_iter=10000))])"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#First neural netwrok but scaled. Sacled is better. \n",
"nn_mod_1_scaled = pipeline.make_pipeline(\n",
" preprocessing.StandardScaler(),\n",
" neural_network.MLPRegressor((50, 35, 20), max_iter = 10000))\n",
"nn_mod_1_scaled.fit(forex_data_3_train_X, forex_data_3_train_y.values.ravel())"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "2f95566e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The mean squred error in the training dataset is 0.054302709057648856\n",
"The mean squred error in the testing dataset is 0.40455307013410235\n"
]
}
],
"source": [
"mse_nn_1_scaled_train = metrics.mean_squared_error(forex_data_3_train_y, nn_mod_1_scaled.predict(forex_data_3_train_X))\n",
"mse_nn_1_scaled_test = metrics.mean_squared_error(forex_data_3_test_y, nn_mod_1_scaled.predict(forex_data_3_test_X))\n",
"print(f\"The mean squred error in the training dataset is {mse_nn_1_scaled_train}\")\n",
"print(f\"The mean squred error in the testing dataset is {mse_nn_1_scaled_test}\")"
]
},
{
"cell_type": "markdown",
"id": "90c52d7a",
"metadata": {},
"source": [
"As we can see above the mean squared error for our training data is much lower than that of our testing data. This could mean that our model is overfitting. "
]
},
{
"cell_type": "markdown",
"id": "5aa76948",
"metadata": {},
"source": [
"Now we run our second neural network with 4 hidden layers."
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "981c2c93",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.
"
],
"text/plain": [
" Year Country Name Forex rate diff_Linear_Predict_1 \\\n",
"15 2015-01-01 Aruba 0.558659 NaN \n",
"16 2016-01-01 Aruba 0.558659 -0.018182 \n",
"59 2015-01-01 Afghanistan 0.016355 NaN \n",
"60 2016-01-01 Afghanistan 0.014735 0.049683 \n",
"103 2015-01-01 Angola 0.008329 NaN \n",
"\n",
" diff_Lasso_Predict_1 diff_NN_1_Predict_1 diff_NN_2_Predict_1 \n",
"15 NaN NaN NaN \n",
"16 0.006891 -0.412653 0.009970 \n",
"59 NaN NaN NaN \n",
"60 -0.000586 0.047096 0.082432 \n",
"103 NaN NaN NaN "
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_5.head()"
]
},
{
"cell_type": "markdown",
"id": "5573f2e9",
"metadata": {},
"source": [
"Now we define a function which tells us whether to buy or short a currency based on whether the model predicts that the exchange rate is going to be higher or lower. "
]
},
{
"cell_type": "code",
"execution_count": 65,
"id": "de558bd2",
"metadata": {},
"outputs": [],
"source": [
"def buy_or_short(x):\n",
" if x >= 0:\n",
" return \"Buy\"\n",
" elif x < 0:\n",
" return \"Short\"\n",
" else:\n",
" return \"NaN\""
]
},
{
"cell_type": "code",
"execution_count": 66,
"id": "574f812e",
"metadata": {},
"outputs": [],
"source": [
"df_10 = df_5.copy()\n",
"df_10[['diff_Linear_Predict_1', 'diff_Lasso_Predict_1', 'diff_NN_1_Predict_1', 'diff_NN_2_Predict_1']] = df_10[['diff_Linear_Predict_1', 'diff_Lasso_Predict_1', 'diff_NN_1_Predict_1', 'diff_NN_2_Predict_1']].applymap(buy_or_short)\n",
"df_11 = df_6.copy()\n",
"df_11[['diff_Linear_Predict_2', 'diff_Lasso_Predict_2', 'diff_NN_1_Predict_2', 'diff_NN_2_Predict_2']] = df_11[['diff_Linear_Predict_2', 'diff_Lasso_Predict_2', 'diff_NN_1_Predict_2', 'diff_NN_2_Predict_2']].applymap(buy_or_short)\n",
"df_12 = df_7.copy()\n",
"df_12[['diff_Linear_Predict_3', 'diff_Lasso_Predict_3', 'diff_NN_1_Predict_3', 'diff_NN_2_Predict_3']] = df_12[['diff_Linear_Predict_3', 'diff_Lasso_Predict_3', 'diff_NN_1_Predict_3', 'diff_NN_2_Predict_3']].applymap(buy_or_short)\n",
"df_13 = df_8.copy()\n",
"df_13[['diff_Linear_Predict_4', 'diff_Lasso_Predict_4', 'diff_NN_1_Predict_4', 'diff_NN_2_Predict_4']] = df_13[['diff_Linear_Predict_4', 'diff_Lasso_Predict_4', 'diff_NN_1_Predict_4', 'diff_NN_2_Predict_4']].applymap(buy_or_short)\n",
"df_14 = df_9.copy()\n",
"df_14[['diff_Linear_Predict_5', 'diff_Lasso_Predict_5', 'diff_NN_1_Predict_5', 'diff_NN_2_Predict_5']] = df_14[['diff_Linear_Predict_5', 'diff_Lasso_Predict_5', 'diff_NN_1_Predict_5', 'diff_NN_2_Predict_5']].applymap(buy_or_short)"
]
},
{
"cell_type": "code",
"execution_count": 67,
"id": "e3760ab6",
"metadata": {},
"outputs": [],
"source": [
"group = df_10.groupby('Country Name')\n",
"df_10['Forex rate lag'] = group['Forex rate'].shift(1)\n",
"df_10['Forex diff'] = df_10['Forex rate'] - df_10['Forex rate lag']"
]
},
{
"cell_type": "code",
"execution_count": 68,
"id": "0e31d341",
"metadata": {},
"outputs": [],
"source": [
"group = df_12.groupby('Country Name')\n",
"df_12['Forex rate lag'] = group['Forex rate'].shift(1)\n",
"df_12['Forex diff'] = df_12['Forex rate'] - df_12['Forex rate lag']"
]
},
{
"cell_type": "code",
"execution_count": 69,
"id": "ad2a74bf",
"metadata": {},
"outputs": [],
"source": [
"group = df_14.groupby('Country Name')\n",
"df_14['Forex rate lag'] = group['Forex rate'].shift(1)\n",
"df_14['Forex diff'] = df_14['Forex rate'] - df_14['Forex rate lag']"
]
},
{
"cell_type": "code",
"execution_count": 70,
"id": "8f3ee4ae",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Year
\n",
"
Country Name
\n",
"
Forex rate
\n",
"
diff_Linear_Predict_3
\n",
"
diff_Lasso_Predict_3
\n",
"
diff_NN_1_Predict_3
\n",
"
diff_NN_2_Predict_3
\n",
"
Forex rate lag
\n",
"
Forex diff
\n",
"
\n",
" \n",
" \n",
"
\n",
"
15
\n",
"
2015-01-01
\n",
"
Aruba
\n",
"
0.558659
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
"
\n",
"
18
\n",
"
2018-01-01
\n",
"
Aruba
\n",
"
0.558659
\n",
"
Buy
\n",
"
Buy
\n",
"
Short
\n",
"
Buy
\n",
"
0.558659
\n",
"
0.000000
\n",
"
\n",
"
\n",
"
59
\n",
"
2015-01-01
\n",
"
Afghanistan
\n",
"
0.016355
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
"
\n",
"
103
\n",
"
2015-01-01
\n",
"
Angola
\n",
"
0.008329
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
"
\n",
"
106
\n",
"
2018-01-01
\n",
"
Angola
\n",
"
0.003955
\n",
"
Buy
\n",
"
Short
\n",
"
Buy
\n",
"
Short
\n",
"
0.008329
\n",
"
-0.004374
\n",
"
\n",
"
\n",
"
125
\n",
"
2015-01-01
\n",
"
Albania
\n",
"
0.007939
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
"
\n",
"
128
\n",
"
2018-01-01
\n",
"
Albania
\n",
"
0.009260
\n",
"
Buy
\n",
"
Buy
\n",
"
Buy
\n",
"
Short
\n",
"
0.007939
\n",
"
0.001321
\n",
"
\n",
"
\n",
"
235
\n",
"
2015-01-01
\n",
"
Armenia
\n",
"
0.002092
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
"
\n",
"
238
\n",
"
2018-01-01
\n",
"
Armenia
\n",
"
0.002070
\n",
"
Buy
\n",
"
Buy
\n",
"
Buy
\n",
"
Buy
\n",
"
0.002092
\n",
"
-0.000022
\n",
"
\n",
"
\n",
"
279
\n",
"
2015-01-01
\n",
"
Antigua and Barbuda
\n",
"
0.370370
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
NaN
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Year Country Name Forex rate diff_Linear_Predict_3 \\\n",
"15 2015-01-01 Aruba 0.558659 NaN \n",
"18 2018-01-01 Aruba 0.558659 Buy \n",
"59 2015-01-01 Afghanistan 0.016355 NaN \n",
"103 2015-01-01 Angola 0.008329 NaN \n",
"106 2018-01-01 Angola 0.003955 Buy \n",
"125 2015-01-01 Albania 0.007939 NaN \n",
"128 2018-01-01 Albania 0.009260 Buy \n",
"235 2015-01-01 Armenia 0.002092 NaN \n",
"238 2018-01-01 Armenia 0.002070 Buy \n",
"279 2015-01-01 Antigua and Barbuda 0.370370 NaN \n",
"\n",
" diff_Lasso_Predict_3 diff_NN_1_Predict_3 diff_NN_2_Predict_3 \\\n",
"15 NaN NaN NaN \n",
"18 Buy Short Buy \n",
"59 NaN NaN NaN \n",
"103 NaN NaN NaN \n",
"106 Short Buy Short \n",
"125 NaN NaN NaN \n",
"128 Buy Buy Short \n",
"235 NaN NaN NaN \n",
"238 Buy Buy Buy \n",
"279 NaN NaN NaN \n",
"\n",
" Forex rate lag Forex diff \n",
"15 NaN NaN \n",
"18 0.558659 0.000000 \n",
"59 NaN NaN \n",
"103 NaN NaN \n",
"106 0.008329 -0.004374 \n",
"125 NaN NaN \n",
"128 0.007939 0.001321 \n",
"235 NaN NaN \n",
"238 0.002092 -0.000022 \n",
"279 NaN NaN "
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_12.head(10)"
]
},
{
"cell_type": "markdown",
"id": "aeb22c77",
"metadata": {},
"source": [
"Now we can calculate the average rate of return for each model for a 1 year, 3 years and 5 years holding period."
]
},
{
"cell_type": "code",
"execution_count": 71,
"id": "13396ba6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the linear regression model and 1 year holding period is -6.682001712313225\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/1716736160.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_linear_1_buy['Return_Rate'] = df_linear_1_buy['Forex diff']/df_linear_1_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for Linear reg and holding the currency for 1 year when model says to buy\n",
"\n",
"df_linear_1 = df_10[['Year', 'Country Name', 'diff_Linear_Predict_1', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_linear_1_buy = df_linear_1[df_linear_1['diff_Linear_Predict_1'].isin(['Buy'])]\n",
"\n",
"df_linear_1_buy['Return_Rate'] = df_linear_1_buy['Forex diff']/df_linear_1_buy['Forex rate lag'] * 100\n",
"\n",
"linear_1_return = df_linear_1_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the linear regression model and 1 year holding period is {linear_1_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 72,
"id": "2f05d6be",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the lasso regression model and 1 year holding period is -2.2038092077826814\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/339817703.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_lasso_1_buy['Return_Rate'] = df_lasso_1_buy['Forex diff']/df_lasso_1_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for Lasso reg and holding the currency for 1 year when model says to buy\n",
"\n",
"df_lasso_1 = df_10[['Year', 'Country Name', 'diff_Lasso_Predict_1', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_lasso_1_buy = df_lasso_1[df_lasso_1['diff_Lasso_Predict_1'].isin(['Buy'])]\n",
"\n",
"df_lasso_1_buy['Return_Rate'] = df_lasso_1_buy['Forex diff']/df_lasso_1_buy['Forex rate lag'] * 100\n",
"\n",
"lasso_1_return = df_lasso_1_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the lasso regression model and 1 year holding period is {lasso_1_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 73,
"id": "0e406bac",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the first neural network and 1 year holding period is -4.228952725618851\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/2819496941.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_nn_1_1_buy['Return_Rate'] = df_nn_1_1_buy['Forex diff']/df_nn_1_1_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for 1st NN and holding the currency for 1 year when model says to buy\n",
"\n",
"df_nn_1_1 = df_10[['Year', 'Country Name', 'diff_NN_1_Predict_1', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_nn_1_1_buy = df_nn_1_1[df_nn_1_1['diff_NN_1_Predict_1'].isin(['Buy'])]\n",
"\n",
"df_nn_1_1_buy['Return_Rate'] = df_nn_1_1_buy['Forex diff']/df_nn_1_1_buy['Forex rate lag'] * 100\n",
"\n",
"nn_1_1_return = df_nn_1_1_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the first neural network and 1 year holding period is {nn_1_1_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 74,
"id": "98f69e1c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the second neural network and 1 year holding period is -5.661574799802205\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/2864552994.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_nn_2_1_buy['Return_Rate'] = df_nn_2_1_buy['Forex diff']/df_nn_2_1_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for 2nd NN and holding the currency for 1 year when model says to buy\n",
"\n",
"df_nn_2_1 = df_10[['Year', 'Country Name', 'diff_NN_2_Predict_1', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_nn_2_1_buy = df_nn_2_1[df_nn_2_1['diff_NN_2_Predict_1'].isin(['Buy'])]\n",
"\n",
"df_nn_2_1_buy['Return_Rate'] = df_nn_2_1_buy['Forex diff']/df_nn_2_1_buy['Forex rate lag'] * 100\n",
"\n",
"nn_2_1_return = df_nn_2_1_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the second neural network and 1 year holding period is {nn_2_1_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 75,
"id": "3de89c30",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the linear regression model and 3 years holding period is -3.979679534791286\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/3889504242.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_linear_3_buy['Return_Rate'] = df_linear_3_buy['Forex diff']/df_linear_3_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for Linear reg and holding the currency for 3 years when model says to buy\n",
"\n",
"df_linear_3 = df_12[['Year', 'Country Name', 'diff_Linear_Predict_3', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_linear_3_buy = df_linear_3[df_linear_3['diff_Linear_Predict_3'].isin(['Buy'])]\n",
"\n",
"df_linear_3_buy['Return_Rate'] = df_linear_3_buy['Forex diff']/df_linear_3_buy['Forex rate lag'] * 100\n",
"\n",
"linear_3_return = df_linear_3_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the linear regression model and 3 years holding period is {linear_3_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "9ae7e1ef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the lasso regression model and 3 years holding period is -1.8921836297212749\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/1695144407.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_lasso_3_buy['Return_Rate'] = df_lasso_3_buy['Forex diff']/df_lasso_3_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for Lasso reg and holding the currency for 3 years when model says to buy\n",
"\n",
"df_lasso_3 = df_12[['Year', 'Country Name', 'diff_Lasso_Predict_3', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_lasso_3_buy = df_lasso_3[df_lasso_3['diff_Lasso_Predict_3'].isin(['Buy'])]\n",
"\n",
"df_lasso_3_buy['Return_Rate'] = df_lasso_3_buy['Forex diff']/df_lasso_3_buy['Forex rate lag'] * 100\n",
"\n",
"lasso_3_return = df_lasso_3_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the lasso regression model and 3 years holding period is {lasso_3_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "ad2a654a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the first neural network and 3 years holding period is -5.250983001440971\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/2278624619.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_nn_1_3_buy['Return_Rate'] = df_nn_1_3_buy['Forex diff']/df_nn_1_3_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for 1st NN and holding the currency for 3 years when model says to buy\n",
"\n",
"df_nn_1_3 = df_12[['Year', 'Country Name', 'diff_NN_1_Predict_3', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_nn_1_3_buy = df_nn_1_3[df_nn_1_3['diff_NN_1_Predict_3'].isin(['Buy'])]\n",
"\n",
"df_nn_1_3_buy['Return_Rate'] = df_nn_1_3_buy['Forex diff']/df_nn_1_3_buy['Forex rate lag'] * 100\n",
"\n",
"nn_1_3_return = df_nn_1_3_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the first neural network and 3 years holding period is {nn_1_3_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "f497e890",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the second neural networl and 3 years holding period is -5.548735188527842\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/2694513607.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_nn_2_3_buy['Return_Rate'] = df_nn_2_3_buy['Forex diff']/df_nn_2_3_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for 2nd NN and holding the currency for 3 years when model says to buy\n",
"\n",
"df_nn_2_3 = df_12[['Year', 'Country Name', 'diff_NN_2_Predict_3', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_nn_2_3_buy = df_nn_2_3[df_nn_2_3['diff_NN_2_Predict_3'].isin(['Buy'])]\n",
"\n",
"df_nn_2_3_buy['Return_Rate'] = df_nn_2_3_buy['Forex diff']/df_nn_2_3_buy['Forex rate lag'] * 100\n",
"\n",
"nn_2_3_return = df_nn_2_3_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the second neural networl and 3 years holding period is {nn_2_3_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "d55e5af0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the linear regression model and 5 years holding period is -17.218054184651734\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/1940798184.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_linear_5_buy['Return_Rate'] = df_linear_5_buy['Forex diff']/df_linear_5_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for Linear reg and holding the currency for 5 years when model says to buy\n",
"\n",
"df_linear_5 = df_14[['Year', 'Country Name', 'diff_Linear_Predict_5', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_linear_5_buy = df_linear_5[df_linear_5['diff_Linear_Predict_5'].isin(['Buy'])]\n",
"\n",
"df_linear_5_buy['Return_Rate'] = df_linear_5_buy['Forex diff']/df_linear_5_buy['Forex rate lag'] * 100\n",
"\n",
"linear_5_return = df_linear_5_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the linear regression model and 5 years holding period is {linear_5_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 80,
"id": "319104e6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the lasso regression model and 5 years holding period is -5.925652587401351\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/885340572.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_lasso_5_buy['Return_Rate'] = df_lasso_5_buy['Forex diff']/df_lasso_5_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for Lasso reg and holding the currency for 5 years when model says to buy\n",
"\n",
"df_lasso_5 = df_14[['Year', 'Country Name', 'diff_Lasso_Predict_5', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_lasso_5_buy = df_lasso_5[df_lasso_5['diff_Lasso_Predict_5'].isin(['Buy'])]\n",
"\n",
"df_lasso_5_buy['Return_Rate'] = df_lasso_5_buy['Forex diff']/df_lasso_5_buy['Forex rate lag'] * 100\n",
"\n",
"lasso_5_return = df_lasso_5_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the lasso regression model and 5 years holding period is {lasso_5_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 81,
"id": "b50085f0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the first neural network and 5 years holding period is -9.570827108863499\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/555516999.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_nn_1_5_buy['Return_Rate'] = df_nn_1_5_buy['Forex diff']/df_nn_1_5_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for 1st NN and holding the currency for 5 years when model says to buy\n",
"\n",
"df_nn_1_5 = df_14[['Year', 'Country Name', 'diff_NN_1_Predict_5', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_nn_1_5_buy = df_nn_1_5[df_nn_1_5['diff_NN_1_Predict_5'].isin(['Buy'])]\n",
"\n",
"df_nn_1_5_buy['Return_Rate'] = df_nn_1_5_buy['Forex diff']/df_nn_1_5_buy['Forex rate lag'] * 100\n",
"\n",
"nn_1_5_return = df_nn_1_5_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the first neural network and 5 years holding period is {nn_1_5_return}\")"
]
},
{
"cell_type": "code",
"execution_count": 82,
"id": "f9edea76",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The average return for the second neural network and 5 years holding period is -11.086691753890205\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/5b/jc4253hj4rl3y0jlxy2y3trm0000gn/T/ipykernel_11083/1121885410.py:7: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_nn_2_5_buy['Return_Rate'] = df_nn_2_5_buy['Forex diff']/df_nn_2_5_buy['Forex rate lag'] * 100\n"
]
}
],
"source": [
"#Average rate of return for 2nd NN and holding the currency for 5 years when model says to buy\n",
"\n",
"df_nn_2_5 = df_14[['Year', 'Country Name', 'diff_NN_2_Predict_5', 'Forex rate','Forex rate lag', 'Forex diff']]\n",
"\n",
"df_nn_2_5_buy = df_nn_2_5[df_nn_2_5['diff_NN_2_Predict_5'].isin(['Buy'])]\n",
"\n",
"df_nn_2_5_buy['Return_Rate'] = df_nn_2_5_buy['Forex diff']/df_nn_2_5_buy['Forex rate lag'] * 100\n",
"\n",
"nn_2_5_return = df_nn_2_5_buy['Return_Rate'].mean()\n",
"\n",
"print(f\"The average return for the second neural network and 5 years holding period is {nn_2_5_return}\")"
]
},
{
"cell_type": "markdown",
"id": "daedefed",
"metadata": {},
"source": [
"Below we can see the plots of these returns."
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "77f522fd-0da3-42d5-91ab-ffc90a6df256",
"metadata": {},
"outputs": [],
"source": [
"#Preparing Returns plot\n",
"df_return_rates = pd.DataFrame({\n",
" 'Model':[\"Linear Model 1 Years Hold Return\", \"Lasso Model 1 Years Hold Return\", \"1st Neural Network 1 Years Hold Return\", \"2nd Neural Network 1 Years Hold Return\",\n",
" \"Linear Model 3 Years Hold Return\", \"Lasso Model 3 Years Hold Return\", \"1st Neural Network 3 Years Hold Return\", \"2nd Neural Network 3 Years Hold Return\",\n",
" \"Linear Model 5 Years Hold Return\", \"Lasso Model 5 Years Hold Return\", \"1st Neural Network 5 Years Hold Return\", \"2nd Neural Network 5 Years Hold Return\"],\n",
" 'Return Rate': [linear_1_return, lasso_1_return, nn_1_1_return, nn_1_3_return,\n",
" linear_3_return, lasso_3_return, nn_1_3_return, nn_2_3_return,\n",
" linear_5_return, lasso_5_return, nn_1_5_return, nn_2_5_return]\n",
"})"
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "6a8ed12e-8fbb-4d76-9601-a40fddb7b275",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Model
\n",
"
Return Rate
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
Linear Model 1 Years Hold Return
\n",
"
-6.682002
\n",
"
\n",
"
\n",
"
1
\n",
"
Lasso Model 1 Years Hold Return
\n",
"
-2.203809
\n",
"
\n",
"
\n",
"
2
\n",
"
1st Neural Network 1 Years Hold Return
\n",
"
-4.228953
\n",
"
\n",
"
\n",
"
3
\n",
"
2nd Neural Network 1 Years Hold Return
\n",
"
-5.250983
\n",
"
\n",
"
\n",
"
4
\n",
"
Linear Model 3 Years Hold Return
\n",
"
-3.979680
\n",
"
\n",
"
\n",
"
5
\n",
"
Lasso Model 3 Years Hold Return
\n",
"
-1.892184
\n",
"
\n",
"
\n",
"
6
\n",
"
1st Neural Network 3 Years Hold Return
\n",
"
-5.250983
\n",
"
\n",
"
\n",
"
7
\n",
"
2nd Neural Network 3 Years Hold Return
\n",
"
-5.548735
\n",
"
\n",
"
\n",
"
8
\n",
"
Linear Model 5 Years Hold Return
\n",
"
-17.218054
\n",
"
\n",
"
\n",
"
9
\n",
"
Lasso Model 5 Years Hold Return
\n",
"
-5.925653
\n",
"
\n",
"
\n",
"
10
\n",
"
1st Neural Network 5 Years Hold Return
\n",
"
-9.570827
\n",
"
\n",
"
\n",
"
11
\n",
"
2nd Neural Network 5 Years Hold Return
\n",
"
-11.086692
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Model Return Rate\n",
"0 Linear Model 1 Years Hold Return -6.682002\n",
"1 Lasso Model 1 Years Hold Return -2.203809\n",
"2 1st Neural Network 1 Years Hold Return -4.228953\n",
"3 2nd Neural Network 1 Years Hold Return -5.250983\n",
"4 Linear Model 3 Years Hold Return -3.979680\n",
"5 Lasso Model 3 Years Hold Return -1.892184\n",
"6 1st Neural Network 3 Years Hold Return -5.250983\n",
"7 2nd Neural Network 3 Years Hold Return -5.548735\n",
"8 Linear Model 5 Years Hold Return -17.218054\n",
"9 Lasso Model 5 Years Hold Return -5.925653\n",
"10 1st Neural Network 5 Years Hold Return -9.570827\n",
"11 2nd Neural Network 5 Years Hold Return -11.086692"
]
},
"execution_count": 84,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_return_rates"
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "83aff6fc-348e-4e99-a143-610785391671",
"metadata": {},
"outputs": [],
"source": [
"df_return_rates_1 = df_return_rates.loc[df_return_rates['Model'].isin([\"Linear Model 1 Years Hold Return\", \"Lasso Model 1 Years Hold Return\", \"1st Neural Network 1 Years Hold Return\", \"2nd Neural Network 1 Years Hold Return\"])]\n",
"df_return_rates_3 = df_return_rates.loc[df_return_rates['Model'].isin([\"Linear Model 3 Years Hold Return\", \"Lasso Model 3 Years Hold Return\", \"1st Neural Network 3 Years Hold Return\", \"2nd Neural Network 3 Years Hold Return\"])]\n",
"df_return_rates_5 = df_return_rates.loc[df_return_rates['Model'].isin([ \"Linear Model 5 Years Hold Return\", \"Lasso Model 5 Years Hold Return\", \"1st Neural Network 5 Years Hold Return\", \"2nd Neural Network 5 Years Hold Return\"])]\n"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "49e31cb2-e32b-430d-a119-43a621d4ee11",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/uddhav/opt/anaconda3/envs/Py_2/lib/python3.8/site-packages/altair/utils/core.py:283: FutureWarning: iteritems is deprecated and will be removed in a future version. Use .items instead.\n",
" for col_name, dtype in df.dtypes.iteritems():\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
""
],
"text/plain": [
"alt.VConcatChart(...)"
]
},
"execution_count": 86,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#A plot for the rate of returns for the different ml models\n",
"# Create a chart object and specify the data source\n",
"chart_1 = alt.Chart(df_return_rates_1)\n",
"chart_3 = alt.Chart(df_return_rates_3)\n",
"chart_5 = alt.Chart(df_return_rates_5)\n",
"\n",
"# Add a bar chart layer to display the average return rates for each model\n",
"bars_1 = chart_1.mark_bar().encode(\n",
" y='Model:N',\n",
" x='Return Rate:Q',\n",
" tooltip=[alt.Tooltip('Model:N', title=\"Model\"),\n",
" alt.Tooltip('Return Rate:Q', title=\"Return Rate\")] \n",
").properties(width=1000, height = 200, title = \"1 Year Holding Period Models\")\n",
"\n",
"bars_3 = chart_3.mark_bar().encode(\n",
" y='Model:N',\n",
" x='Return Rate:Q',\n",
" tooltip=[alt.Tooltip('Model:N', title=\"Model\"),\n",
" alt.Tooltip('Return Rate:Q', title=\"Return Rate\")] \n",
").properties(width=1000, height = 200, title = \"3 Year Holding Period Models\")\n",
"\n",
"bars_5 = chart_5.mark_bar().encode(\n",
" y='Model:N',\n",
" x='Return Rate:Q',\n",
" tooltip=[alt.Tooltip('Model:N', title=\"Model\"),\n",
" alt.Tooltip('Return Rate:Q', title=\"Return Rate\")]\n",
").properties(width=1000, height = 200, title = \"5 Year Holding Period Models\")\n",
"\n",
"# # Add a text layer to display the return rates as labels on the bars\n",
"# labels = bars.mark_text(\n",
"# align='right',\n",
"# baseline='bottom',\n",
"# dy=-2\n",
"# ).encode(\n",
"# text='Return Rate'\n",
"# )\n",
"\n",
"# Combine the layers into a single chart and display it\n",
"alt.vconcat(bars_1, bars_3, bars_5).properties(title=\"Model Return Comparison\")"
]
},
{
"cell_type": "markdown",
"id": "41b5e258",
"metadata": {},
"source": [
"As we can see across all holding periods, the lasso regression performs the best. The mediocre performance of the neural networks could be because they are overfitting and therefore performing worse out of sample, we can see this in the mean squared error results too.\n",
"\n",
"We can see from the plots above that the returns are negative for all our machine learning models, while they were positive for the PPP model. "
]
},
{
"cell_type": "markdown",
"id": "21160438",
"metadata": {},
"source": [
"It is hard to predict the exchange rate even in the long run. This project has shown that theory does perform better than machine learning (based on our measuring method) in predicting exchange rates."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}