{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "LaTeX macros (hidden cell)\n", "$\n", "\\newcommand{\\Q}{\\mathcal{Q}}\n", "\\newcommand{\\ECov}{\\boldsymbol{\\Sigma}}\n", "\\newcommand{\\EMean}{\\boldsymbol{\\mu}}\n", "\\newcommand{\\EAlpha}{\\boldsymbol{\\alpha}}\n", "\\newcommand{\\EBeta}{\\boldsymbol{\\beta}}\n", "$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Imports and configuration" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "scrolled": false }, "outputs": [], "source": [ "import sys\n", "import os\n", "import re\n", "import datetime as dt\n", "\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "from matplotlib.colors import LinearSegmentedColormap\n", "\n", "from mosek.fusion import *\n", "\n", "from notebook.services.config import ConfigManager\n", "\n", "from portfolio_tools import data_download, DataReader, compute_inputs" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.6.9 (default, Jan 26 2021, 15:33:00) \n", "[GCC 8.4.0]\n", "matplotlib: 3.3.4\n" ] } ], "source": [ "# Version checks\n", "print(sys.version)\n", "print('matplotlib: {}'.format(matplotlib.__version__))\n", "\n", "# Jupyter configuration\n", "c = ConfigManager()\n", "c.update('notebook', {\"CodeCell\": {\"cm_config\": {\"autoCloseBrackets\": False}}}) \n", "\n", "# Numpy options\n", "np.set_printoptions(precision=5, linewidth=120, suppress=True)\n", "\n", "# Pandas options\n", "pd.set_option('display.max_rows', None)\n", "\n", "# Matplotlib options\n", "plt.rcParams['figure.figsize'] = [12, 8]\n", "plt.rcParams['figure.dpi'] = 200" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Prepare input data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we load the raw data that will be used to compute the optimization input variables, the vector $\\EMean$ of expected returns and the covariance matrix $\\ECov$. The data consists of daily stock prices of $8$ stocks from the US market. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download data" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Data downloading:\n", "# If the user has an API key for alphavantage.co, then this code part will download the data. \n", "# The code can be modified to download from other sources. To be able to run the examples, \n", "# and reproduce results in the cookbook, the files have to have the following format and content:\n", "# - File name pattern: \"daily_adjusted_[TICKER].csv\", where TICKER is the symbol of a stock. \n", "# - The file contains at least columns \"timestamp\", \"adjusted_close\", and \"volume\".\n", "# - The data is daily price/volume, covering at least the period from 2016-03-18 until 2021-03-18, \n", "# - Files are for the stocks PM, LMT, MCD, MMM, AAPL, MSFT, TXN, CSCO.\n", "list_stocks = [\"PM\", \"LMT\", \"MCD\", \"MMM\", \"AAPL\", \"MSFT\", \"TXN\", \"CSCO\"]\n", "list_factors = []\n", "alphaToken = None\n", " \n", "list_tickers = list_stocks + list_factors\n", "if alphaToken is not None:\n", " data_download(list_tickers, alphaToken) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Read data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load the daily stock price data from the downloaded CSV files. The data is adjusted for splits and dividends. Then a selected time period is taken from the data." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "investment_start = \"2016-03-18\"\n", "investment_end = \"2021-03-18\"" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found data files: \n", "stock_data/daily_adjusted_AAPL.csv\n", "stock_data/daily_adjusted_PM.csv\n", "stock_data/daily_adjusted_CSCO.csv\n", "stock_data/daily_adjusted_TXN.csv\n", "stock_data/daily_adjusted_MMM.csv\n", "stock_data/daily_adjusted_IWM.csv\n", "stock_data/daily_adjusted_MCD.csv\n", "stock_data/daily_adjusted_SPY.csv\n", "stock_data/daily_adjusted_MSFT.csv\n", "stock_data/daily_adjusted_LMT.csv\n", "\n", "Using data files: \n", "stock_data/daily_adjusted_PM.csv\n", "stock_data/daily_adjusted_LMT.csv\n", "stock_data/daily_adjusted_MCD.csv\n", "stock_data/daily_adjusted_MMM.csv\n", "stock_data/daily_adjusted_AAPL.csv\n", "stock_data/daily_adjusted_MSFT.csv\n", "stock_data/daily_adjusted_TXN.csv\n", "stock_data/daily_adjusted_CSCO.csv\n", "\n" ] } ], "source": [ "# The files are in \"stock_data\" folder, named as \"daily_adjusted_[TICKER].csv\"\n", "dr = DataReader(folder_path=\"stock_data\", symbol_list=list_tickers)\n", "dr.read_data()\n", "df_prices, _ = dr.get_period(start_date=investment_start, end_date=investment_end)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Run the optimization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define the optimization model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we implement the optimization model in Fusion API. We create it inside a function so we can call it later." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def EfficientFrontier(N, m, G, deltas):\n", "\n", " with Model(\"Case study\") as M:\n", " # Settings\n", " #M.setLogHandler(sys.stdout)\n", " \n", " # Variables \n", " # The variable x is the fraction of holdings in each security. \n", " # It is restricted to be positive, which imposes the constraint of no short-selling. \n", " x = M.variable(\"x\", N, Domain.greaterThan(0.0))\n", " \n", " # The variable s models the portfolio variance term in the objective.\n", " s = M.variable(\"s\", 1, Domain.unbounded())\n", " \n", " # Budget constraint\n", " M.constraint('budget', Expr.sum(x), Domain.equalsTo(1.0))\n", " \n", " # Objective (quadratic utility version)\n", " delta = M.parameter()\n", " M.objective('obj', ObjectiveSense.Maximize, Expr.sub(Expr.dot(m, x), Expr.mul(delta, s)))\n", "\n", " # Conic constraint for the portfolio variance\n", " M.constraint('risk', Expr.vstack(s, Expr.mul(G.transpose(), x)), Domain.inQCone())\n", " \n", " # Create DataFrame to store the results. Last security name (the SPY) is removed.\n", " columns = [\"delta\", \"obj\", \"return\", \"risk\"] + df_prices.columns.tolist()\n", " df_result = pd.DataFrame(columns=columns)\n", " for d in deltas:\n", " # Update parameter\n", " delta.setValue(d);\n", " \n", " # Solve optimization\n", " M.solve()\n", " \n", " # Check if the solution is an optimal point\n", " solsta = M.getPrimalSolutionStatus()\n", " if (solsta != SolutionStatus.Optimal):\n", " # See https://docs.mosek.com/latest/pythonfusion/accessing-solution.html about handling solution statuses.\n", " raise Exception(\"Unexpected solution status!\") \n", " \n", " # Save results\n", " portfolio_return = m @ x.level()\n", " portfolio_risk = s.level()[0]\n", " row = pd.Series([d, M.primalObjValue(), portfolio_return, portfolio_risk] + list(x.level()), index=columns)\n", " df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)\n", "\n", " return df_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compute optimization input variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we use the loaded daily price data to compute the corresponding yearly mean return and covariance matrix." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Number of securities\n", "N = df_prices.shape[1] \n", "\n", "# Get optimization parameters\n", "m_orig, S_orig = compute_inputs(df_prices, shrinkage=False)\n", "m_shrunk, S_shrunk = compute_inputs(df_prices, shrinkage=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we compute the matrix $G$ such that $\\ECov=GG^\\mathsf{T}$, this is the input of the conic form of the optimization problem. Here we use Cholesky factorization." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "G_orig = np.linalg.cholesky(S_orig) \n", "G_shrunk = np.linalg.cholesky(S_shrunk)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Call the optimizer function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We run the optimization for a range of risk aversion parameter values: $\\delta = 10^{-1},\\dots,10^{1.5}$. We compute the efficient frontier this way both with and without using shrinkage estimation. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Compute efficient frontier with and without shrinkage\n", "deltas = np.logspace(start=-1, stop=1.5, num=20)[::-1]\n", "df_result_orig = EfficientFrontier(N, m_orig, G_orig, deltas)\n", "df_result_shrunk = EfficientFrontier(N, m_shrunk, G_shrunk, deltas)\n", "\n", "# Set small negatives to zero to make plotting work\n", "mask = df_result_orig < 0\n", "mask.iloc[:, :-8] = False\n", "df_result_orig[mask] = 0\n", "\n", "# Set small negatives to zero to make plotting work\n", "mask = df_result_shrunk < 0\n", "mask.iloc[:, :-8] = False\n", "df_result_shrunk[mask] = 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the results without shrinkage." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", " | delta | \n", "obj | \n", "return | \n", "risk | \n", "PM | \n", "LMT | \n", "MCD | \n", "MMM | \n", "AAPL | \n", "MSFT | \n", "TXN | \n", "CSCO | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "31.622777 | \n", "-6.271623 | \n", "0.175479 | \n", "0.203875 | \n", "1.028265e-01 | \n", "1.130694e-01 | \n", "3.006178e-01 | \n", "1.649833e-01 | \n", "1.859269e-09 | \n", "7.935826e-02 | \n", "5.704450e-02 | \n", "1.821003e-01 | \n", "
1 | \n", "23.357215 | \n", "-4.586052 | \n", "0.178755 | \n", "0.203997 | \n", "9.909669e-02 | \n", "1.126088e-01 | \n", "3.000666e-01 | \n", "1.591047e-01 | \n", "4.990948e-09 | \n", "8.752508e-02 | \n", "6.125147e-02 | \n", "1.803467e-01 | \n", "
2 | \n", "17.252105 | \n", "-3.340046 | \n", "0.183200 | \n", "0.204221 | \n", "9.403792e-02 | \n", "1.119876e-01 | \n", "2.993067e-01 | \n", "1.511269e-01 | \n", "1.573623e-08 | \n", "9.860880e-02 | \n", "6.696294e-02 | \n", "1.779691e-01 | \n", "
3 | \n", "12.742750 | \n", "-2.418350 | \n", "0.189246 | \n", "0.204634 | \n", "8.715666e-02 | \n", "1.111420e-01 | \n", "2.982765e-01 | \n", "1.402759e-01 | \n", "1.023495e-08 | \n", "1.136858e-01 | \n", "7.473031e-02 | \n", "1.747329e-01 | \n", "
4 | \n", "9.412050 | \n", "-1.735700 | \n", "0.197502 | \n", "0.205396 | \n", "7.775956e-02 | \n", "1.099847e-01 | \n", "2.968750e-01 | \n", "1.254593e-01 | \n", "7.721570e-09 | \n", "1.342709e-01 | \n", "8.533623e-02 | \n", "1.703143e-01 | \n", "
5 | \n", "6.951928 | \n", "-1.228921 | \n", "0.208856 | \n", "0.206817 | \n", "6.483614e-02 | \n", "1.083978e-01 | \n", "2.949460e-01 | \n", "1.050807e-01 | \n", "2.072306e-08 | \n", "1.625818e-01 | \n", "9.991932e-02 | \n", "1.642383e-01 | \n", "
6 | \n", "5.134833 | \n", "-0.851059 | \n", "0.224685 | \n", "0.209499 | \n", "4.681659e-02 | \n", "1.061779e-01 | \n", "2.922631e-01 | \n", "7.667409e-02 | \n", "1.105620e-08 | \n", "2.020499e-01 | \n", "1.202504e-01 | \n", "1.557680e-01 | \n", "
7 | \n", "3.792690 | \n", "-0.566941 | \n", "0.247845 | \n", "0.214831 | \n", "1.987726e-02 | \n", "1.033740e-01 | \n", "2.885322e-01 | \n", "3.622298e-02 | \n", "4.707309e-03 | \n", "2.559452e-01 | \n", "1.478726e-01 | \n", "1.434684e-01 | \n", "
8 | \n", "2.801357 | \n", "-0.349680 | \n", "0.277674 | \n", "0.223947 | \n", "3.664982e-09 | \n", "9.021648e-02 | \n", "2.674048e-01 | \n", "9.155908e-09 | \n", "2.610527e-02 | \n", "3.244756e-01 | \n", "1.773813e-01 | \n", "1.144165e-01 | \n", "
9 | \n", "2.069138 | \n", "-0.181326 | \n", "0.312261 | \n", "0.238547 | \n", "2.982481e-09 | \n", "5.163382e-02 | \n", "2.062153e-01 | \n", "2.615126e-10 | \n", "5.362076e-02 | \n", "4.289136e-01 | \n", "2.091390e-01 | \n", "5.047755e-02 | \n", "
10 | \n", "1.528307 | \n", "-0.045407 | \n", "0.361694 | \n", "0.266374 | \n", "4.009003e-09 | \n", "2.089985e-07 | \n", "8.067627e-02 | \n", "3.986740e-09 | \n", "9.755740e-02 | \n", "5.804044e-01 | \n", "2.413616e-01 | \n", "5.340562e-08 | \n", "
11 | \n", "1.128838 | \n", "0.065444 | \n", "0.384750 | \n", "0.282863 | \n", "5.519963e-10 | \n", "3.570274e-09 | \n", "4.261622e-09 | \n", "0.000000e+00 | \n", "1.486028e-01 | \n", "6.614851e-01 | \n", "1.899121e-01 | \n", "1.702059e-09 | \n", "
12 | \n", "0.833782 | \n", "0.150242 | \n", "0.395189 | \n", "0.293778 | \n", "1.080010e-09 | \n", "1.517056e-08 | \n", "1.099099e-08 | \n", "0.000000e+00 | \n", "2.138700e-01 | \n", "7.097808e-01 | \n", "7.634917e-02 | \n", "5.430428e-10 | \n", "
13 | \n", "0.615848 | \n", "0.215749 | \n", "0.403111 | \n", "0.304233 | \n", "7.329334e-10 | \n", "1.595340e-08 | \n", "1.294185e-09 | \n", "0.000000e+00 | \n", "2.828077e-01 | \n", "7.171919e-01 | \n", "3.424213e-07 | \n", "0.000000e+00 | \n", "
14 | \n", "0.454878 | \n", "0.265016 | \n", "0.405375 | \n", "0.308566 | \n", "0.000000e+00 | \n", "1.240693e-09 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "3.456455e-01 | \n", "6.543544e-01 | \n", "3.395459e-08 | \n", "0.000000e+00 | \n", "
15 | \n", "0.335982 | \n", "0.302123 | \n", "0.408638 | \n", "0.317024 | \n", "0.000000e+00 | \n", "7.753166e-09 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "4.361732e-01 | \n", "5.638266e-01 | \n", "1.814609e-07 | \n", "0.000000e+00 | \n", "
16 | \n", "0.248163 | \n", "0.330596 | \n", "0.413612 | \n", "0.334520 | \n", "0.000000e+00 | \n", "2.820040e-09 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "5.741893e-01 | \n", "4.258107e-01 | \n", "3.321413e-08 | \n", "0.000000e+00 | \n", "
17 | \n", "0.183298 | \n", "0.353357 | \n", "0.422248 | \n", "0.375837 | \n", "8.334403e-13 | \n", "9.239579e-11 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "8.138078e-01 | \n", "1.861922e-01 | \n", "8.563839e-10 | \n", "0.000000e+00 | \n", "
18 | \n", "0.135388 | \n", "0.372738 | \n", "0.428958 | \n", "0.415251 | \n", "0.000000e+00 | \n", "1.845514e-09 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "9.999997e-01 | \n", "2.564263e-07 | \n", "1.665269e-08 | \n", "0.000000e+00 | \n", "
19 | \n", "0.100000 | \n", "0.387433 | \n", "0.428958 | \n", "0.415251 | \n", "0.000000e+00 | \n", "6.340395e-09 | \n", "0.000000e+00 | \n", "0.000000e+00 | \n", "9.999991e-01 | \n", "8.996339e-07 | \n", "5.941383e-08 | \n", "0.000000e+00 | \n", "