{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "LaTeX macros (hidden cell)\n", "$\n", "\\newcommand{\\Q}{\\mathcal{Q}}\n", "\\newcommand{\\ECov}{\\boldsymbol{\\Sigma}}\n", "\\newcommand{\\EMean}{\\boldsymbol{\\mu}}\n", "\\newcommand{\\EAlpha}{\\boldsymbol{\\alpha}}\n", "\\newcommand{\\EBeta}{\\boldsymbol{\\beta}}\n", "$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Imports and configuration" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "scrolled": false }, "outputs": [], "source": [ "import sys\n", "import os\n", "import re\n", "import datetime as dt\n", "\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "from matplotlib.colors import LinearSegmentedColormap\n", "\n", "from mosek.fusion import *\n", "\n", "from notebook.services.config import ConfigManager\n", "\n", "from portfolio_tools import data_download, DataReader, compute_inputs" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.6.9 (default, Jan 26 2021, 15:33:00) \n", "[GCC 8.4.0]\n", "matplotlib: 3.3.4\n" ] } ], "source": [ "# Version checks\n", "print(sys.version)\n", "print('matplotlib: {}'.format(matplotlib.__version__))\n", "\n", "# Jupyter configuration\n", "c = ConfigManager()\n", "c.update('notebook', {\"CodeCell\": {\"cm_config\": {\"autoCloseBrackets\": False}}}) \n", "\n", "# Numpy options\n", "np.set_printoptions(precision=5, linewidth=120, suppress=True)\n", "\n", "# Pandas options\n", "pd.set_option('display.max_rows', None)\n", "\n", "# Matplotlib options\n", "plt.rcParams['figure.figsize'] = [12, 8]\n", "plt.rcParams['figure.dpi'] = 200" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Prepare input data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we load the raw data that will be used to compute the optimization input 
variables: the vector $\\EMean$ of expected returns and the covariance matrix $\\ECov$. The data consists of daily prices of $8$ stocks from the US market." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download data" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [
"# Data downloading:\n",
"# If you have an API key for alphavantage.co, this cell downloads the data.\n",
"# The code can be modified to download from other sources. To run the examples\n",
"# and reproduce the results in the cookbook, the files must have the following format and content:\n",
"# - File name pattern: \"daily_adjusted_[TICKER].csv\", where TICKER is the symbol of a stock.\n",
"# - Each file contains at least the columns \"timestamp\", \"adjusted_close\", and \"volume\".\n",
"# - The data is daily price/volume, covering at least the period from 2016-03-18 until 2021-03-18.\n",
"# - Files exist for the stocks PM, LMT, MCD, MMM, AAPL, MSFT, TXN, CSCO.\n",
"list_stocks = [\"PM\", \"LMT\", \"MCD\", \"MMM\", \"AAPL\", \"MSFT\", \"TXN\", \"CSCO\"]\n",
"list_factors = []\n",
"alphaToken = None\n",
"\n",
"list_tickers = list_stocks + list_factors\n",
"if alphaToken is not None:\n",
"    data_download(list_tickers, alphaToken)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Read data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load the daily stock price data from the downloaded CSV files. The prices are adjusted for splits and dividends. We then select the investment period from the data."
] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "investment_start = \"2016-03-18\"\n", "investment_end = \"2021-03-18\"" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found data files: \n", "stock_data/daily_adjusted_AAPL.csv\n", "stock_data/daily_adjusted_PM.csv\n", "stock_data/daily_adjusted_CSCO.csv\n", "stock_data/daily_adjusted_TXN.csv\n", "stock_data/daily_adjusted_MMM.csv\n", "stock_data/daily_adjusted_IWM.csv\n", "stock_data/daily_adjusted_MCD.csv\n", "stock_data/daily_adjusted_SPY.csv\n", "stock_data/daily_adjusted_MSFT.csv\n", "stock_data/daily_adjusted_LMT.csv\n", "\n", "Using data files: \n", "stock_data/daily_adjusted_PM.csv\n", "stock_data/daily_adjusted_LMT.csv\n", "stock_data/daily_adjusted_MCD.csv\n", "stock_data/daily_adjusted_MMM.csv\n", "stock_data/daily_adjusted_AAPL.csv\n", "stock_data/daily_adjusted_MSFT.csv\n", "stock_data/daily_adjusted_TXN.csv\n", "stock_data/daily_adjusted_CSCO.csv\n", "\n" ] } ], "source": [ "# The files are in \"stock_data\" folder, named as \"daily_adjusted_[TICKER].csv\"\n", "dr = DataReader(folder_path=\"stock_data\", symbol_list=list_tickers)\n", "dr.read_data()\n", "df_prices, _ = dr.get_period(start_date=investment_start, end_date=investment_end)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Run the optimization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define the optimization model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we implement the optimization model in Fusion API. We create it inside a function so we can call it later." 
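, "\n", "\n",
"For reference, the comments in the function below suggest the following formulation. This is a sketch reconstructed from those comments, not an authoritative statement of the model: the cone names $\\Q_r^{N+2}$ and $K_\\mathrm{exp}$ are notation assumed here, $\\circ$ denotes the elementwise product, and the logarithm is applied elementwise.\n",
"\n",
"$$\n",
"\\begin{array}{ll}\n",
"\\mbox{minimize} & \\frac{1}{2}x^\\mathsf{T}\\ECov x - a\\,b^\\mathsf{T}\\log(z\\circ x)\\\\\n",
"\\mbox{subject to} & z\\circ x \\geq 0\n",
"\\end{array}\n",
"$$\n",
"\n",
"With auxiliary variables $s$ and $t$, and $\\ECov=GG^\\mathsf{T}$, the conic form reads\n",
"\n",
"$$\n",
"\\begin{array}{ll}\n",
"\\mbox{minimize} & s - a\\,b^\\mathsf{T}t\\\\\n",
"\\mbox{subject to} & (s, 1, G^\\mathsf{T}x)\\in\\Q_r^{N+2},\\\\\n",
" & (z_ix_i, 1, t_i)\\in K_\\mathrm{exp},\\quad i=1,\\dots,N,\\\\\n",
" & z\\circ x \\geq 0.\n",
"\\end{array}\n",
"$$\n",
"\n",
"Setting the gradient of the first objective to zero gives $x_i(\\ECov x)_i = a\\,b_i$ for every $i$, so at the optimum each risk contribution $x_i(\\ECov x)_i$ matches its budget $b_i$ scaled by $a$."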
] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [
"def RiskBudgeting(N, G, b, z, a):\n",
"\n",
"    with Model('Risk budgeting') as M:\n",
"        # Settings\n",
"        M.setLogHandler(sys.stdout)\n",
"\n",
"        # Portfolio weights\n",
"        x = M.variable(\"x\", N, Domain.unbounded())\n",
"\n",
"        # Orthant specifier constraint\n",
"        M.constraint(\"orthant\", Expr.mulElm(z, x), Domain.greaterThan(0.0))\n",
"\n",
"        # Auxiliary variables\n",
"        t = M.variable(\"t\", N, Domain.unbounded())\n",
"        s = M.variable(\"s\", 1, Domain.unbounded())\n",
"\n",
"        # Objective function: 1/2 * x'Sx - a * b'log(z*x) becomes s - a * b't\n",
"        M.objective(ObjectiveSense.Minimize, Expr.sub(s, Expr.mul(a, Expr.dot(b, t))))\n",
"\n",
"        # Bound on risk term: s >= 1/2 * x'Sx becomes (s, 1, G'x) in Q_r\n",
"        M.constraint(Expr.vstack(s, 1, Expr.mul(G.T, x)), Domain.inRotatedQCone())\n",
"\n",
"        # Bound on log term t <= log(z*x) becomes (z*x, 1, t) in K_exp\n",
"        M.constraint(Expr.hstack(Expr.mulElm(z, x), Expr.constTerm(N, 1.0), t), Domain.inPExpCone())\n",
"\n",
"        # Create DataFrame to store the results (column names come from the global df_prices).\n",
"        columns = [\"obj\", \"risk\", \"xsum\", \"bsum\"] + df_prices.columns.tolist()\n",
"        df_result = pd.DataFrame(columns=columns)\n",
"\n",
"        # Solve optimization\n",
"        M.solve()\n",
"        # Check if the solution is an optimal point\n",
"        solsta = M.getPrimalSolutionStatus()\n",
"        if (solsta != SolutionStatus.Optimal):\n",
"            # See https://docs.mosek.com/latest/pythonfusion/accessing-solution.html about handling solution statuses.\n",
"            raise Exception(\"Unexpected solution status!\")\n",
"\n",
"        # Retrieve the optimal weights\n",
"        xv = x.level()\n",
"\n",
"        # Check solution quality: the risk contributions x_i * (Sx)_i of the\n",
"        # unnormalized solution should equal a * b_i\n",
"        risk_budgets = xv * np.dot(G @ G.T, xv)\n",
"\n",
"        # Renormalize to gross exposure = 1\n",
"        xv = xv / np.abs(xv).sum()\n",
"\n",
"        # Compute portfolio metrics\n",
"        Gx = np.dot(G.T, xv)\n",
"        portfolio_risk = np.sqrt(np.dot(Gx, Gx))\n",
"\n",
"        # First row: objective, risk, and normalized weights\n",
"        row = pd.Series([M.primalObjValue(), portfolio_risk, np.sum(z * xv), np.sum(risk_budgets)] + list(xv), index=columns)\n",
"        df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)\n",
"        # Second row: risk contribution of each security\n",
"        row = pd.Series([None] * 4 + list(risk_budgets), index=columns)\n",
"        df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)\n",
"\n",
"        return df_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compute optimization input variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we use the loaded daily price data to compute the corresponding yearly mean return and covariance matrix. Only the covariance matrix is used in this model, so the mean return is discarded." ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "scrolled": true }, "outputs": [], "source": [
"# Number of securities\n",
"N = df_prices.shape[1]\n",
"\n",
"# Get optimization parameters (only the covariance matrix S is needed)\n",
"_, S = compute_inputs(df_prices)\n",
"\n",
"# Risk budget\n",
"b = np.ones(N) / N\n",
"\n",
"# Orthant selector\n",
"z = np.ones(N)\n",
"\n",
"# Global setting for sum of b\n",
"a = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we compute the matrix $G$ such that $\\ECov=GG^\\mathsf{T}$; this is the input of the conic form of the optimization problem. Here we use the Cholesky factorization."
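, "\n", "\n",
"As a short check of why this factor is all the conic model needs (a sketch using only the definitions above, with $\\|\\cdot\\|_2$ denoting the Euclidean norm):\n",
"\n",
"$$\n",
"x^\\mathsf{T}\\ECov x = x^\\mathsf{T}GG^\\mathsf{T}x = \\|G^\\mathsf{T}x\\|_2^2.\n",
"$$\n",
"\n",
"The rotated quadratic cone constraint on $(s, 1, G^\\mathsf{T}x)$ in `RiskBudgeting` states $2\\cdot s\\cdot 1\\geq\\|G^\\mathsf{T}x\\|_2^2$, which is therefore the same as $s\\geq\\frac{1}{2}x^\\mathsf{T}\\ECov x$."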
] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [], "source": [ "G = np.linalg.cholesky(S) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Call the optimizer function" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Problem\n", " Name : Risk budgeting \n", " Objective sense : min \n", " Type : CONIC (conic optimization problem)\n", " Constraints : 42 \n", " Cones : 9 \n", " Scalar variables : 52 \n", " Matrix variables : 0 \n", " Integer variables : 0 \n", "\n", "Optimizer started.\n", "Presolve started.\n", "Linear dependency checker started.\n", "Linear dependency checker terminated.\n", "Eliminator started.\n", "Freed constraints in eliminator : 0\n", "Eliminator terminated.\n", "Eliminator - tries : 1 time : 0.00 \n", "Lin. dep. - tries : 1 time : 0.00 \n", "Lin. dep. - number : 0 \n", "Presolve terminated. Time: 0.00 \n", "Problem\n", " Name : Risk budgeting \n", " Objective sense : min \n", " Type : CONIC (conic optimization problem)\n", " Constraints : 42 \n", " Cones : 9 \n", " Scalar variables : 52 \n", " Matrix variables : 0 \n", " Integer variables : 0 \n", "\n", "Optimizer - threads : 20 \n", "Optimizer - solved problem : the primal \n", "Optimizer - Constraints : 8\n", "Optimizer - Cones : 9\n", "Optimizer - Scalar variables : 34 conic : 34 \n", "Optimizer - Semi-definite variables: 0 scalarized : 0 \n", "Factor - setup time : 0.00 dense det. time : 0.00 \n", "Factor - ML order time : 0.00 GP order time : 0.00 \n", "Factor - nonzeros before factor : 36 after factor : 36 \n", "Factor - dense dim. 
: 0 flops : 7.74e+02 \n", "ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME \n", "0 1.4e+00 1.3e+00 9.7e+00 0.00e+00 1.534945180e+00 -7.147922794e+00 1.0e+00 0.01 \n", "1 2.4e-01 2.3e-01 3.9e-01 5.11e-01 2.860275193e+00 1.099946215e+00 1.8e-01 0.02 \n", "2 3.1e-02 3.0e-02 1.3e-02 1.33e+00 1.294944866e+00 1.109238006e+00 2.3e-02 0.02 \n", "3 1.6e-03 1.5e-03 1.6e-04 1.15e+00 1.052570526e+00 1.043915847e+00 1.2e-03 0.02 \n", "4 1.8e-04 1.7e-04 6.1e-06 1.01e+00 1.043789450e+00 1.042801879e+00 1.3e-04 0.02 \n", "5 2.6e-05 2.5e-05 3.4e-07 1.00e+00 1.042814807e+00 1.042672665e+00 1.9e-05 0.02 \n", "6 5.1e-06 4.8e-06 3.1e-08 1.00e+00 1.042685135e+00 1.042657303e+00 3.8e-06 0.02 \n", "7 1.4e-06 1.3e-06 4.5e-09 1.00e+00 1.042662348e+00 1.042655043e+00 9.9e-07 0.02 \n", "8 3.3e-07 3.1e-07 6.6e-10 1.00e+00 1.042656394e+00 1.042654634e+00 2.4e-07 0.02 \n", "9 3.6e-08 3.4e-08 3.0e-11 1.00e+00 1.042654983e+00 1.042654798e+00 2.6e-08 0.02 \n", "10 4.4e-09 4.2e-09 1.4e-12 1.00e+00 1.042654892e+00 1.042654869e+00 3.3e-09 0.02 \n", "11 1.5e-08 1.9e-09 5.2e-14 1.00e+00 1.042654881e+00 1.042654879e+00 3.6e-10 0.02 \n", "Optimizer terminated. Time: 0.03 \n", "\n", "\n", "Interior-point solution summary\n", " Problem status : PRIMAL_AND_DUAL_FEASIBLE\n", " Solution status : OPTIMAL\n", " Primal. obj: 1.0426548813e+00 nrm: 1e+00 Viol. con: 2e-09 var: 0e+00 cones: 5e-09 \n", " Dual. obj: 1.0426548788e+00 nrm: 1e+00 Viol. con: 0e+00 var: 1e-09 cones: 0e+00 \n" ] } ], "source": [ "df_result = RiskBudgeting(N, G, b, z, a)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
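<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>obj</th>\n",
"      <th>risk</th>\n",
"      <th>xsum</th>\n",
"      <th>bsum</th>\n",
"      <th>PM</th>\n",
"      <th>LMT</th>\n",
"      <th>MCD</th>\n",
"      <th>MMM</th>\n",
"      <th>AAPL</th>\n",
"      <th>MSFT</th>\n",
"      <th>TXN</th>\n",
"      <th>CSCO</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>1.042655</td>\n",
"      <td>0.212983</td>\n",
"      <td>1.0</td>\n",
"      <td>1.000007</td>\n",
"      <td>0.128198</td>\n",
"      <td>0.134368</td>\n",
"      <td>0.147567</td>\n",
"      <td>0.137782</td>\n",
"      <td>0.090373</td>\n",
"      <td>0.114948</td>\n",
"      <td>0.114974</td>\n",
"      <td>0.131791</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>NaN</td>\n",
"      <td>NaN</td>\n",
"      <td>NaN</td>\n",
"      <td>NaN</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"      <td>0.125001</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>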
" ], "text/plain": [ " obj risk xsum bsum PM LMT MCD MMM \\\n", "0 1.042655 0.212983 1.0 1.000007 0.128198 0.134368 0.147567 0.137782 \n", "1 NaN NaN NaN NaN 0.125001 0.125001 0.125001 0.125001 \n", "\n", " AAPL MSFT TXN CSCO \n", "0 0.090373 0.114948 0.114974 0.131791 \n", "1 0.125001 0.125001 0.125001 0.125001 " ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the portfolio components." ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ax = df_result.iloc[0, 4:].T.plot.bar(xlabel=\"securities\", ylabel=\"x\", grid=True, rot=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the risk budgets." ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "scrolled": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ax = df_result.iloc[1, 4:].T.plot.bar(xlabel=\"securities\", ylabel=\"risk budget\", grid=True, rot=0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 2 }