{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "LaTeX macros (hidden cell)\n", "$\n", "\\newcommand{\\Q}{\\mathcal{Q}}\n", "\\newcommand{\\ECov}{\\boldsymbol{\\Sigma}}\n", "\\newcommand{\\EMean}{\\boldsymbol{\\mu}}\n", "\\newcommand{\\EAlpha}{\\boldsymbol{\\alpha}}\n", "\\newcommand{\\EBeta}{\\boldsymbol{\\beta}}\n", "$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Imports and configuration" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "%%bash\n", "FILE=/content/portfolio_tools.py\n", "if [[ ! -f $FILE ]]; then\n", "    wget https://raw.githubusercontent.com/MOSEK/PortfolioOptimization/main/python/notebooks/portfolio_tools.py\n", "fi" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install mosek \n", "%env PYTHONPATH /env/python:/content\n", "%env MOSEKLM_LICENSE_FILE /content/mosek.lic:/root/mosek/mosek.lic\n", "\n", "# To execute the notebook directly in Colab, make sure your MOSEK license file is in one of the locations\n", "#\n", "# /content/mosek.lic or /root/mosek/mosek.lic\n", "#\n", "# inside this notebook's internal filesystem. \n", "#\n", "# You will also need an API key from a stock data provider, or ready data files in a \"stock_data\" folder."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "import sys\n", "import os\n", "import re\n", "import datetime as dt\n", "\n", "import numpy as np\n", "import pandas as pd\n", "%matplotlib inline\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "from matplotlib.colors import LinearSegmentedColormap\n", "\n", "from mosek.fusion import *\n", "import mosek.fusion.pythonic # From Mosek >= 10.2\n", "\n", "from notebook.services.config import ConfigManager\n", "\n", "# portfolio_tools.py is a Mosek helper file distributed together with the notebooks\n", "from portfolio_tools import data_download, DataReader, compute_inputs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Version checks\n", "print(sys.version)\n", "print('matplotlib: {}'.format(matplotlib.__version__))\n", "\n", "# Jupyter configuration\n", "c = ConfigManager()\n", "c.update('notebook', {\"CodeCell\": {\"cm_config\": {\"autoCloseBrackets\": False}}}) \n", "\n", "# Numpy options\n", "np.set_printoptions(precision=5, linewidth=120, suppress=True)\n", "\n", "# Pandas options\n", "pd.set_option('display.max_rows', None)\n", "\n", "# Matplotlib options\n", "plt.rcParams['figure.figsize'] = [12, 8]\n", "plt.rcParams['figure.dpi'] = 200" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Prepare input data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we load the raw data that will be used to compute the optimization input variables, the vector $\\EMean$ of expected returns and the covariance matrix $\\ECov$. The data consists of daily stock prices of $8$ stocks from the US market. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Download data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Data downloading:\n", "# If the user has an API key for alphavantage.co, then this code part will download the data. \n", "# The code can be modified to download from other sources. To be able to run the examples, \n", "# and reproduce results in the cookbook, the files have to have the following format and content:\n", "# - File name pattern: \"daily_adjusted_[TICKER].csv\", where TICKER is the symbol of a stock. \n", "# - The file contains at least columns \"timestamp\", \"adjusted_close\", and \"volume\".\n", "# - The data is daily price/volume, covering at least the period from 2016-03-18 until 2021-03-18, \n", "# - Files are for the stocks PM, LMT, MCD, MMM, AAPL, MSFT, TXN, CSCO.\n", "list_stocks = [\"PM\", \"LMT\", \"MCD\", \"MMM\", \"AAPL\", \"MSFT\", \"TXN\", \"CSCO\"]\n", "list_factors = []\n", "alphaToken = None\n", " \n", "list_tickers = list_stocks + list_factors\n", "if alphaToken is not None:\n", " data_download(list_tickers, alphaToken) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Read data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load the daily stock price data from the downloaded CSV files. The data is adjusted for splits and dividends. Then a selected time period is taken from the data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "investment_start = \"2016-03-18\"\n", "investment_end = \"2021-03-18\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "# The files are in the \"stock_data\" folder, named as \"daily_adjusted_[TICKER].csv\"\n", "dr = DataReader(folder_path=\"stock_data\", symbol_list=list_tickers)\n", "dr.read_data()\n", "df_prices, _ = dr.get_period(start_date=investment_start, end_date=investment_end)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Run the optimization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define the optimization model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we implement the optimization model in the Fusion API. We create it inside a function so we can call it later." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# |x| <= t\n", "def absval(M, x, t):\n", "    M.constraint(t >= x)\n", "    M.constraint(t >= -x)\n", "\n", "# ||x||_1 <= t\n", "# (u only bounds |x| from above, so sum(u) == t enforces ||x||_1 <= t)\n", "def norm1(M, x, t):\n", "    u = M.variable(x.getShape(), Domain.unbounded())\n", "    absval(M, x, u)\n", "    M.constraint(Expr.sum(u) == t)\n", "\n", "def EfficientFrontier(N, m, G, deltas):\n", "\n", "    with Model(\"Case study\") as M:\n", "        # Settings\n", "        #M.setLogHandler(sys.stdout)\n", "\n", "        # Variables \n", "        # The variable x is the fraction of holdings relative to the initial capital. \n", "        # It is a free variable, allowing long and short positions.\n", "        x = M.variable(\"x\", N, Domain.unbounded())\n", "\n", "        # The variable s models the portfolio variance term in the objective.\n", "        s = M.variable(\"s\", 1, Domain.unbounded())\n", "\n", "        # Gross exposure constraint (allows 2 times the initial capital)\n", "        norm1(M, x, 2.0)\n", "\n", "        # Dollar neutrality constraint\n", "        M.constraint('neutrality', Expr.sum(x) == 0.0)\n", "\n", "        # Objective (quadratic utility version)\n", "        delta = M.parameter()\n", "        M.objective('obj', ObjectiveSense.Maximize, x.T @ m - delta * s)\n", "\n", "        # Conic constraint for the portfolio variance \n", "        M.constraint('risk', Expr.vstack(s, 0.5, G.T @ x), Domain.inRotatedQCone())\n", "\n", "        # Create DataFrame to store the results, one row per value of delta.\n", "        columns = [\"delta\", \"obj\", \"return\", \"risk\", \"g. exp.\"] + df_prices.columns.tolist()\n", "        df_result = pd.DataFrame(columns=columns)\n", "        for d in deltas:\n", "            # Update parameter\n", "            delta.setValue(d)\n", "\n", "            # Solve optimization\n", "            M.solve()\n", "\n", "            # Check if the solution is an optimal point\n", "            solsta = M.getPrimalSolutionStatus()\n", "            if (solsta != SolutionStatus.Optimal):\n", "                # See https://docs.mosek.com/latest/pythonfusion/accessing-solution.html about handling solution statuses.\n", "                raise Exception(\"Unexpected solution status!\") \n", "\n", "            # Save results\n", "            portfolio_return = m @ x.level()\n", "            portfolio_risk = np.sqrt(s.level()[0])\n", "            gross_exp = sum(np.absolute(x.level()))\n", "            row = pd.Series([d, M.primalObjValue(), portfolio_return, portfolio_risk, gross_exp] + list(x.level()), index=columns)\n", "            df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)\n", "\n", "        return df_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compute optimization input variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we use the 
loaded daily price data to compute the corresponding yearly mean return vector and covariance matrix." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Number of securities\n", "N = df_prices.shape[1] \n", "\n", "# Get optimization parameters\n", "m, S = compute_inputs(df_prices)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we compute the matrix $G$ such that $\\ECov=GG^\\mathsf{T}$; this is the input of the conic form of the optimization problem. Here we use the Cholesky factorization." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "G = np.linalg.cholesky(S)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Call the optimizer function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We run the optimization for 20 logarithmically spaced values of the risk aversion parameter: $\\delta = 10^{-1},\\dots,10^{1.5}$. Each value of $\\delta$ yields one point of the efficient frontier." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Compute the efficient frontier, going from the largest delta (most risk averse) to the smallest\n", "deltas = np.logspace(start=-1, stop=1.5, num=20)[::-1]\n", "df_result = EfficientFrontier(N, m, G, deltas)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the results." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "df_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the efficient frontier." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "ax = df_result.plot(x=\"risk\", y=\"return\", style=\"-o\", \n", "                    xlabel=\"portfolio risk (std. 
dev.)\", ylabel=\"portfolio return\", grid=True) \n", "ax.legend([\"return\"]);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot the portfolio composition." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Round small values to 0 to make plotting work (only the last N columns, the holdings, are rounded)\n", "mask = np.absolute(df_result) < 1e-7\n", "mask.iloc[:, :-N] = False\n", "df_result[mask] = 0" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [], "source": [ "my_cmap = LinearSegmentedColormap.from_list(\"non-extreme gray\", [\"#111111\", \"#eeeeee\"], N=256, gamma=1.0)\n", "# Plot only the last N columns (the holdings x), leaving out the \"g. exp.\" column\n", "ax = df_result.set_index('risk').iloc[:, -N:].plot.area(colormap=my_cmap, xlabel='portfolio risk (std. dev.)', ylabel=\"x\") \n", "ax.grid(which='both', axis='x', linestyle=':', color='k', linewidth=1)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 2 }