{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "LaTeX macros (hidden cell)\n",
    "$\n",
    "\\newcommand{\\Q}{\\mathcal{Q}}\n",
    "\\newcommand{\\ECov}{\\boldsymbol{\\Sigma}}\n",
    "\\newcommand{\\EMean}{\\boldsymbol{\\mu}}\n",
    "\\newcommand{\\EAlpha}{\\boldsymbol{\\alpha}}\n",
    "\\newcommand{\\EBeta}{\\boldsymbol{\\beta}}\n",
    "$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Imports and configuration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "FILE=/content/portfolio_tools.py\n",
    "if [[ ! -f $FILE ]]; then\n",
    "    wget https://raw.githubusercontent.com/MOSEK/PortfolioOptimization/main/python/notebooks/portfolio_tools.py\n",
    "fi"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install mosek \n",
    "%env PYTHONPATH /env/python:/content\n",
    "%env MOSEKLM_LICENSE_FILE /content/mosek.lic:/root/mosek/mosek.lic\n",
    "\n",
    "# To execute the notebook directly in colab make sure your MOSEK license file is in one the locations\n",
    "#\n",
    "# /content/mosek.lic   or   /root/mosek/mosek.lic\n",
    "#\n",
    "# inside this notebook's internal filesystem. \n",
    "#\n",
    "# You will also need an API key from a stock data provider, or ready data files in a \"stock_data\" folder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "import sys\n",
    "import os\n",
    "import re\n",
    "import datetime as dt\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "%matplotlib inline\n",
    "import matplotlib\n",
    "import matplotlib.pyplot as plt\n",
    "from matplotlib.colors import LinearSegmentedColormap\n",
    "\n",
    "from mosek.fusion import *\n",
    "import mosek.fusion.pythonic   # From Mosek >= 10.2\n",
    "\n",
    "from notebook.services.config import ConfigManager\n",
    "\n",
    "# portfolio_tools.py is a Mosek helper file distributed together with the notebooks\n",
    "from portfolio_tools import data_download, DataReader, compute_inputs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Version checks\n",
    "print(sys.version)\n",
    "print('matplotlib: {}'.format(matplotlib.__version__))\n",
    "\n",
    "# Jupyter configuration\n",
    "c = ConfigManager()\n",
    "c.update('notebook', {\"CodeCell\": {\"cm_config\": {\"autoCloseBrackets\": False}}})  \n",
    "\n",
    "# Numpy options\n",
    "np.set_printoptions(precision=5, linewidth=120, suppress=True)\n",
    "\n",
    "# Pandas options\n",
    "pd.set_option('display.max_rows', None)\n",
    "\n",
    "# Matplotlib options\n",
    "plt.rcParams['figure.figsize'] = [12, 8]\n",
    "plt.rcParams['figure.dpi'] = 200"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Prepare input data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we load the raw data that will be used to compute the optimization input variables, the vector $\\EMean$ of expected returns and the covariance matrix $\\ECov$. The data consists of daily stock prices of $8$ stocks from the US market. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Data downloading:\n",
    "# If the user has an API key for alphavantage.co, then this code part will download the data. \n",
    "# The code can be modified to download from other sources. To be able to run the examples, \n",
    "# and reproduce results in the cookbook, the files have to have the following format and content:\n",
    "# - File name pattern: \"daily_adjusted_[TICKER].csv\", where TICKER is the symbol of a stock. \n",
    "# - The file contains at least columns \"timestamp\", \"adjusted_close\", and \"volume\".\n",
    "# - The data is daily price/volume, covering at least the period from 2016-03-18 until 2021-03-18, \n",
    "# - Files are for the stocks PM, LMT, MCD, MMM, AAPL, MSFT, TXN, CSCO.\n",
    "list_stocks = [\"PM\", \"LMT\", \"MCD\", \"MMM\", \"AAPL\", \"MSFT\", \"TXN\", \"CSCO\"]\n",
    "list_factors = []\n",
    "alphaToken = None\n",
    " \n",
    "list_tickers = list_stocks + list_factors\n",
    "if alphaToken is not None:\n",
    "    data_download(list_tickers, alphaToken)  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Read data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We load the daily stock price data from the downloaded CSV files. The data is adjusted for splits and dividends. Then a selected time period is taken from the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "investment_start = \"2016-03-18\"\n",
    "investment_end = \"2021-03-18\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "# The files are in \"stock_data\" folder, named as \"daily_adjusted_[TICKER].csv\"\n",
    "dr = DataReader(folder_path=\"stock_data\", symbol_list=list_tickers)\n",
    "dr.read_data()\n",
    "df_prices, _ = dr.get_period(start_date=investment_start, end_date=investment_end)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Run the optimization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define the optimization model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below we implement the optimization model in Fusion API. We create it inside a function so we can call it later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# |x| <= t\n",
    "def absval(M, x, t):\n",
    "    M.constraint(t >= x)\n",
    "    M.constraint(t >= -x)\n",
    "\n",
    "# ||x||_1 <= t\n",
    "def norm1(M, x, t):\n",
    "    u = M.variable(x.getShape(), Domain.unbounded())\n",
    "    absval(M, x, u)\n",
    "    M.constraint(Expr.sum(u) == t)\n",
    "\n",
    "def EfficientFrontier(N, m, G, deltas):\n",
    "\n",
    "    with Model(\"Case study\") as M:\n",
    "        # Settings\n",
    "        #M.setLogHandler(sys.stdout)\n",
    "        \n",
    "        # Variables \n",
    "        # The variable x is the fraction of holdings relative to the initial capital. \n",
    "        # It is a free variable, allowing long and short positions.\n",
    "        x = M.variable(\"x\", N, Domain.unbounded())\n",
    "        \n",
    "        # The variable s models the portfolio variance term in the objective.\n",
    "        s = M.variable(\"s\", 1, Domain.unbounded())\n",
    "        \n",
    "        # Gross exposure constraint (allows 2 times the initial capital)\n",
    "        norm1(M, x, 2.0)\n",
    "        \n",
    "        # Dollar neutrality constraint\n",
    "        M.constraint('neutrality', Expr.sum(x) == 0.0)\n",
    "        \n",
    "        # Objective (quadratic utility version)\n",
    "        delta = M.parameter()\n",
    "        M.objective('obj', ObjectiveSense.Maximize, x.T @ m - delta * s)\n",
    "\n",
    "        # Conic constraint for the portfolio variance \n",
    "        M.constraint('risk', Expr.vstack(s, 0.5, G.T @ x), Domain.inRotatedQCone())\n",
    "        \n",
    "        # Create DataFrame to store the results. Last security name (the SPY) is removed.\n",
    "        columns = [\"delta\", \"obj\", \"return\", \"risk\", \"g. exp.\"] + df_prices.columns.tolist()\n",
    "        df_result = pd.DataFrame(columns=columns)\n",
    "        for d in deltas:\n",
    "            # Update parameter\n",
    "            delta.setValue(d)\n",
    "            \n",
    "            # Solve optimization\n",
    "            M.solve()\n",
    "            \n",
    "            # Check if the solution is an optimal point\n",
    "            solsta = M.getPrimalSolutionStatus()\n",
    "            if (solsta != SolutionStatus.Optimal):\n",
    "                # See https://docs.mosek.com/latest/pythonfusion/accessing-solution.html about handling solution statuses.\n",
    "                raise Exception(\"Unexpected solution status!\") \n",
    "            \n",
    "            # Save results\n",
    "            portfolio_return = m @ x.level()\n",
    "            portfolio_risk = np.sqrt(s.level()[0])\n",
    "            gross_exp = sum(np.absolute(x.level()))\n",
    "            row = pd.Series([d, M.primalObjValue(), portfolio_return, portfolio_risk, gross_exp] + list(x.level()), index=columns)\n",
    "            df_result = pd.concat([df_result, pd.DataFrame([row])], ignore_index=True)\n",
    "\n",
    "        return df_result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compute optimization input variables"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we use the loaded daily price data to compute the corresponding yearly mean return and covariance matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Number of securities\n",
    "N = df_prices.shape[1]  \n",
    "\n",
    "# Get optimization parameters\n",
    "m, S = compute_inputs(df_prices)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we compute the matrix $G$ such that $\\ECov=GG^\\mathsf{T}$, this is the input of the conic form of the optimization problem. Here we use Cholesky factorization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "G = np.linalg.cholesky(S)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Call the optimizer function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We run the optimization for a range of risk aversion parameter values: $\\delta = 10^{-1},\\dots,10^{1.5}$. We compute the efficient frontier this way both with and without using shrinkage estimation. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "# Compute efficient frontier with and without shrinkage\n",
    "deltas = np.logspace(start=-1, stop=1.5, num=20)[::-1]\n",
    "df_result = EfficientFrontier(N, m, G, deltas)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "df_result"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualize the results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot the efficient frontier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "ax = df_result.plot(x=\"risk\", y=\"return\", style=\"-o\", \n",
    "                    xlabel=\"portfolio risk (std. dev.)\", ylabel=\"portfolio return\", grid=True)   \n",
    "ax.legend([\"return\"]);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot the portfolio composition."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Round small values to 0 to make plotting work\n",
    "mask = np.absolute(df_result) < 1e-7\n",
    "mask.iloc[:, :-8] = False\n",
    "df_result[mask] = 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": false
   },
   "outputs": [],
   "source": [
    "my_cmap = LinearSegmentedColormap.from_list(\"non-extreme gray\", [\"#111111\", \"#eeeeee\"], N=256, gamma=1.0)\n",
    "ax = df_result.set_index('risk').iloc[:, 3:].plot.area(colormap=my_cmap, xlabel='portfolio risk (std. dev.)', ylabel=\"x\") \n",
    "ax.grid(which='both', axis='x', linestyle=':', color='k', linewidth=1)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}