{ "metadata": { "name": "", "signature": "sha256:b8fb047ea9e73ce1c330f504d1a11a3ac763f26784cb2ce55d7a4b7fbbc52cc9" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 6, "metadata": {}, "source": [ "The latest version of this IPython notebook is available at [http://github.com/jckantor/ESTM60203](http://github.com/jckantor/ESTM60203) for noncommercial use under terms of the [Creative Commons Attribution Noncommericial ShareAlike License (CC BY-NC-SA 4.0)](http://creativecommons.org/licenses/by-nc-sa/4.0/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "J.C. Kantor (Kantor.1@nd.edu)" ] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Portfolio Optimization using Mean Absolute Deviation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This [IPython notebook](http://ipython.org/notebook.html) demonstrates portfolio optimization using the Mean Absolute Deviation (MAD) criterion. A portion of these notes is adapted from [GLPK Wikibook tutorial on the subject](http://en.wikibooks.org/wiki/GLPK/Portfolio_Optimization) written by me." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Initializations" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from IPython.core.display import HTML\n", "HTML(open(\"styles/custom.css\", \"r\").read())" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ], "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "" ] } ], "prompt_number": 1 }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Background" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Konno and Yamazaki (1990) proposed a linear programming model for portfolio optimization in which the risk measure is mean absolute deviation (MAD). This model computes a portfolio minimizing MAD subject to a lower bound on return.\n", "\n", "In contrast to the classical Markowitz portfolio, the MAD criterion requires a data set consisting of returns on the investment assets. The data set may be an historical record or samples from a multivariate statistical model of portfolio returns. The MAD criterion produces portfolios with properties not shared by the Markowitz portfolio, including second degree stochastic dominance.\n", "\n", "Below we demonstrate portfolio optimization with the MAD criterion where data is generated by sampling a multivariate normal distribution. Given mean return $r$ and the Cholesky decomposition of the covariance matrix $\\Sigma$ (i.e., $C$ such that $CC^T = \\Sigma$), we compute $r_t = r + C z_t$ where the elements of $z_t$ are zero mean normal variates with unit variance.\n", "\n", "The rest of the formulation is adapted from \"Optimization Methods in Finance\" by Gerald Curnuejols and Reha Tutuncu (2007) which, in turn, follows an implementation due to Fienstein and Thapa (1993). \n", "\n", "[A complete tutorial](http://en.wikibooks.org/wiki/GLPK/Portfolio_Optimization) describing the implementation of this model is available on the [GLPK wikibook](http://en.wikibooks.org/wiki/GLPK/).\n" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Mean Absolute Deviation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Portfolio optimization refers to the process of allocating capital among a set of financial assets to achieve a desired tradeoff between risk and return. The classical Markowitz approach to this problem measures risk using the expected variance of the portfolio return. This criterion yields a quadratic program for the relative weights of assets in the optimal portfolio. \n", "\n", "In 1991, Konno and Yamazaki {{Cite journal | last1 = Konno | first1 = Hiroshi | last2 = Yamazaki | first2 = Hiroaki | title = Mean-absolute deviation portfolio optimization model and its applications to Tokyo stock market | journal = Management Science | volume = 37 | pages = 519-531 | date = 1991}} proposed a linear programming model for portfolio optimization whereby risk is measured by the mean absolute deviation (MAD) from the expected return. Using MAD as the risk metric produces portfolios with several desirable properties not shared by the Markowitz portfolio, including [[w:Stochastic_dominance|second order stochastic dominance]].\n", "\n", "As originally formulated by Konno and Yamazaki, one starts with a history of returns $R_i(t_n)$ for every asset in a set $S$ of assets. The return at time $t_n$ is determined by the change in price of the asset, \n", "\n", "$$R_i(t_n)= {(P_i(t_n)-P_i(t_{n-1}))}/{P_i(t_{n-1})}$$\n", "\n", "For each asset, the expected return is estimated by \n", "\n", "$$\\bar{R}_i \\approx \\frac{1}{N}\\sum_{n=1}^NR_i(t_n)$$\n", "\n", "The investor specifies a minimum required return $R_{p}$. The portfolio optimization problem is to determine the fraction of the total investment allocated to each asset, $w_i$, that minimizes the mean absolution deviation from the mean\n", "\n", "$$\\min_{w_i} \\frac{1}{N}\\sum_{n=1}^N\\lvert\\sum_{i \\in S} w_i(R_i(t_n)-\\bar{R}_i)\\rvert$$\n", "\n", "subject to the required return and a fixed total investment:\n", "\n", "$$\n", "\\begin{array}{rcl}\n", "\\sum_{i \\in S} w_i\\bar{R}_i & \\geq & R_{p} \\\\\n", "\\quad \\\\\n", "\\sum_{i \\in S} w_i & = & 1\n", "\\end{array}\n", "$$\n", "\n", "The value of the minimum required return, $R_p$ expresses the investor's risk tolerance. A smaller value for $R_p$ increases the feasible solution space, resulting in portfolios with lower values of the MAD risk metric. Increasing $R_p$ results in portfolios with higher risk. The relationship between risk and return is a fundamental principle of investment theory. \n", "\n", "This formulation doesn't place individual bounds on the weights $w_i$. In particular, $w_i < 0$ corresponds to short selling of the associated asset. Constraints can be added if the investor imposes limits on short-selling, on the maximum amount that can be invested in a single asset, or other requirements for portfolio diversification. Depending on the set of available investment opportunities, additional constraints can lead to infeasible investment problems." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Reformulation of the MAD Objective" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following formulation of the objective function is adapted from \"Optimization Methods in Finance\" by Gerald Curnuejols and Reha Tutuncu (2007) {{Cite book | last1 = Curnuejols | first1 = Gerald | last2 = Tutuncu | first2 = Reha | title = Optimization Methods in Finance | publisher = Cambridge University Press | date = 2007}} which, in turn, follows\n", "Feinstein and Thapa (1993) {{Cite journal | last1 = Feinstein | first1 = Charles D. | last2 = Thapa | first2 = Mukund N. | title = A Reformulation of a Mean-Absolute Deviation Portfolio Optimization Model | journal = Management Science | volume = 39 | pages = 1552-1553 | date = 1993}}. The model is streamlined by introducing decision variables $y_n \\geq 0$ and $z_n \\geq 0$, $n=1,\\ldots,N$ so that the objective becomes\n", "\n", "$$\\min_{w_i, y_n,z_n} \\frac{1}{N}\\sum_{n = 1}^{N} (y_n+z_n)$$\n", "\n", "subject to constraints\n", "\n", "$$\n", "\\begin{array}{rcl}\n", "\\sum_{i \\in S} w_i\\bar{R}_i & \\geq & R_{p} \\\\ \\\\\n", "\\sum_{i \\in S} w_i & = & 1 \\\\ \\\\\n", "y_n-z_n & = & \\sum_{i \\in S} w_i(R_i(t_n)-\\bar{R}_i) \\quad \\mbox{for}\\quad n = 1,\\ldots,N\n", "\\end{array}\n", "$$\n", "\n", "As discussed by Feinstein and Thapa, this version reduces the problem to $N+2$ constraints in $ 2N+\\mbox{card}(S)$ decision variables, noting the `card` function." ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Seeding the GLPK Pseudo-Random Number Generator" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately, MathProg does not provide a method to seed the pseudo-random number generator in GLPK. Instead, the following GMPL code fragment uses the function gmtime() to find the number of seconds since 1970. Dropping the leading digit avoids subsequent overflow errors, and the square returns a number with big changes every second. Extracting the lowest digits produces a number between 0 and 100,000 that determines how many times to sample GLPK's pseudo-random number generator prior to subsequent use. This hack comes with no assurances regarding its statistical properties.\n", "\n", "\n", " /* A workaround for the lack of a way to seed the PRNG in GMPL */\n", " param utc := prod {1..2} (gmtime()-1000000000);\n", " param seed := utc - 100000*floor(utc/100000);\n", " check sum{1..seed} Uniform01() > 0;\n" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Simulation of the Historical Returns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this implementation, historical returns are simulated assuming knowledge of the means and covariances of asset returns. We begin with a vector of mean returns $\\bar{R}$ and covariance matrix $\\Sigma$ estimated by\n", "\n", "$$\n", "\\Sigma_{ij} \\approx \\frac{1}{N-1}\\sum_{n=1}^N(R_i(t_n)-\\bar{R}_i)(R_j(t_n)-\\bar{R}_j)\n", "$$\n", "\n", "Simulation of historical returns requires generation of samples from a multivariable normal distribution. For this purpose we compute the Cholesky factorization where, for a positive semi-definite $\\Sigma$, $\\Sigma=CC^T$ and $C$ is a lower triangular matrix. The following MathProg code fragment implements the Cholesky-Crout algorithm. \n", "\n", "\n", " /* Cholesky Lower Triangular Decomposition of the Covariance Matrix */\n", " param C{i in S, j in S : i >= j} :=\n", " if i = j then\n", " sqrt(Sigma[i,i]-(sum {k in S : k < i} (C[i,k]*C[i,k])))\n", " else\n", " (Sigma[i,j]-sum{k in S : k < j} C[i,k]*C[j,k])/C[j,j];\n", "\n", "\n", "Without error checking, this code fragment fails unless $\\Sigma$ is positive definite. The covariance matrix is normally positive definite for real-world data, so this is generally not an issue. However, it does become an issue if one attempts to include a risk-free asset, such as a government bond, in the set of investment assets.\n", "\n", "Once the Cholesky factor $C$ has been computed, a vector of simulated returns $R(t_n)$ is given by $R(t_n) = \\bar{R} + C Z(t_n)$ where the elements of $Z(t_n)$ are independent samples from a normal distribution with zero mean and unit variance.\n", "\n", " /* Simulated returns */\n", " param N default 5000;\n", " set T := 1..N;\n", " param R{i in S, t in T} := Rbar[i] + sum {j in S : j <= i} C[i,j]*Normal(0,1);\n" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "MathProg Model" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%script glpsol -m /dev/stdin\n", "\n", "# Example: PortfolioMAD.mod Portfolio Optimization using Mean Absolute Deviation\n", "\n", "/* Stock Data */\n", "\n", "set S; # Set of stocks\n", "param r{S}; # Means of projected returns\n", "param cov{S,S}; # Covariance of projected returns\n", "param r_portfolio\n", " default (1/card(S))*sum{i in S} r[i]; # Lower bound on portfolio return\n", "\n", "/* Generate sample data */\n", "\n", "/* Cholesky Lower Triangular Decomposition of the Covariance Matrix */\n", "param c{i in S, j in S : i >= j} := \n", " if i = j then\n", " sqrt(cov[i,i]-(sum {k in S : k < i} (c[i,k]*c[i,k])))\n", " else\n", " (cov[i,j]-sum{k in S : k < j} c[i,k]*c[j,k])/c[j,j];\n", "\n", "/* Because there is no way to seed the PRNG, a workaround */\n", "param utc := prod {1..2} (gmtime()-1000000000);\n", "param seed := utc - 100000*floor(utc/100000);\n", "check sum{1..seed} Uniform01() > 0;\n", "\n", "/* Normal random variates */\n", "param N default 5000;\n", "set T := 1..N;\n", "param zn{j in S, t in T} := Normal(0,1);\n", "param rt{i in S, t in T} := r[i] + sum {j in S : j <= i} c[i,j]*zn[j,t];\n", "\n", "/* MAD Optimization */\n", "\n", "var w{S} >= 0; # Portfolio Weights with Bounds\n", "var y{T} >= 0; # Positive deviations (non-negative)\n", "var z{T} >= 0; # Negative deviations (non-negative)\n", "\n", "minimize MAD: (1/card(T))*sum {t in T} (y[t] + z[t]);\n", "\n", "s.t. C1: sum {s in S} w[s]*r[s] >= r_portfolio;\n", "s.t. C2: sum {s in S} w[s] = 1;\n", "s.t. C3 {t in T}: (y[t] - z[t]) = sum{s in S} (rt[s,t]-r[s])*w[s];\n", "\n", "solve;\n", "\n", "/* Report */\n", "\n", "/* Input Data */\n", "printf \"Stock Data\\n\\n\";\n", "printf \" Return Variance\\n\";\n", "printf {i in S} \"%5s %7.4f %7.4f\\n\", i, r[i], cov[i,i];\n", "\n", "printf \"\\nCovariance Matrix\\n\\n\";\n", "printf \" \";\n", "printf {j in S} \" %7s \", j;\n", "printf \"\\n\";\n", "for {i in S} {\n", " printf \"%5s \" ,i;\n", " printf {j in S} \" %7.4f \", cov[i,j];\n", " printf \"\\n\";\n", "}\n", "\n", "/* MAD Optimal Portfolio */\n", "printf \"\\nMinimum Absolute Deviation (MAD) Portfolio\\n\\n\";\n", "printf \" Return = %7.4f\\n\",r_portfolio;\n", "printf \" Variance = %7.4f\\n\\n\", sum {i in S, j in S} w[i]*w[j]*cov[i,j];\n", "printf \" Weight\\n\";\n", "printf {s in S} \"%5s %7.4f\\n\", s, w[s];\n", "printf \"\\n\";\n", "\n", "table tab0 {s in S} OUT \"JSON\" \"Optimal Portfolio\" \"PieChart\": \n", " s, w[s]~PortfolioWeight;\n", " \n", "table tab1 {s in S} OUT \"JSON\" \"Asset Return versus Volatility\" \"ScatterChart\":\n", " sqrt(cov[s,s])~StDev, r[s]~Return;\n", " \n", "table tab2 {s in S} OUT \"JSON\" \"Portfolio Weights\" \"ColumnChart\": \n", " s~Stock, w[s]~PortfolioWeight;\n", " \n", "table tab3 {t in T} OUT \"JSON\" \"Simulated Portfolio Return\" \"LineChart\": \n", " t~month, (y[t] - z[t])~PortfolioReturn;\n", "\n", "/* Simulated Return data in Matlab Format */\n", "/*\n", "printf \"\\nrt = [ ... \\n\";\n", "for {t in T} {\n", " printf {s in S} \"%9.4f\",rt[s,t];\n", " printf \"; ...\\n\";\n", "}\n", "printf \"];\\n\\n\";\n", "*/\n", "\n", "data;\n", "\n", "/* Data for monthly returns on four selected stocks for a three\n", "year period ending December 4, 2009 */\n", "\n", "param N := 200;\n", "\n", "param r_portfolio := 0.01;\n", "\n", "param : S : r :=\n", " AAPL 0.0308\n", " GE -0.0120\n", " GS 0.0027\n", " XOM 0.0018 ;\n", "\n", "param cov : \n", " AAPL GE GS XOM :=\n", " AAPL 0.0158 0.0062 0.0088 0.0022\n", " GE 0.0062 0.0136 0.0064 0.0011\n", " GS 0.0088 0.0064 0.0135 0.0008\n", " XOM 0.0022 0.0011 0.0008 0.0022 ;\n", "\n", "end;" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "GLPSOL: GLPK LP/MIP Solver, v4.52\n", "Parameter(s) specified in the command line:\n", " -m /dev/stdin\n", "Reading model section from /dev/stdin...\n", "Reading data section from /dev/stdin...\n", "/dev/stdin:115: warning: final NL missing before end of file\n", "115 lines were read\n", "Checking (line 24)...\n", "Generating MAD...\n", "Generating C1...\n", "Generating C2...\n", "Generating C3...\n", "Model has been successfully generated\n", "GLPK Simplex Optimizer, v4.52\n", "203 rows, 404 columns, 1608 non-zeros\n", "Preprocessing...\n", "140 rows, 142 columns, 698 non-zeros\n", "Scaling...\n", " A: min|aij| = 2.301e-04 max|aij| = 1.000e+00 ratio = 4.346e+03\n", "GM: min|aij| = 9.139e-02 max|aij| = 1.094e+01 ratio = 1.197e+02\n", "EQ: min|aij| = 8.625e-03 max|aij| = 1.000e+00 ratio = 1.159e+02\n", "Constructing initial basis...\n", "Size of triangular part is 140\n", " 0: obj = 1.502038445e-02 infeas = 7.709e+00 (0)\n", "* 116: obj = 6.534903273e-02 infeas = 0.000e+00 (0)\n", "* 157: obj = 4.760242137e-02 infeas = 0.000e+00 (0)\n", "OPTIMAL LP SOLUTION FOUND\n", "Time used: 0.0 secs\n", "Memory used: 1.0 Mb (1076387 bytes)\n", "Stock Data\n", "\n", " Return Variance\n", " AAPL 0.0308 0.0158\n", " GE -0.0120 0.0136\n", " GS 0.0027 0.0135\n", " XOM 0.0018 0.0022\n", "\n", "Covariance Matrix\n", "\n", " AAPL GE GS XOM \n", " AAPL 0.0158 0.0062 0.0088 0.0022 \n", " GE 0.0062 0.0136 0.0064 0.0011 \n", " GS 0.0088 0.0064 0.0135 0.0008 \n", " XOM 0.0022 0.0011 0.0008 0.0022 \n", "\n", "Minimum Absolute Deviation (MAD) Portfolio\n", "\n", " Return = 0.0100\n", " Variance = 0.0033\n", "\n", " Weight\n", " AAPL 0.2828\n", " GE 0.0000\n", " GS 0.0000\n", " XOM 0.7172\n", "\n", "Writing tab0...\n", "Invalid table driver `JSON'\n", "/dev/stdin:71: error on opening table tab0\n", "Model postsolving error\n" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }