{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Introduction to Monte-Carlo Methods\n", "\n", "## Rohan L. Fernando\n", "\n", "## November 2015" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Mean and Variance of Truncated Normal\n", "\n", "\n", "Suppose $Y \\sim N(\\mu_Y,V_Y)$.\n", "\n", "The mean and variance of $Y$ given truncation\n", "selection are:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\\begin{equation}\n", "E(Y|Y>t) = \\mu_Y + V_Y^{1/2}i\n", "\\end{equation}\n", "\n", "where\n", "$$\n", "i = \\frac{f(s)}{p}\n", "$$\n", "$f(s)$ is the standard normal density function\n", "$$\n", "s = \\frac{t - \\mu_Y}{V_Y^{1/2}}\n", "$$\n", "$$\n", "p = \\Pr(Y > t)\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "-" } }, "source": [ "\\begin{equation}\n", "Var(Y|Y>t) = V_Y[1 - i(i-s)]\n", "\\end{equation}" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Proof:\n", "\n", "Start with mean and variance for a standard normal variable given truncation selection.\n", "\n", "Let $Z \\sim N(0,1)$. \n", "\n", "The density function of $Z$ is:\n", "$$\n", "f(z) = \\sqrt{\\frac{1}{2\\pi}}e^{-\\frac{1}{2}z^2}\n", "$$\n", "\n", "The density function for $Z$ given truncation selection is \n", "$$\n", "f(z|z>s) = f(z)/p\n", "$$\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From the definition of the mean:\n", "\\begin{equation*}\n", "\\begin{split}\n", "E(Z|Z>s) &= \\frac{1}{p} \\int_s^{\\infty} z f(z)dz\\\\\n", " &= \\frac{1}{p} [-f(z) ] _s^{\\infty} \\\\\n", " &= \\frac{f(s)}{p} \\\\\n", " &= i\n", "\\end{split}\n", "\\end{equation*}\n", "\n", "because the first derivative of $f(z)$ with respect to $z$ is:\n", "\\begin{equation*}\n", "\\begin{split}\n", "\\frac{d}{dz} f(z) &= \\sqrt{\\frac{1}{2\\pi} } e^{-\\frac{1}{2}z^2} (-z)\\\\\n", " &= -zf(z)\n", "\\end{split}\n", "\\end{equation*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, to compute the variance of $Z$ given selection, consider the following\n", "identity:\n", "\\begin{equation*}\n", "\\begin{split}\n", "\\frac{d}{dz} z f(z) &= f(z) + z \\frac{d}{dz} f(z) \\\\\n", " &= f(z) - z^2 f(z) \n", "\\end{split}\n", "\\end{equation*}\n", "\n", "Integrating both sides from $s$ to $\\infty$ gives\n", "\\begin{equation*}\n", "zf(z)]_s^\\infty = \\int_s^\\infty f(z) dz - \\int_s^\\infty z^2 f(z)dz\n", "\\end{equation*}\n", "Upon rearranging this gives:\n", "\\begin{equation*}\n", "\\begin{split}\n", " \\int_s^\\infty z^2 f(z)dz &= \\int_s^\\infty f(z) dz - zf(z)]_s^\\infty \\\\\n", "\\frac{1}{p} \\int_s^\\infty z^2 f(z)dz &= \n", " \\frac{1}{p} \\int_s^\\infty f(z) dz + \\frac{f(s)}{p}s\\\\\n", " &= 1 + is\n", "\\end{split}\n", "\\end{equation*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, \n", "\\begin{equation}\n", " \\begin{split}\n", " Var(Z|Z>s) &= E(Z^2|Z>s) - [E(Z|Z>s)]^2\\\\\n", " &= 1 + is - i^2 \\\\\n", " &= 1 - i(i-s)\n", " \\end{split}\n", "\\end{equation}\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Results for $Y$\n", "\n", "Results for $Y$ follow from the fact that \n", "$$\n", "\\mu_Y + V_Y^{1/2}Z \\sim N(\\mu_Y,V_Y)\n", "$$\n", "\n", "So, let \n", "$$\n", "Y = \\mu_Y + V_Y^{1/2}Z,\n", "$$\n", "Then, the condition\n", "$$\n", "Y>t\n", "$$\n", "is equivalent to \n", "\\begin{equation*}\n", " \\begin{split}\n", " \\mu_Y + V_Y^{1/2}Z &> t \\\\\n", " V_Y^{1/2}Z &> t - \\mu_Y \\\\\n", " Z &> \\frac{t - \\mu_Y}{V_Y^{1/2}}\\\\\n", " Z &> s\n", " \\end{split} \n", "\\end{equation*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, \n", "\\begin{equation*}\n", " \\begin{split}\n", " E(Y|Y>t) &= E(\\mu_Y + V_Y^{1/2}Z |Z>s) \\\\\n", " &= \\mu_Y + V_Y^{1/2}i,\n", " \\end{split}\n", "\\end{equation*}\n", "and\n", "\\begin{equation*}\n", " \\begin{split} \n", " Var(Y|Y>t) &= Var(\\mu_Y + V_Y^{1/2}Z |Z>s) \\\\\n", " &= V_Y[1 - i(i-s)]\n", " \\end{split}\n", "\\end{equation*}\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Numerical Example" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mean = 21.41 \n", "variance = 26.85 \n" ] } ], "source": [ "using Distributions \n", "μ = 10\n", "σ = 10\n", "t = 15\n", "s = (t-μ)/σ\n", "d = Normal(0.0,1.0)\n", "i = pdf(d,s)/(1-cdf(d,s))\n", "meanTruncatedNormal = μ + σ*i\n", "variTruncatedNormal = σ*σ*(1 - i*(i-s))\n", "@printf \"mean = %8.2f \\n\" meanTruncatedNormal\n", "@printf \"variance = %8.2f \\n\" variTruncatedNormal" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Monte-Carlo Approach:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [], "source": [ "μ = 10\n", "σ = 10\n", "z = rand(Normal(μ,σ),100000);" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "MC mean = 21.38 \n", "MC variance = 26.79 \n" ] } ], "source": [ "mcmcMean = mean(z[z.>t])\n", "mcmcVar = var(z[z.>t])\n", "@printf \"MC mean = %8.2f \\n\" mcmcMean\n", "@printf \"MC variance = %8.2f \\n\" mcmcVar" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bivariate Normal Example\n", "\n", "Let $\\mathbf(Y) \\sim N(\\mathbf{\\mu},\\mathbf{V})$\n", "\n", "$\n", "\\mathbf{\\mu} = \n", "\\begin{bmatrix}\n", "10\\\\\n", "20\n", "\\end{bmatrix},\n", "$\n", "$\n", "\\mathbf{V} = \n", "\\begin{bmatrix}\n", "100 & 50\\\\\n", "50 & 200\n", "\\end{bmatrix}\n", "$\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "10000x2 Array{Float64,2}:\n", " 11.9285 19.1144 \n", " 19.2509 36.145 \n", " 4.45011 23.9638 \n", " 20.9049 36.2345 \n", " 14.7587 29.454 \n", " 7.85305 29.5753 \n", " 13.3294 25.8031 \n", " 17.942 21.6667 \n", " 2.00279 20.1188 \n", " 18.2081 29.531 \n", " 22.5693 9.33006\n", " -6.57064 1.51851\n", " 8.77239 28.1197 \n", " ⋮ \n", " 18.5498 51.7603 \n", " 33.5773 27.9552 \n", " 18.3895 26.4949 \n", " 24.3136 46.4041 \n", " 6.92297 6.27365\n", " 18.2313 22.6488 \n", " -8.52665 28.2996 \n", " 20.986 16.3011 \n", " -0.377664 24.9399 \n", " 24.0369 7.16565\n", " 12.1059 21.147 \n", " 7.17966 23.7316 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "μ = [10.0;20.0]\n", "V = [100.0 50.0\n", " 50.0 200.0]\n", "d = MvNormal(μ,V)\n", "XY = rand(d,10000)'" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "10000-element BitArray{1}:\n", " true\n", " true\n", " false\n", " true\n", " true\n", " false\n", " true\n", " true\n", " false\n", " true\n", " true\n", " false\n", " false\n", " ⋮\n", " true\n", " true\n", " true\n", " true\n", " false\n", " true\n", " false\n", " true\n", " false\n", " true\n", " true\n", " false" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sel = XY[:,1].>10" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "5007-element Array{Float64,1}:\n", " 19.1144 \n", " 36.145 \n", " 36.2345 \n", " 29.454 \n", " 25.8031 \n", " 21.6667 \n", " 29.531 \n", " 9.33006\n", " 25.2757 \n", " 52.031 \n", " 34.1789 \n", " 58.6182 \n", " 10.67 \n", " ⋮ \n", " 23.9181 \n", " 36.876 \n", " 25.7362 \n", " 17.0781 \n", " 51.7603 \n", " 27.9552 \n", " 26.4949 \n", " 46.4041 \n", " 22.6488 \n", " 16.3011 \n", " 7.16565\n", " 21.147 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "selY = XY[sel,2]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "39.060176445384066" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mean(selY[selY.>30])" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "53.619263788780245" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "var(selY[selY.>30])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Markov Chain Monte-Carlo Methods\n", "================================\n", "\n", "- Often no closed form for\n", " $f(\\mathbf{\\theta}|\\mathbf{y})$\n", "\n", "- Further, even if computing\n", " $f(\\theta|{\\mathbf y})$\n", " is feasible, obtaining\n", " $f(\\theta_{i}|{\\mathbf y})$ would require\n", " integrating over many dimensions\n", "\n", "- Thus, in many situations, inferences are made using the empirical\n", " posterior constructed by drawing samples from\n", " $f({\\mathbf \\theta}|{\\mathbf y})$\n", "\n", "- Gibbs sampler is widely used for drawing samples from posteriors" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Gibbs Sampler\n", "-------------\n", "\n", "- Want to draw samples from $f(x_{1},x_{2},\\ldots,x_{n})$\n", "\n", "- Even though it may be possible to compute\n", " $f(x_{1},x_{2},\\ldots,x_{n})$, it is difficult to draw samples\n", " directly from $f(x_{1},x_{2},\\ldots,x_{n})$\n", "\n", "- Gibbs:\n", "\n", " - Get valid a starting point $\\mathbf{x}^{0}$\n", "\n", " - Draw sample $\\mathbf{x}^{t}$ as:\n", " $$\\begin{matrix}x_{1}^{t} & \\text{from} & f(x_{1}|x_{2}^{t-1},x_{3}^{t-1},\\ldots,x_{n}^{t-1})\\\\\n", " x_{2}^{t} & \\text{from} & f(x_{2}|x_{1}^{t},x_{3}^{t-1},\\ldots,x_{n}^{t-1})\\\\\n", " x_{3}^{t} & \\text{from} & f(x_{3}|x_{1}^{t},x_{2}^{t},\\ldots,x_{n}^{t-1})\\\\\n", " \\vdots & & \\vdots\\\\\n", " x_{n}^{t} & \\text{from} & f(x_{n}|x_{1}^{t},x_{2}^{t},\\ldots,x_{n-1}^{t})\n", " \\end{matrix}$$\n", "\n", "- The sequence\n", " ${\\mathbf x}^{1},{\\mathbf x}^{2},\\ldots,{\\mathbf x}^{n}$\n", " is a Markov chain with stationary distribution\n", " $f(x_{1},x_{2},\\ldots,x_{n})$\n", "\n", "Making Inferences from Markov Chain\n", "-----------------------------------\n", "\n", "Can show that samples obtained from a Markov chain can be\n", "used to draw inferences from $f(x_{1},x_{2},\\ldots,x_{n})$ provided the\n", "chain is:\n", "\n", "- Irreducible: can move from any state $i$ to any other\n", " state $j$\n", "\n", "- Positive recurrent: return time to any state has finite\n", " expectation\n", "\n", "- *Markov Chains*, J. R. Norris (1997)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bivariate Normal Example\n", "\n", "Let $f(\\mathbf{x})$ be a bivariate normal density with\n", " means\n", "$$\n", "\\mu' =\n", "\\begin{bmatrix}\n", " 1 & 2\n", "\\end{bmatrix}\n", "$$\n", "and covariance matrix\n", "$$\n", "\\mathbf{V} =\n", "\\begin{bmatrix}\n", " 1 & 0.5\\\\\n", "0.5& 2.0\n", "\\end{bmatrix}\n", "$$\n", "\n", "\n", "Suppose we do not know how to draw samples from $f(\\mathbf{x})$, but know how\n", "to draw samples from $f(x_i|x_j)$, which is univariate normal with mean:\n", "$$\n", "\\mu_{i.j} = \\mu_i + \\frac{v_{ij}}{v_{jj}}(x_j - \\mu_j)\n", "$$\n", "and variance\n", "$$\n", "v_{i.j} = v_{ii} - \\frac{v^2_{ij}}{v_{jj}}\n", "$$" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1000 1.00 1.96 \n", " 2000 1.02 2.01 \n", " 3000 1.01 2.00 \n", " 4000 1.02 2.02 \n", " 5000 1.01 2.01 \n", " 6000 1.01 2.01 \n", " 7000 1.01 2.02 \n", " 8000 1.00 2.01 \n", " 9000 1.00 2.00 \n", " 10000 1.00 2.00 \n", " 11000 1.00 2.01 \n", " 12000 1.00 2.00 \n", " 13000 1.01 2.00 \n", " 14000 1.01 2.00 \n", " 15000 1.01 2.00 \n", " 16000 1.00 1.99 \n", " 17000 1.00 1.99 \n", " 18000 1.00 1.99 \n", " 19000 1.00 1.99 \n", " 20000 1.00 2.00 \n" ] } ], "source": [ "m = fill(0,2)\n", "nSamples = 20000\n", "m = [1.0, 2.0]\n", "v = [1.0 0.5; 0.5 2.0]\n", "y = fill(0.0,2)\n", "save = fill(0.0,2,nSamples)\n", "sum = fill(0.0,2)\n", "s12 = sqrt( v[1,1] - v[1,2]*v[1,2]/v[2,2])\n", "s21 = sqrt(v[2,2] - v[1,2]*v[1,2]/v[1,1])\n", "m1 = 0\n", "m2 = 0;\n", "for (iter in 1:nSamples)\n", " m12 = m[1] + v[1,2]/v[2,2]*(y[2] - m[2])\n", " y[1] = rand(Normal(m12,s12),1)[1]\n", " m21 = m[2] + v[1,2]/v[1,1]*(y[1] - m[1])\n", " y[2] = rand(Normal(m21,s21),1)[1]\n", " sum += y\n", " save[:,iter] = y \n", " mean = sum/iter\n", " if iter%1000 == 0 \n", " @printf \"%10d %8.2f %8.2f \\n\" iter mean[1] mean[2]\n", " end\n", "end" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2x2 Array{Float64,2}:\n", " 1.00471 0.499934\n", " 0.499934 1.97734 " ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cov(save',)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Metropolis-Hastings Algorithm\n", "\n", "* Sometimes may not be able to draw samples directly from $f(x_i|\\mathbf{x}_{i\\_})$ \n", "\n", "* Convergence of the Gibbs sampler may be too slow\n", "\n", "* Metropolis-Hastings (MH) for sampling from $f(\\mathbf{x})$: \n", "\n", "\t" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "* a candidate sample, $y$, is drawn from a proposal distribution $q(y|x^{t-1})$\n", "\n", "\t$$\n", "\tx^t = \\begin{cases}\n", "\t y & \\text{with probability}\\, \\alpha \\\\\n", " x^{t-1} & \\text{with probability}\\, 1 - \\alpha \\\\ \n", "\t\t \\end{cases}\n", "\t$$\n", "\t\n", "$$ \\alpha = \\min(1,\\frac{f(y)q(x^{t-1}|y)}{f(x^{t-1})q(y|x^{t-1})}) $$\n", " \n", " \n", "* The samples from MH is a Markov chain with stationary distribution $f(x)$ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bivariate Normal Example" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 1000 0.93 2.14 \n", " 2000 0.96 2.07 \n", " 3000 0.94 2.06 \n", " 4000 0.96 2.07 \n", " 5000 0.94 2.04 \n", " 6000 0.94 2.02 \n", " 7000 0.93 2.01 \n", " 8000 0.95 2.01 \n", " 9000 0.97 2.00 \n", " 10000 0.97 1.98 \n" ] } ], "source": [ "nSamples = 10000\n", "m = [1.0, 2.0]\n", "v = [1.0 0.5; 0.5 2.0]\n", "vi = inv(v)\n", "y = fill(0.0,2)\n", "sum = fill(0.0,2)\n", "\n", "m1 = 0\n", "m2 = 0\n", "xx = 0\n", "y1 = 0\n", "delta = 1.0\n", "min1 = -delta*sqrt(v[1,1])\n", "max1 = +delta*sqrt(v[1,1])\n", "min2 = -delta*sqrt(v[2,2])\n", "max2 = +delta*sqrt(v[2,2])\n", "z = y-m\n", "denOld = exp(-0.5*z'*vi*z)\n", "d1 = Uniform(min1,max1)\n", "d2 = Uniform(min2,max2)\n", "ynew = fill(0.0,2);\n", "for (iter in 1:nSamples)\n", " ynew[1] = y[1] + rand(d1,1)[1]\n", " ynew[2] = y[2] + rand(d2,1)[1]\n", " denNew = exp(-0.5*(ynew-m)'*vi*(ynew-m));\n", " alpha = denNew/denOld;\n", " u = rand()\n", " if (u < alpha[1]) \n", " y = copy(ynew)\n", " \t\tdenOld = exp(-0.5*(y-m)'*vi*(y-m)) \n", " end\n", " sum += y\n", " mean = sum/iter\n", " if iter%1000 == 0 \n", " @printf \"%10d %8.2f %8.2f \\n\" iter mean[1] mean[2]\n", " end\n", "end" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.4.0", "language": "julia", "name": "julia-0.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.4.0" } }, "nbformat": 4, "nbformat_minor": 0 }