{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Sampling from a multivariate normal\n", "\n", "To be able to tune the Gibbs sampler for a multivariate normal likelihood and multivariate normal prior, it helps to go back and look at the sampler. Eventually sampling from a multivariate normal works its way down to a function in the `mvnormal.jl` source file for the [Distributions](https://github.com/JuliaStats/Distributions) package.\n", "\n", "That function is a one-liner:\n", "```julia\n", "_rand!(d::MvNormal, x::VecOrMat{Float64}) = add!(unwhiten!(d.Σ, randn!(x)), d.μ)\n", "```\n", "\n", "The underscore at the beginning of the name indicates that you shouldn't expect to call this function directly. The trailing `!` on the function names indicates that these functions are _mutating functions_. That is, they modify one or more of their arguments.\n", "\n", "This is a common idiom in [Julia](http://julialang.org). A function that generates a vector or array is often defined to call a companion function that mutates one of its arguments. The `randn!` function operates on a `VecOrMat{Float64}`, overwriting the argument with i.i.d. standard normal variates. If you check how the vector-returning method of `randn` is defined, you will see that it simply allocates the result and passes it to `randn!`. The point is that, if you want to, you can overwrite an array in each step of an iteration and avoid needless allocation and garbage collection of vectors at every call to a function like `randn`.\n", "\n", "The `MvNormal` type is a parametric type, a subtype of the abstract `AbstractMvNormal`, encompassing many specialized cases." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "using Distributions, PDMats" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "search: " ] }, { "data": { "text/latex": [ "No documentation found.\n", "\\textbf{Summary:}\n", "\\begin{verbatim}\n", "immutable Distributions.MvNormal{Cov<:PDMats.AbstractPDMat{T<:AbstractFloat},Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}} <: Distributions.AbstractMvNormal\n", "\\end{verbatim}\n", "\\textbf{Fields:}\n", "\\begin{verbatim}\n", "μ :: Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}\n", "Σ :: Cov<:PDMats.AbstractPDMat{T<:AbstractFloat}\n", "\\end{verbatim}\n" ], "text/markdown": [ "No documentation found.\n", "\n", "**Summary:**\n", "\n", "```julia\n", "immutable Distributions.MvNormal{Cov<:PDMats.AbstractPDMat{T<:AbstractFloat},Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}} <: Distributions.AbstractMvNormal\n", "```\n", "\n", "**Fields:**\n", "\n", "```julia\n", "μ :: Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}\n", "Σ :: Cov<:PDMats.AbstractPDMat{T<:AbstractFloat}\n", "```\n" ], "text/plain": [ "No documentation found.\n", "\n", "**Summary:**\n", "\n", "```julia\n", "immutable Distributions.MvNormal{Cov<:PDMats.AbstractPDMat{T<:AbstractFloat},Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}} <: Distributions.AbstractMvNormal\n", "```\n", "\n", "**Fields:**\n", "\n", "```julia\n", "μ :: Mean<:Union{Array{Float64,1},Distributions.ZeroVector{Float64}}\n", "Σ :: Cov<:PDMats.AbstractPDMat{T<:AbstractFloat}\n", "```\n" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "MvNormal MvNormalCanon MvNormalKnownCov gmvnormal MvLogNormal\n", "\n" ] } ], "source": [ 
"?MvNormal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is common to define a diffuse multivariate normal prior for unbounded vector parameters like coefficients. These end up being special cases of the `MvNormal` type according to the template arguments." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "ZeroMeanIsoNormal(\n", "dim: 2\n", "μ: [0.0,0.0]\n", "Σ: 2x2 Array{Float64,2}:\n", " 1.0e6 0.0 \n", " 0.0 1.0e6\n", ")\n" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d = MvNormal(2, 1000.)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Distributions.ZeroVector{Float64}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "typeof(d.μ)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "PDMats.ScalMat{Float64}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "typeof(d.Σ)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "search: " ] }, { "data": { "text/latex": [ "No documentation found.\n", "\\textbf{Summary:}\n", "\\begin{verbatim}\n", "immutable PDMats.ScalMat{T<:AbstractFloat} <: PDMats.AbstractPDMat{T<:AbstractFloat}\n", "\\end{verbatim}\n", "\\textbf{Fields:}\n", "\\begin{verbatim}\n", "dim :: Int64\n", "value :: T<:AbstractFloat\n", "inv_value :: T<:AbstractFloat\n", "\\end{verbatim}\n" ], "text/markdown": [ "No documentation found.\n", "\n", "**Summary:**\n", "\n", "```julia\n", "immutable PDMats.ScalMat{T<:AbstractFloat} <: PDMats.AbstractPDMat{T<:AbstractFloat}\n", "```\n", "\n", "**Fields:**\n", "\n", "```julia\n", "dim :: Int64\n", "value :: T<:AbstractFloat\n", "inv_value :: T<:AbstractFloat\n", "```\n" ], "text/plain": [ "No documentation found.\n", "\n", "**Summary:**\n", "\n", "```julia\n", "immutable PDMats.ScalMat{T<:AbstractFloat} <: PDMats.AbstractPDMat{T<:AbstractFloat}\n", "```\n", "\n", "**Fields:**\n", "\n", "```julia\n", "dim :: Int64\n", "value :: T<:AbstractFloat\n", "inv_value :: T<:AbstractFloat\n", "```\n" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "name": "stdout", "output_type": "stream", "text": [ "ScalMat WalleniusNoncentralHypergeometric FisherNoncentralHypergeometric\n", "\n" ] } ], "source": [ "?ScalMat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This type of MvNormal could be of dimension 1,000,000 and would still only require storing these three numbers.\n", "\n", "The next question to ask is what does `unwhiten!` do. Essentially it is the multivariate equivalent of multiplying by $\\sigma$ in the univariate case. If you want to take a univariate standard normal, `z`, and convert it to have a $\\mathcal{N}(\\mu, \\sigma^2)$ distribution you would form $\\mu + \\sigma z$. The \"unwhitening\" here is essentially multiplying by a square root of `\\Sigma`, which is the Cholesky factor.\n", "\n", "If you have a sample from a $\\mathcal{N}(\\mu,\\sigma)$ distribution you convert it to \"white noise\" or \"whiten\" it with the Z transformation. 
The transformation in the other direction is called \"unwhitening\".\n", "\n", "## Creating a Gibbs sampler for coefficients\n", "\n", "The Gibbs sampler for $\beta$ in the [Mamba documentation](http://mambajl.readthedocs.org/en/latest/) uses the standard formulas for the mean and covariance of the full conditional distribution of $\beta$, which is again multivariate normal.\n", "\n", "```julia\n", "Gibbs_beta = Sampler([:beta],\n", "    (beta, s2, xmat, y) ->\n", "        begin\n", "            beta_mean = mean(beta.distr)\n", "            beta_invcov = invcov(beta.distr)\n", "            Sigma = inv(xmat' * xmat / s2 + beta_invcov)\n", "            mu = Sigma * (xmat' * y / s2 + beta_invcov * beta_mean)\n", "            rand(MvNormal(mu, Sigma))\n", "        end\n", ")\n", "```\n", "\n", "The anonymous function introduced by the `->` operator (sometimes called the \"stabby lambda\" syntax) produces the next value of $\beta$. This particular syntax is unusual in that it combines the \"one-liner\" form with a \"begin/end\" block. It would be more common to write this as\n", "\n", "```julia\n", "Gibbs_beta = Sampler([:beta],\n", "    function (beta, s2, xmat, y)\n", "        beta_mean = mean(beta.distr)\n", "        beta_invcov = invcov(beta.distr)\n", "        Sigma = inv(xmat' * xmat / s2 + beta_invcov)\n", "        mu = Sigma * (xmat' * y / s2 + beta_invcov * beta_mean)\n", "        rand(MvNormal(mu, Sigma))\n", "    end\n", ")\n", "```\n", "\n", "Notice that `xmat' * xmat` and `xmat' * y` are recomputed at each iteration. A slightly less obvious inefficiency comes from losing information about the structure of the matrices involved. The matrix `xmat' * xmat` is symmetric and positive semi-definite, but this information is lost." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "5x2 Array{Float64,2}:\n", " 1.0 1.0\n", " 1.0 2.0\n", " 1.0 3.0\n", " 1.0 4.0\n", " 1.0 5.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xmat = hcat(ones(5), collect(1:5))" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2x2 Array{Float64,2}:\n", " 5.0 15.0\n", " 15.0 55.0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xmat' * xmat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We also lose the special structure of the covariance of the prior." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2x2 Array{Float64,2}:\n", " 1.0e-6 0.0 \n", " 0.0 1.0e-6" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "invcov(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "None of these issues is at all important for this example. In larger examples, however, one might want to be more careful and store `X'X` and `X'y` rather than recompute them at every iteration. 
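With the `ZeroMeanIsoNormal` prior used here (zero mean and covariance $\tau^2 I$, so that the `inv_value` field of its `ScalMat` is $1/\tau^2$), and writing $\Sigma_*$ and $\mu_*$ for the conditional covariance and mean, the standard formulas specialize to\n", "\n", "$$\n", "\Sigma_* = \left(\frac{X'X}{\sigma^2} + \frac{I}{\tau^2}\right)^{-1} = \sigma^2 A^{-1}, \qquad A = X'X + \frac{\sigma^2}{\tau^2} I, \qquad \mu_* = \Sigma_* \frac{X'y}{\sigma^2} = A^{-1} X'y .\n", "$$\n", "\n", "If $A = LL'$ is the (lower) Cholesky factorization, then $\sigma^2 A^{-1} = \bigl(\sigma (L')^{-1}\bigr)\bigl(\sigma (L')^{-1}\bigr)'$, so a draw from the full conditional of $\beta$ is $\mu_* + \sigma (L')^{-1} z$ with $z$ a vector of i.i.d. standard normals. 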
In code it would look like this:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "updateβ (generic function with 1 method)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "function updateβ(β, σ², XtX, Xty)\n", "    d = β.distr\n", "    if isa(d, ZeroMeanIsoNormal)\n", "        # prior precision is 1/τ² = d.Σ.inv_value; form A = X'X + (σ²/τ²)I and factor A = L*L'\n", "        cfac = cholfact!(XtX + σ² * d.Σ.inv_value * I, :L)\n", "        # conditional mean A \ X'y plus noise √σ² * (L' \ z); the parentheses keep the\n", "        # scaling by √σ² outside the triangular solve\n", "        return cfac\Xty + √σ² * (cfac[:L]' \ randn(length(β)))\n", "    else\n", "        throw(ArgumentError(\n", "            \"prior for coefficients should be ZeroMeanIsoNormal\"))\n", "    end\n", "end" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.4.3", "language": "julia", "name": "julia-0.4" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.4.3" } }, "nbformat": 4, "nbformat_minor": 0 }