{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "141806bc",
   "metadata": {},
   "source": [
    "# Cheatsheet\n",
    "\n",
    "- You are not allowed to bring books or notes to the exam. Instead, feel free to make use of the following cheatsheet as we will provide this or a similar cheatsheet in an appendix of the exam papers.\n",
    "\n",
    "- Some <a id=matrix-calculus>Matrix Calculus</a>, see also Bishop, appendix C.  \n",
    "$$\\begin{align*}\n",
    "|A^{-1}|&=|A|^{-1} \\\\\n",
    "\\nabla_A \\log |A| &= (A^{T})^{-1} = (A^{-1})^T \\\\\n",
    "\\mathrm{Tr}[ABC]&= \\mathrm{Tr}[CAB] = \\mathrm{Tr}[BCA]  \\\\\n",
    "\\nabla_A \\mathrm{Tr}[AB] &=\\nabla_A \\mathrm{Tr}[BA]= B^T  \\\\\n",
    "\\nabla_A \\mathrm{Tr}[ABA^T] &= A(B+B^T)\\\\\n",
    " \\nabla_x x^TAx &= (A+A^T)x\\\\\n",
    "\\nabla_X a^TXb &= \\nabla_X \\mathrm{Tr}[ba^TX] = ab^T\n",
    "\\end{align*}$$\n",
    "\n",
    "- Definition of the Multivariate Gaussian Distribution (MVG)\n",
    "$$\n",
    "\\mathcal{N}(x|\\,\\mu,\\Sigma) = |2 \\pi \\Sigma|^{-\\frac{1}{2}} \\exp\\left\\{-\\frac{1}{2}(x-\\mu)^T\n",
    "\\Sigma^{-1} (x-\\mu) \\right\\}\n",
    "$$\n",
    "\n",
    "- A **linear transformation** $z=Ax+b$ of a Gaussian variable $\\mathcal{N}(x|\\mu,\\Sigma)$ is Gaussian distributed as\n",
    "$$\n",
    "p(z) = \\mathcal{N} \\left(z \\,|\\, A\\mu+b, A\\Sigma A^T \\right) \n",
    "$$\n",
    "\n",
    "- **Multiplication** of 2 Gaussian distributions\n",
    "$$ \n",
    " \\mathcal{N}(x|\\mu_a,\\Sigma_a) \\cdot  \\mathcal{N}(x|\\mu_b,\\Sigma_b) = \\alpha \\cdot \\mathcal{N}(x|\\mu_c,\\Sigma_c)\n",
    "$$\n",
    "with\n",
    "$$\\begin{align*}\n",
    "\\Sigma_c^{-1} &= \\Sigma_a^{-1} + \\Sigma_b^{-1} \\\\\n",
    "\\Sigma_c^{-1}\\mu_c &= \\Sigma_a^{-1}\\mu_a + \\Sigma_b^{-1}\\mu_b \\\\\n",
    "\\alpha &= \\mathcal{N}(\\mu_a | \\mu_b, \\Sigma_a + \\Sigma_b)\n",
    "\\end{align*}$$\n",
    "\n",
    "- **Conditioning** and **marginalization** of Gaussians. Let $z = \\begin{bmatrix} x \\\\ y \\end{bmatrix}$ be jointly normal distributed as\n",
    "$$\\begin{align*}\n",
    "p(z) &= \\mathcal{N}(z | \\mu, \\Sigma) \n",
    "  =\\mathcal{N} \\left( \\begin{bmatrix} x \\\\ y \\end{bmatrix} \\,\\left|\\, \\begin{bmatrix} \\mu_x \\\\ \\mu_y \\end{bmatrix}, \n",
    "  \\begin{bmatrix} \\Sigma_x & \\Sigma_{xy} \\\\ \\Sigma_{yx} & \\Sigma_y \\end{bmatrix} \\right. \\right)\\,,\n",
    "\\end{align*}$$\n",
    "then $p(z) = p(y|x)\\cdot p(x)$, with \n",
    "$$\\begin{align*}\n",
    "p(y|x) &= \\mathcal{N}\\left(y\\,|\\,\\mu_y + \\Sigma_{yx}\\Sigma_x^{-1}(x-\\mu_x),\\, \\Sigma_y - \\Sigma_{yx}\\Sigma_x^{-1}\\Sigma_{xy} \\right) \\\\\n",
    "p(x) &= \\mathcal{N}\\left( x\\,|\\,\\mu_x, \\Sigma_x \\right)\n",
    "\\end{align*}$$\n",
    "\n",
    "- For a binary variable $x \\in \\{0,1\\}$, the **Bernoulli** distribution is given by\n",
    "$$ \n",
    "p(x|\\mu) = \\mu^{x}(1-\\mu)^{1-x}\n",
    "$$\n",
    "\n",
    "- The conjugate prior for $\\mu$ is the **Beta** distribution, given by\n",
    "$$\n",
    "p(\\mu) = \\mathcal{B}(\\mu|\\alpha,\\beta) = \\frac{\\Gamma(\\alpha+\\beta)}{\\Gamma(\\alpha)\\Gamma(\\beta)} \\mu^{\\alpha-1}(1-\\mu)^{\\beta-1}\n",
    "$$\n",
    "where $\\alpha$ and $\\beta$ are \"hyperparameters\" that you can set to reflect your prior beliefs about $\\mu$. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7b4e7ed5",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.5.2",
   "language": "julia",
   "name": "julia-1.5"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}