{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Lecture 17: Moment Generating Functions (MGFs), hybrid Bayes' rule, Laplace's rule of succession\n",
    "\n",
    "\n",
    "## Stat 110, Prof. Joe Blitzstein, Harvard University\n",
    "\n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## $\\operatorname{Expo}(\\lambda)$ and the  Memorylessness Property\n",
    "\n",
    "#### Theorem: If $X$ is a positive, continuous r.v. with the memorylessness property, then $X \\sim \\operatorname{Expo}(\\lambda)$ for some $\\lambda$.\n",
    "\n",
    "Let $F$ be the CDF of $X$, $G(x) = P(X \\ge x) = 1 - F(x)$.\n",
    "\n",
    "With the memoryless property, $G(s + t) = G(s) \\, G(t)$.\n",
    "\n",
    "This time, rather than trying to solve for $s$ or $t$, we are going to solve for the function $G$, in order to show that it is only the exponential function that has the memorylessness property (in the continuous case).\n",
    "\n",
    "\\begin{align}\n",
    "  & \\text{let } s = t & \\quad \\\\\n",
    "  & \\Rightarrow & G(2t) &= G(t + t) = G(t) \\, G(t) = G(t)^{2} & \\quad \\\\\n",
    "  & & G(3t) &= G(2t) \\, G(t) = G(t)^{2} \\, G(t) = G(t)^{3} & \\quad \\\\\n",
    "  & &\\dots & \\quad \\\\\n",
    "  & & G(kt) &= G(t)^{k} & \\quad  \\\\\n",
    "  \\\\\n",
    "  & \\text{case where } k = \\frac{1}{n} & \\quad \\\\\n",
    "  & \\Rightarrow & G\\left(2 \\, \\frac{t}{2}\\right) &= G\\left(\\frac{t}{2}\\right)^{2} \\text{ so } G\\left(\\frac{t}{2}\\right) = \\sqrt{G(t)} = G(t)^{1/2} & \\quad \\\\\n",
    "  & & G\\left(\\frac{t}{3}\\right) &=  G(t)^{1/3} & \\quad \\\\ \n",
    "  & &\\dots & \\quad \\\\\n",
    "  & & G\\left(\\frac{t}{k}\\right) &= G(t)^{1/k} & \\quad \\\\\n",
    "  \\\\\n",
    "  & \\text{case where } k = \\frac{m}{n} & \\quad \\\\\n",
    "  & \\Rightarrow & G\\left(\\frac{m}{n} \\, t \\right) &= G(t)^{m/n} & \\quad \\\\\n",
    "  \\\\\n",
    "  & \\text{let } x \\in \\mathbb{Q} & \\quad \\\\\n",
    "  & \\Rightarrow & G(x \\, t) &= G(t)^{x} ~~~~ \\text{ for all } x \\ge 0\\\\\n",
    "  \\\\\n",
    "  \\\\\n",
    "  & \\text{now let } t =1 & \\quad \\\\\n",
    "  & \\Rightarrow & G(x) &= G(1)^{x} & \\quad \\\\ \n",
    "  &             &      &= e^{x \\, ln \\, G(1)} ~~~~ \\text{ where } ln \\, G(1) \\text{ is some negative real number } \\\\\n",
    "  &             &      &= e^{-\\lambda x} & \\quad \\blacksquare \\\\\n",
    "\\end{align}\n",
    "\n",
    "And so now we see that in the continuous case, $\\operatorname{Expo}(\\lambda)$ is the only distribution with the memorylessness property."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Moment Generating Function (MGF)\n",
    "\n",
    "Moment generating functions are an alternative way to describe a distribution.\n",
    "\n",
    "#### Definition\n",
    "\n",
    "A random variable $X$ has MGF $M(t) = \\mathbb{E}(e^{tX})$, as a function of $t$, if this is finite on some $(-a, a)$ where $a > 0$.\n",
    "\n",
    "Note that any function of a random variable _is itself a random variable_, so it makes some sense that we can obtain the expected value $\\mathbb{E}(e^{tX})$ \n",
    "\n",
    "But why is this called _moment-generating_?\n",
    "\n",
    "\\begin{align}\n",
    "  \\mathbb{E}(e^{tX}) &= \\mathbb{E}\\left(\\sum_{n=0}^{\\infty} \\frac{X^{n} \\, t^{n}}{n!} \\right) &\\quad \\text{Taylor expand e} \\\\\n",
    "  &= \\sum_{n=0}^{\\infty} \\left( \\frac{\\mathbb{E}(X^{n}) \\, t^{n}}{n!}\\right) &\\quad \\text{ where } \\mathbb{E}(X^{n}) \\text{ is called the } n^{th} \\text{ moment} \\\\\n",
    "\\end{align}\n",
    "\n",
    "#### Moments\n",
    "\n",
    "* the average value for a random variable $X$ $\\mathbb{E}(X)$ is known as the _first moment_\n",
    "* the _second moment_ of $X$ is $\\mathbb{E}(X^{2})$ which helps use derive $\\operatorname{Var}(X)$\n",
    "* higher moments are easily generated (derived), as well\n",
    "\n",
    "### 3 reasons why MGF is important\n",
    "\n",
    "Let $X$ have MGF $M(t)$.\n",
    "\n",
    "1. The $n$-th moment $\\mathbb{E}(X^{n})$ is the coefficient of $\\frac{t^{n}}{n!}$  in the Taylor series of $M$, i.e., $M^{n}(0) = \\mathbb{E}(X^{n})$\n",
    "1. MGF determines the distribution, i.e., if $X$ and $Y$ have the same MGF, then they have the same CDF\n",
    "1. sums of random variables (convolutions) are difficult; but if we have MGFs, they are _easy_\n",
    "\n",
    "### Sums of MGFs\n",
    "\n",
    "If we have _independent_ r.v. $X$ and $Y$, and we know their respective moment generating functions, then we can easily find the moment generating function for $X + Y$\n",
    "\n",
    "\\begin{align}\n",
    "  M(X + Y) &= \\mathbb{E}(e^{t(X+Y)}) \\\\\n",
    "           &= \\mathbb{E}(e^{tX}) \\, \\mathbb{E}(e^{tY}) &\\quad \\text{ by independence}\n",
    "\\end{align}\n",
    "\n",
    "\n",
    "### MGF for $Bern(p)$\n",
    "\n",
    "Given $X \\sim \\operatorname{Bern}(p)$, we obtain the MGF with\n",
    "\n",
    "\\begin{align}\n",
    "  M(t) &= \\mathbb{E}(e^{tX}) \\\\\n",
    "       &= p \\, e^t * q &\\quad \\text{ where } q = 1-p\n",
    "\\end{align}\n",
    "\n",
    "### MGF for $\\operatorname{Bin}(p)$\n",
    "\n",
    "Given $X \\sim Bin(n,p)$, we obtain the MGF with\n",
    "\n",
    "\\begin{align}\n",
    "  M(t) &= \\mathbb{E}(e^{tX}) \\\\\n",
    "       &= \\left( p \\, e^t + q \\right)^n &\\quad \\text{ by applying } G(kt) = G(t)^{k} \n",
    "\\end{align}\n",
    "\n",
    "\n",
    "### MGF for standard normal $Z \\sim \\mathcal{N}(0,1)$\n",
    "\n",
    "Given standard normal $Z \\sim \\mathcal{N}(0,1)$, we obtain the MGF with\n",
    "\n",
    "\\begin{align}\n",
    "  M(t) &= \\frac{1}{\\sqrt{2\\pi}} \\int_{-\\infty}^{\\infty} e^{tZ - Z^2/2} \\, dZ \\\\\n",
    "       &= \\frac{1}{\\sqrt{2\\pi}} ~~ e^{t^2/2} \\int_{-\\infty}^{\\infty} e^{-\\frac{1}{2}\\,(Z-t)^2} \\, dZ &\\quad \\text{ completing the square} \\\\\n",
    "       &= \\frac{1}{\\sqrt{2\\pi}} ~~ e^{t^2/2} ~~ \\sqrt{2\\pi} &\\quad \\text{ recall the PDF of standard normal (Lec. 13)}  \\\\\n",
    "       &= e^{t^2/2}\n",
    "\\end{align}\n",
    "\n",
    "\\*And just in case you've forgotten how to [complete the square](https://www.youtube.com/watch?v=bclm1tJB-3g)...\n",
    "\n",
    "\n",
    "### MGF for normal $X \\sim \\mathcal{N}(\\mu, \\sigma^2)$\n",
    "\n",
    "\\begin{align}\n",
    "  M(t) &= \\mathbb{E}(e^{tX}) \\\\\n",
    "       &= \\int_{-\\infty}^{\\infty} e^{tx} \\, \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2} \\left(\\frac{x - \\mu}{\\sigma}\\right)^2} \\, dx \\\\\n",
    "       &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{\\left( x^2 - 2x\\mu + \\mu^2 \\right)}{2\\sigma^2} + tx} \\, dx \\\\\n",
    "       &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{x^2 - 2x\\mu - 2\\sigma^{2}tx + \\mu^2}{2\\sigma^2}} \\, dx \\\\\n",
    "       &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2\\sigma^2}x^2 - 2x(\\mu + \\sigma^{2}t) + \\mu^2} \\, dx \\\\\n",
    "       &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2\\sigma^2} (x - (\\mu + \\sigma^{2}t))^2 - (\\mu + \\sigma^{2}t)^2 + \\mu^2} \\, dx \\\\\n",
    "       &= e^{-\\frac{1}{2\\sigma^2} - (\\mu + \\sigma^{2}t)^2 + \\mu^2} \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2} \\left( \\frac{(x - (\\mu + \\sigma^{2}t))}{\\sigma} \\right)^2}  \\, dx \\\\\n",
    "       &= e^{-\\frac{1}{2\\sigma^2} (- \\mu^2 - 2 \\mu \\sigma^{2}t - \\sigma^{4}t^2 + \\mu^2)} \\\\\n",
    "       &= e^{\\frac{2 \\mu \\sigma^{2}t + \\sigma^{4}t^2}{2\\sigma^2} } \\\\\n",
    "       &= e^{\\mu t + \\frac{\\sigma^2 t^2}{2} }\n",
    "\\end{align}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Laplace's Rule of Succession\n",
    "\n",
    "_If we have observed the sun rising for the past $n$ days in succession, then what is the probability that the sun will rise tomorrow?_\n",
    "\n",
    "Given $p$ is the probability that the sun will rise on any given day $X_k$, we can consider a consecutive string of days $X_1, X_2, \\dots \\text{ i.i.d. } \\operatorname{Bern}(p)$ which is conditional on $p$. But for the question above, we do not know what $p$ is. Bayesians treat $p$ as an r.v.\n",
    "\n",
    "### Problem structure\n",
    "\n",
    "* Let $p \\sim \\operatorname{Unif}(0,1)$ be our _prior_; we choose $\\operatorname{Unif}(0,1)$ since $p$ could be _anything_\n",
    "* Let $S_n = X_1 + X_2 + \\cdots + X_n$\n",
    "* So we then assume $S_n | p \\sim \\operatorname{Bin}(n,p) \\text{, } p \\sim \\operatorname{Unif}(0,1)$\n",
    "\n",
    "### Questions\n",
    "\n",
    "1. What is the _posterior_ $p | S_n$?\n",
    "1. What is $P(X_{n+1} | S_n)$, the probability that the sun will rise tomorrow given that we have observed so for the past $n$ days?\n",
    "\n",
    "### Solution\n",
    "\n",
    "We use $f$ as a simple stand-in for the PDF $p$. We start with the general case:\n",
    "\n",
    "\\begin{align}\n",
    "  f( p | S_n=k) &= \\frac{P(S_n=k | p) f(p)}{P(S_n=k)} &\\quad \\text{ from Bayes' Rule} \\\\\n",
    "                &\\propto p^k \\, (1-p)^{n-k}\n",
    "\\end{align}\n",
    "\n",
    "But since\n",
    "* the _prior_ $f(p) = 1$ since it is Uniform.\n",
    "* $P(S_n = k)$ does not depend on $p$\n",
    "* the binomial coefficient of $P(p | S_n=k) = \\binom{n}{k} p^k \\, (1-p)^{n-k}$ also does not depend on $p$, and can be treated as a constant\n",
    "\n",
    "we can consider $f(p | S_n=k) $ with proportionality.\n",
    "\n",
    "Now let's consider the case of our question, where the sun has risen for $n$ days straight:\n",
    "\n",
    "\\begin{align}\n",
    "  \\text{since } f(p) &= \\int_{0}^{1} p^n \\, dp \\\\\n",
    "       &= \\frac{1}{n+1} \n",
    "  \\\\\n",
    "  \\\\\n",
    "  \\text{so } (p | S_n=n) &= \\boxed{(n+1) \\, p^n} &\\quad \\text{ normalizing for a valid PDF}\\\\\n",
    "  \\\\\n",
    "  \\text{and } P(X_{n+1}=1 | S_n=n) &= \\int_{0}^{1} (n+1) \\, p \\, p^n \\, dp &\\quad \\text{ Fundamental Bridge, } \\mathbb{E}(p | S_n=n) \\\\ \n",
    "       &= \\int_{0}^{1} (n+1) \\, p^{n+1} \\, dp \\\\\n",
    "       &= \\boxed{\\frac{n+1}{n+2}}\n",
    "\\end{align}\n",
    "\n",
    "----"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "View [Lecture 17: Moment Generating Functions | Statistics 110](http://bit.ly/2CxVsgR) on YouTube."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}