{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 17: Moment Generating Functions (MGFs), hybrid Bayes' rule, Laplace's rule of succession\n", "\n", "\n", "## Stat 110, Prof. Joe Blitzstein, Harvard University\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## $\\operatorname{Expo}(\\lambda)$ and the Memorylessness Property\n", "\n", "#### Theorem: If $X$ is a positive, continuous r.v. with the memorylessness property, then $X \\sim \\operatorname{Expo}(\\lambda)$ for some $\\lambda$.\n", "\n", "Let $F$ be the CDF of $X$, $G(x) = P(X \\ge x) = 1 - F(x)$.\n", "\n", "With the memoryless property, $G(s + t) = G(s) \\, G(t)$.\n", "\n", "This time, rather than trying to solve for $s$ or $t$, we are going to solve for the function $G$, in order to show that it is only the exponential function that has the memorylessness property (in the continuous case).\n", "\n", "\\begin{align}\n", " & \\text{let } s = t & \\quad \\\\\n", " & \\Rightarrow & G(2t) &= G(t + t) = G(t) \\, G(t) = G(t)^{2} & \\quad \\\\\n", " & & G(3t) &= G(2t) \\, G(t) = G(t)^{2} \\, G(t) = G(t)^{3} & \\quad \\\\\n", " & &\\dots & \\quad \\\\\n", " & & G(kt) &= G(t)^{k} & \\quad \\\\\n", " \\\\\n", " & \\text{case where } k = \\frac{1}{n} & \\quad \\\\\n", " & \\Rightarrow & G\\left(2 \\, \\frac{t}{2}\\right) &= G\\left(\\frac{t}{2}\\right)^{2} \\text{ so } G\\left(\\frac{t}{2}\\right) = \\sqrt{G(t)} = G(t)^{1/2} & \\quad \\\\\n", " & & G\\left(\\frac{t}{3}\\right) &= G(t)^{1/3} & \\quad \\\\ \n", " & &\\dots & \\quad \\\\\n", " & & G\\left(\\frac{t}{k}\\right) &= G(t)^{1/k} & \\quad \\\\\n", " \\\\\n", " & \\text{case where } k = \\frac{m}{n} & \\quad \\\\\n", " & \\Rightarrow & G\\left(\\frac{m}{n} \\, t \\right) &= G(t)^{m/n} & \\quad \\\\\n", " \\\\\n", " & \\text{let } x \\in \\mathbb{Q} & \\quad \\\\\n", " & \\Rightarrow & G(x \\, t) &= G(t)^{x} ~~~~ \\text{ for all } x \\ge 0\\\\\n", " \\\\\n", " \\\\\n", " & \\text{now let } t =1 & \\quad \\\\\n", " & \\Rightarrow & G(x) &= G(1)^{x} & \\quad \\\\ \n", " & & &= e^{x \\, ln \\, G(1)} ~~~~ \\text{ where } ln \\, G(1) \\text{ is some negative real number } \\\\\n", " & & &= e^{-\\lambda x} & \\quad \\blacksquare \\\\\n", "\\end{align}\n", "\n", "And so now we see that in the continuous case, $\\operatorname{Expo}(\\lambda)$ is the only distribution with the memorylessness property." 
{ "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [
 "## Moment Generating Function (MGF)\n",
 "\n",
 "Moment generating functions are an alternative way to describe a distribution.\n",
 "\n",
 "#### Definition\n",
 "\n",
 "A random variable $X$ has MGF $M(t) = \\mathbb{E}(e^{tX})$, as a function of $t$, if this is finite on some interval $(-a, a)$ where $a > 0$.\n",
 "\n",
 "Note that any function of a random variable _is itself a random variable_, so it makes sense that we can take the expected value $\\mathbb{E}(e^{tX})$.\n",
 "\n",
 "But why is this called _moment-generating_?\n",
 "\n",
 "\\begin{align}\n",
 "    \\mathbb{E}(e^{tX}) &= \\mathbb{E}\\left(\\sum_{n=0}^{\\infty} \\frac{X^{n} \\, t^{n}}{n!} \\right) &\\quad \\text{Taylor expansion of } e^{tX} \\\\\n",
 "    &= \\sum_{n=0}^{\\infty} \\frac{\\mathbb{E}(X^{n}) \\, t^{n}}{n!} &\\quad \\text{swapping } \\mathbb{E} \\text{ and the sum, where } \\mathbb{E}(X^{n}) \\text{ is called the } n^{th} \\text{ moment} \\\\\n",
 "\\end{align}\n",
 "\n",
 "#### Moments\n",
 "\n",
 "* the average value of a random variable $X$, $\\mathbb{E}(X)$, is known as the _first moment_\n",
 "* the _second moment_ of $X$ is $\\mathbb{E}(X^{2})$, which helps us derive $\\operatorname{Var}(X)$\n",
 "* higher moments are generated (derived) just as easily\n",
 "\n",
 "### 3 reasons why the MGF is important\n",
 "\n",
 "Let $X$ have MGF $M(t)$.\n",
 "\n",
 "1. The $n^{th}$ moment $\\mathbb{E}(X^{n})$ is the coefficient of $\\frac{t^{n}}{n!}$ in the Taylor series of $M$, i.e., $M^{(n)}(0) = \\mathbb{E}(X^{n})$, where $M^{(n)}$ is the $n^{th}$ derivative of $M$\n",
 "1. the MGF determines the distribution, i.e., if $X$ and $Y$ have the same MGF, then they have the same CDF\n",
 "1. finding the distribution of a sum of independent random variables (a convolution) is difficult in general; but if we have MGFs, it is _easy_\n",
 "\n",
 "### MGFs of sums\n",
 "\n",
 "If $X$ and $Y$ are _independent_ r.v.s and we know their respective moment generating functions, then we can easily find the moment generating function of $X + Y$:\n",
 "\n",
 "\\begin{align}\n",
 "    M_{X+Y}(t) &= \\mathbb{E}(e^{t(X+Y)}) \\\\\n",
 "    &= \\mathbb{E}(e^{tX}) \\, \\mathbb{E}(e^{tY}) &\\quad \\text{ by independence} \\\\\n",
 "    &= M_{X}(t) \\, M_{Y}(t)\n",
 "\\end{align}\n",
 "\n",
 "\n",
 "### MGF for $\\operatorname{Bern}(p)$\n",
 "\n",
 "Given $X \\sim \\operatorname{Bern}(p)$, we obtain the MGF with\n",
 "\n",
 "\\begin{align}\n",
 "    M(t) &= \\mathbb{E}(e^{tX}) \\\\\n",
 "    &= p \\, e^t + q &\\quad \\text{ where } q = 1-p \\text{, since } e^{tX} \\text{ equals } e^{t} \\text{ with probability } p \\text{ and } 1 \\text{ with probability } q\n",
 "\\end{align}\n",
 "\n",
 "### MGF for $\\operatorname{Bin}(n,p)$\n",
 "\n",
 "Given $X \\sim \\operatorname{Bin}(n,p)$, we obtain the MGF with\n",
 "\n",
 "\\begin{align}\n",
 "    M(t) &= \\mathbb{E}(e^{tX}) \\\\\n",
 "    &= \\left( p \\, e^t + q \\right)^n &\\quad \\text{ since } X \\text{ is a sum of } n \\text{ i.i.d. } \\operatorname{Bern}(p) \\text{ r.v.s, whose MGFs multiply}\n",
 "\\end{align}\n",
 "\n",
 "\n",
 "### MGF for standard normal $Z \\sim \\mathcal{N}(0,1)$\n",
 "\n",
 "Given standard normal $Z \\sim \\mathcal{N}(0,1)$, we obtain the MGF with\n",
 "\n",
 "\\begin{align}\n",
 "    M(t) &= \\mathbb{E}(e^{tZ}) = \\frac{1}{\\sqrt{2\\pi}} \\int_{-\\infty}^{\\infty} e^{tz - z^2/2} \\, dz \\\\\n",
 "    &= \\frac{1}{\\sqrt{2\\pi}} ~~ e^{t^2/2} \\int_{-\\infty}^{\\infty} e^{-\\frac{1}{2}\\,(z-t)^2} \\, dz &\\quad \\text{ completing the square} \\\\\n",
 "    &= \\frac{1}{\\sqrt{2\\pi}} ~~ e^{t^2/2} ~~ \\sqrt{2\\pi} &\\quad \\text{ recall the PDF of the standard normal (Lec. 13)} \\\\\n",
 "    &= e^{t^2/2}\n",
 "\\end{align}"
] },
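{ "cell_type": "markdown", "metadata": {}, "source": [
 "Before moving on, a quick symbolic check of reason 1 and the $\\operatorname{Bin}(n,p)$ MGF above (a sketch, not from the lecture; it assumes SymPy is available): differentiating $M(t) = (p \\, e^t + q)^n$ at $t = 0$ should generate the moments."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "import sympy as sp\n",
 "\n",
 "# Minimal sketch: generate moments of Bin(n, p) by differentiating its MGF at t = 0.\n",
 "t, n, p = sp.symbols('t n p', positive=True)\n",
 "q = 1 - p\n",
 "\n",
 "M = (p * sp.exp(t) + q) ** n      # Binomial MGF, M(t) = (p e^t + q)^n\n",
 "\n",
 "m1 = sp.diff(M, t).subs(t, 0)     # first moment, E(X)\n",
 "m2 = sp.diff(M, t, 2).subs(t, 0)  # second moment, E(X^2)\n",
 "\n",
 "print(sp.simplify(m1))            # E(X) = n*p\n",
 "print(sp.simplify(m2 - m1**2))    # Var(X) = n*p*(1 - p), i.e. npq"
] },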
{ "cell_type": "markdown", "metadata": {}, "source": [
 "\\*And just in case you've forgotten how to [complete the square](https://www.youtube.com/watch?v=bclm1tJB-3g)...\n",
 "\n",
 "\n",
 "### MGF for normal $X \\sim \\mathcal{N}(\\mu, \\sigma^2)$\n",
 "\n",
 "\\begin{align}\n",
 "    M(t) &= \\mathbb{E}(e^{tX}) \\\\\n",
 "    &= \\int_{-\\infty}^{\\infty} e^{tx} \\, \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2} \\left(\\frac{x - \\mu}{\\sigma}\\right)^2} \\, dx \\\\\n",
 "    &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{x^2 - 2x\\mu + \\mu^2}{2\\sigma^2} + tx} \\, dx \\\\\n",
 "    &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{x^2 - 2x\\mu - 2\\sigma^{2}tx + \\mu^2}{2\\sigma^2}} \\, dx \\\\\n",
 "    &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2\\sigma^2} \\left( x^2 - 2x(\\mu + \\sigma^{2}t) + \\mu^2 \\right)} \\, dx \\\\\n",
 "    &= \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2\\sigma^2} \\left( (x - (\\mu + \\sigma^{2}t))^2 - (\\mu + \\sigma^{2}t)^2 + \\mu^2 \\right)} \\, dx &\\quad \\text{ completing the square} \\\\\n",
 "    &= e^{-\\frac{1}{2\\sigma^2} \\left( \\mu^2 - (\\mu + \\sigma^{2}t)^2 \\right)} \\int_{-\\infty}^{\\infty} \\frac{1}{\\sigma \\sqrt{2\\pi}} e^{-\\frac{1}{2} \\left( \\frac{x - (\\mu + \\sigma^{2}t)}{\\sigma} \\right)^2} \\, dx &\\quad \\text{ the integrand is the } \\mathcal{N}(\\mu + \\sigma^{2}t, \\sigma^2) \\text{ PDF, so the integral is } 1 \\\\\n",
 "    &= e^{-\\frac{1}{2\\sigma^2} (\\mu^2 - \\mu^2 - 2 \\mu \\sigma^{2}t - \\sigma^{4}t^2)} \\\\\n",
 "    &= e^{\\frac{2 \\mu \\sigma^{2}t + \\sigma^{4}t^2}{2\\sigma^2} } \\\\\n",
 "    &= e^{\\mu t + \\frac{\\sigma^2 t^2}{2} }\n",
 "\\end{align}"
] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [
 "----"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
 "## Laplace's Rule of Succession\n",
 "\n",
 "_If we have observed the sun rising for the past $n$ days in succession, then what is the probability that the sun will rise tomorrow?_\n",
 "\n",
 "Let $p$ be the probability that the sun rises on any given day, and let $X_k$ indicate whether it rises on day $k$; given $p$, the days $X_1, X_2, \\dots$ are i.i.d. $\\operatorname{Bern}(p)$. But for the question above, we do not know what $p$ is. Bayesians treat $p$ as an r.v.\n",
 "\n",
 "### Problem structure\n",
 "\n",
 "* Let $p \\sim \\operatorname{Unif}(0,1)$ be our _prior_; we choose $\\operatorname{Unif}(0,1)$ since $p$ could be _anything_\n",
 "* Let $S_n = X_1 + X_2 + \\cdots + X_n$\n",
 "* So we then assume $S_n | p \\sim \\operatorname{Bin}(n,p)$, with $p \\sim \\operatorname{Unif}(0,1)$\n",
 "\n",
 "### Questions\n",
 "\n",
 "1. What is the _posterior_ $p | S_n$?\n",
 "1. What is $P(X_{n+1} = 1 | S_n = n)$, the probability that the sun will rise tomorrow given that it has risen for the past $n$ days?\n",
 "\n",
 "### Solution\n",
 "\n",
 "We use $f$ as a simple stand-in for the PDF of $p$. We start with the general case:\n",
 "\n",
 "\\begin{align}\n",
 "    f(p | S_n=k) &= \\frac{P(S_n=k | p) \\, f(p)}{P(S_n=k)} &\\quad \\text{ hybrid Bayes' rule: } S_n \\text{ is discrete, } p \\text{ is continuous} \\\\\n",
 "    &\\propto p^k \\, (1-p)^{n-k}\n",
 "\\end{align}\n",
 "\n",
 "But since\n",
 "\n",
 "* the _prior_ is $f(p) = 1$, because it is Uniform\n",
 "* $P(S_n = k)$ does not depend on $p$\n",
 "* the binomial coefficient in $P(S_n=k | p) = \\binom{n}{k} p^k \\, (1-p)^{n-k}$ also does not depend on $p$, and can be treated as a constant\n",
 "\n",
 "we can work with $f(p | S_n=k)$ up to proportionality, tracking only the $p$-dependence.\n",
 "\n",
 "Now let's consider the case of our question, where the sun has risen for $n$ days straight, so $k = n$ and the posterior is proportional to $p^n$:\n",
 "\n",
 "\\begin{align}\n",
 "    \\text{since } \\int_{0}^{1} p^n \\, dp &= \\frac{1}{n+1} \n",
 "    \\\\\n",
 "    \\\\\n",
 "    \\text{we get } f(p | S_n=n) &= \\boxed{(n+1) \\, p^n} &\\quad \\text{ normalizing } p^n \\text{ to a valid PDF}\\\\\n",
 "    \\\\\n",
 "    \\text{and } P(X_{n+1}=1 | S_n=n) &= \\int_{0}^{1} p \\, (n+1) \\, p^n \\, dp &\\quad \\text{ Fundamental Bridge: this is } \\mathbb{E}(p | S_n=n) \\\\ \n",
 "    &= \\int_{0}^{1} (n+1) \\, p^{n+1} \\, dp \\\\\n",
 "    &= \\boxed{\\frac{n+1}{n+2}}\n",
 "\\end{align}"
] },
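{ "cell_type": "markdown", "metadata": {}, "source": [
 "A quick Monte Carlo check of the $\\frac{n+1}{n+2}$ answer (a sketch, not from the lecture; it assumes NumPy is available, and `n`, the seed, and the number of trials are arbitrary illustrative values): draw $p$ from the prior, simulate $n+1$ days, and condition on the first $n$ being sunny."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "import numpy as np\n",
 "\n",
 "# Minimal sketch: estimate P(X_{n+1} = 1 | S_n = n) and compare with (n+1)/(n+2).\n",
 "# n, the seed, and the number of trials are arbitrary illustrative values.\n",
 "rng = np.random.default_rng(17)\n",
 "n, trials = 5, 10**6\n",
 "\n",
 "p = rng.uniform(size=trials)                        # prior: p ~ Unif(0, 1)\n",
 "x = rng.uniform(size=(trials, n + 1)) < p[:, None]  # given p: X_1, ..., X_{n+1} i.i.d. Bern(p)\n",
 "\n",
 "sunny = x[:, :n].all(axis=1)                        # keep runs with S_n = n\n",
 "estimate = x[sunny, n].mean()                       # fraction where day n+1 also rises\n",
 "\n",
 "print(estimate, (n + 1) / (n + 2))                  # should be close (6/7 for n = 5)"
] },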
{ "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [
 "----"
] }, { "cell_type": "markdown", "metadata": {}, "source": [
 "View [Lecture 17: Moment Generating Functions | Statistics 110](http://bit.ly/2CxVsgR) on YouTube."
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 1 }