{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 26: Two envelope paradox (cont.), conditional expectation (cont.), waiting for HT vs. waiting for HH\n", "\n", "\n", "## Stat 110, Prof. Joe Blitzstein, Harvard University\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Two-envelope Paradox (continued)\n", "\n", "\n", "![title](images/L2601.png)\n", "\n", "From [Wikipedia](https://en.wikipedia.org/wiki/Two_envelopes_problem):\n", "\n", "**Basic setup:** You are given two indistinguishable envelopes, each of which contains a positive sum of money. One envelope contains twice as much as the other. You may pick one envelope and keep whatever amount it contains. You pick one envelope at random but before you open it you are given the chance to take the other envelope instead.\n", "\n", "Let $X$ and $Y$ be the amounts in the two envelopes. There is no indication as to which of $X$ and $Y$ is the lesser/greater amount.\n", "\n", "Let's consider two competing arguments:\n", "\n", "\\begin{align}\n", " &\\text{argument 1: } &\\quad \\mathbb{E}(Y) &= \\mathbb{E}(X) \\\\\n", " \\\\\n", " &\\text{argument 2: } &\\quad \\mathbb{E}(Y) &= \\mathbb{E}(Y|Y=2X) \\, P(Y=2X) \\, + \\, \\mathbb{E}(Y|Y=\\frac{X}{2}) \\, P(Y=\\frac{X}{2}) \\\\\n", " & & &= \\mathbb{E}(2X) \\, \\frac{1}{2} \\, + \\, \\mathbb{E}(\\frac{X}{2}) \\, \\frac{1}{2} \\\\\n", " & & &= \\frac{5}{4} \\, \\mathbb{E}(X)\n", "\\end{align}\n", "\n", "So which argument is correct?\n", "\n", "Argument 1 follows from _symmetry_, and it is the correct one.\n", "\n", "Argument 2, however, has a flaw: we start by conditioning on $Y=2X$ and on $Y=\\frac{X}{2}$, but we then immediately equate $\\mathbb{E}(Y|Y=2X)$ with $\\mathbb{E}(2X)$ and $\\mathbb{E}(Y|Y=\\frac{X}{2})$ with $\\mathbb{E}(\\frac{X}{2})$, at which point we are no longer conditioning. 
A conditional expectation should not be confused with an _unconditional_ expectation.\n", "\n", "* Let $I$ be the indicator of $Y=2X$\n", "* Then $X,I$ are _dependent_\n", "* Therefore $\\mathbb{E}(Y|Y=2X) = \\mathbb{E}(2X|Y=2X)$, which need not equal $\\mathbb{E}(2X)$" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Patterns in Coin Flips\n", "\n", "Continuing with a further example of conditional expectation, consider repeated flips of a fair coin.\n", "\n", "* How many flips until $HT$ (how many flips until you run into an $H$ followed by a $T$, including those flips)? _Let us call this random variable $W_{HT}$._\n", "* How many flips until $HH$ (how many flips until you run into an $H$ followed by another $H$, including those flips)? _Let us call this random variable $W_{HH}$._\n", "\n", "So what we are really looking for are $\\mathbb{E}(W_{HT})$ and $\\mathbb{E}(W_{HH})$. Which do you think is greater? Are the two _equal_, is $\\mathbb{E}(W_{HT}) \\lt \\mathbb{E}(W_{HH})$, or is $\\mathbb{E}(W_{HT}) \\gt \\mathbb{E}(W_{HH})$?\n", "\n", "If you think they are equal by _symmetry_, then you're wrong. By symmetry we know:\n", "\n", "\\begin{align}\n", " & & \\mathbb{E}(W_{TT}) &= \\mathbb{E}(W_{HH}) \\\\\n", " & & \\mathbb{E}(W_{HT}) &= \\mathbb{E}(W_{TH}) \\\\\n", " \\\\\n", " &\\text{but } & \\mathbb{E}(W_{HT}) &\\neq \\mathbb{E}(W_{HH})\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### $\\mathbb{E}(W_{HT})$\n", "\n", "Consider first $\\mathbb{E}(W_{HT})$; we can solve for this without using conditional expectation by reasoning directly about the structure of the problem.\n", "\n", "![title](images/L2602.png)\n", "\n", "From the picture above, you can see that by the time we get the first $H$, we are actually halfway done. With this partial progress, all we need now is to see a $T$. If we see another $H$, that is OK, and we still keep waiting for a $T$. 
Let $W_{1}$ be the number of flips up to and including the first $H$, and let $W_{2}$ be the number of additional flips after that until we see a $T$, so that $W_{HT} = W_{1} + W_{2}$.\n", "\n", "It is easy to recognize that $W_{1} - 1$ and $W_{2} - 1$ are each $\\operatorname{Geom}(\\frac{1}{2})$, where the Geometric counts the failures before the first success, has support $k \\in \\{0,1,2,\\dots\\}$, and has mean $\\frac{1-p}{p}$.\n", "\n", "So we have\n", "\n", "\\begin{align}\n", " \\mathbb{E}(W_{HT}) &= \\mathbb{E}(W_1) + \\mathbb{E}(W_2) \\\\\n", " &= \\left[\\frac{1 - {1/2}}{1/2} + 1 \\right] + \\left[\\frac{1 - {1/2}}{1/2} + 1 \\right] \\\\\n", " &= 1 + 1 + 1 + 1 \\\\\n", " &= \\boxed{4}\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### $\\mathbb{E}(W_{HH})$\n", "\n", "Now let's consider $\\mathbb{E}(W_{HH})$.\n", "\n", "![title](images/L2603.png)\n", "\n", "In this case, if we get another $H$ immediately after seeing the first $H$, then we are done. But if we get a $T$ instead, then we have to start all over again, so we don't enjoy any partial progress.\n", "\n", "Let's solve this using _conditional expectation_. 
\n", "\n", "Similar to how we solved Gambler's Ruin by _conditioning on the first toss_, we have the following. When the first toss is $H$, we condition further on the second toss: another $H$ means we are done after 2 flips, while a $T$ means we have used 2 flips and start over. When the first toss is $T$, we have used 1 flip and start over.\n", "\n", "\\begin{align}\n", " \\mathbb{E}(W_{HH}) &= \\mathbb{E}(W_{HH} | \\text{first toss is } H) \\frac{1}{2} + \\mathbb{E}(W_{HH} | \\text{first toss is } T) \\frac{1}{2} \\\\\n", " &= \\left( 2 \\, \\frac{1}{2} + (2 + \\mathbb{E}(W_{HH}))\\frac{1}{2} \\right) \\frac{1}{2} + \\left(1 + \\mathbb{E}(W_{HH}) \\right) \\frac{1}{2} \\\\\n", " &= \\left(\\frac{1}{2} + \\frac{1}{2} + \\frac{\\mathbb{E}(W_{HH})}{4} \\right) + \\left(\\frac{1}{2} + \\frac{\\mathbb{E}(W_{HH})}{2} \\right) \\\\\n", " &= \\frac{3}{2} + \\frac{3 \\, \\mathbb{E}(W_{HH})}{4} \\\\\n", " \\\\\n", " \\Rightarrow \\frac{\\mathbb{E}(W_{HH})}{4} &= \\frac{3}{2} \\\\\n", " \\Rightarrow \\mathbb{E}(W_{HH}) &= \\boxed{6}\n", "\\end{align}\n", "\n", "### Related application\n", "\n", "Genetics is a field where you might need to know about strings of letters, not $H,T$ but rather $A,C,T,G$.\n", "\n", "If you're interested, here's a good [TED talk by Peter Donnelly on genetics and statistics](https://www.ted.com/talks/peter_donnelly_shows_how_stats_fool_juries/transcript?language=en)." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Conditioning on a Random Variable\n", "\n", "Consider $\\mathbb{E}(Y | X=x)$: what is $X=x$?\n", "\n", "It is an _event_, and we _condition on that event_.\n", "\n", "\\begin{align}\n", " &\\text{discrete case: } &\\quad &\\mathbb{E}(Y|X=x) = \\sum_{y} y \\, P(Y=y|X=x) \\\\\n", " \\\\\n", " &\\text{continuous case: } &\\quad &\\mathbb{E}(Y|X=x) = \\int_{-\\infty}^{\\infty} y \\, f_{Y|X}(y|x) \\, dy = \\int_{-\\infty}^{\\infty} y \\, \\frac{ f_{X,Y}(x,y) }{ f_{X}(x) } \\, dy\n", "\\end{align}\n", "\n", "Now let $g(x) = \\mathbb{E}(Y|X=x)$. This is a function of $x$: the sum or integral over $y$ leaves no dependence on $y$.\n", "\n", "Then define $\\mathbb{E}(Y|X) = g(X)$. For example, if $g(x) = x^2$, then $g(X) = X^2$. So $\\mathbb{E}(Y|X)$ is itself a _random variable_, and it is a function of _$X$_, not of $Y$." 
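, "\n", "\n", "As a small illustration (an assumed example, not taken from the lecture), suppose $X$ is a fair die roll and $Y = X + Z$, where $Z$ is independent of $X$ with $\\mathbb{E}(Z) = 0$. Then\n", "\n", "\\begin{align}\n", " g(x) &= \\mathbb{E}(Y|X=x) = \\mathbb{E}(x + Z | X=x) = x + \\mathbb{E}(Z) = x \\\\\n", " \\mathbb{E}(Y|X) &= g(X) = X\n", "\\end{align}\n", "\n", "Here $g(x)$ is a single number for each fixed $x$, while $g(X)$ is a random variable taking the values $1, \\dots, 6$."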
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Example with Poisson\n", "\n", "Let $X,Y$ be i.i.d. $\\operatorname{Pois}(\\lambda)$.\n", "\n", "\n", "#### $ \\mathbb{E}(X + Y | X)$\n", "\\begin{align}\n", " \\mathbb{E}(X + Y | X) &= \\mathbb{E}(X|X) + \\mathbb{E}(Y|X) \\\\\n", " &= \\underbrace{ X }_{ \\text{X is a function of X} } + \\underbrace{ \\mathbb{E}(Y) }_{ \\text{independence} } \\\\\n", " &= X + \\lambda \n", "\\end{align}\n", "\n", "#### $\\mathbb{E}(X | X + Y)$\n", "\n", "Let $T = X + Y$, so that $T \\sim \\operatorname{Pois}(2\\lambda)$, and find the conditional PMF of $X$ given $T=n$.\n", "\n", "\\begin{align}\n", " P(X=k|T=n) &= \\frac{P(T=n|X=k) \\, P(X=k)}{P(T=n)} &\\quad \\text{by Bayes' Rule} \\\\\n", " &= \\frac{P(Y=n-k) \\, P(X=k)}{P(T=n)} \\\\\n", " &= \\frac{ \\frac{e^{-\\lambda} \\, \\lambda^{n-k}}{(n-k)!} \\, \\frac{e^{-\\lambda} \\, \\lambda^{k}}{k!}}{ \\frac{e^{-2\\lambda} \\, (2\\lambda)^n}{n!} } \\\\\n", " &= \\frac{n!}{(n-k)! \\, k!} \\, \\left( \\frac{1}{2} \\right)^n \\\\\n", " &= \\binom{n}{k} \\, \\left( \\frac{1}{2} \\right)^n \\\\\n", " \\\\\n", " X | T=n &\\sim \\operatorname{Bin}(n, \\frac{1}{2}) \\\\\n", " \\\\\n", " \\mathbb{E}(X|T=n) &= \\frac{n}{2} \\Rightarrow \\mathbb{E}(X|T) = \\frac{T}{2}\n", "\\end{align}\n", "\n", "Alternatively, notice the _symmetry_...\n", "\n", "\\begin{align}\n", " \\mathbb{E}(X|X+Y) &= \\mathbb{E}(Y|X+Y) &\\quad \\text{by symmetry (because i.i.d.)} \\\\\n", " \\\\\n", " \\mathbb{E}(X|X+Y) + \\mathbb{E}(Y|X+Y) &= \\mathbb{E}(X+Y|X+Y) \\\\\n", " &= X + Y \\\\\n", " &= T \\\\\n", " \\\\\n", " \\Rightarrow \\mathbb{E}(X|T) &= \\frac{T}{2}\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Foreshadowing: Iterated Expectation or Adam's Law\n", "\n", "The single most important property of conditional expectation is closely related to the Law of Total Probability.\n", "\n", "Recall that $\\mathbb{E}(Y|X)$ is a random variable. 
That being so, it is natural to wonder what its expected value is.\n", "\n", "Consider this:\n", "\n", "\\begin{align}\n", " \\mathbb{E} \\left( \\mathbb{E}(Y|X) \\right) &= \\mathbb{E}(Y)\n", "\\end{align}\n", "\n", "We will go into more detail next time!\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "View [Lecture 26: Conditional Expectation Continued | Statistics 110](http://bit.ly/2oOXv6D) on YouTube." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }