{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 27: Conditional expectation (cont.); taking out what's known; Adam's Law, Eve's Law; projection picture\n", "\n", "\n", "## Stat 110, Prof. Joe Blitzstein, Harvard University\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conditioning on Random Variables\n", "\n", "### Ex. $\\mathbb{E}(Y|X)$ where $X \\sim N(0,1)$\n", "\n", "Let $X \\sim N(0,1)$ and $Y=X^2$.\n", "\n", "Then \n", "\n", "\\begin{align}\n", " \\mathbb{E}(Y|X) &= \\mathbb{E}(X^2|X) \\\\\n", " &= X^2 \\\\\n", " &= Y\n", "\\end{align}\n", "\n", "* this is simple enough, and very clear. \n", "\n", "But how about the other way 'round?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Ex. $\\mathbb{E}(X|Y)$ where $X \\sim N(0,1)$\n", "\n", "\\begin{align}\n", " \\mathbb{E}(X|Y) &= \\mathbb{E}(X|X^2) \\\\\n", " &= 0\n", "\\end{align}\n", "\n", "Why?\n", "\n", "* we don't know $X$, but what we are given is $X^2$\n", "* if we observe $x^2 = a$, then we know $x = \\pm \\sqrt{a}$\n", "* by _symmetry_, both $x=-\\sqrt{a}$ and $x=\\sqrt{a}$ are equally likely\n", "* hence the best estimate of $X$ would be... 0! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Uniform\n", "\n", "Say we have a stick of length 1, and we break it at point $x$. Then we take that stick of length $x$, and break that at point $y$.\n", "\n", "What is $\\mathbb{E}(Y|X)$?\n", "\n", "\n", "\n", "\n", "* $X \\sim \\operatorname{Unif}(0,1)$\n", "* $Y|X \\sim \\operatorname{Unif}(0,X)$\n", "\n", "\\begin{align}\n", " \\mathbb{E}(Y|X=x) &= 0 \\\\\n", " &= \\frac{x}{2} \\\\\n", " \\\\\n", " \\Rightarrow \\mathbb{E}(Y|X) &= \\frac{X}{2} \\\\\n", " \\\\\n", " \\mathbb{E} \\left( \\mathbb{E}(Y|X) \\right) &= \\frac{1}{4} \\\\\n", " &= \\mathbb{E}(Y)\n", "\\end{align}\n", "\n", "* the expected length of $y = \\frac{1}{4}$ is pretty intuitive; take a stick, break it in half, break that half in half again\n", "* we will get more into that $\\mathbb{E} \\left( \\mathbb{E}(Y|X) \\right) = \\mathbb{E}(Y)$ in a bit" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Useful Properties\n", "\n", "Here are some useful properties related to conditional expectation.\n", "\n", "\\begin{align}\n", " &\\text{(1) } \\mathbb{E}\\left( h(X) Y|X \\right) = h(X) \\, \\mathbb{E}(Y|X) &\\text{\"taking out what is known\"} \\\\\\\\\n", " &\\text{(2) } \\mathbb{E}(Y|X) = \\mathbb{E}(Y) &\\text{if } X,Y \\text{ are independent} \\\\\\\\\n", " &\\text{(3) } \\mathbb{E}\\left( \\mathbb{E}(Y|X) \\right) = \\mathbb{E}(Y) &\\text{Iterated Expectation, or Adam's Law} \\\\\\\\\n", " &\\text{(4) } \\mathbb{E}\\left( (Y - \\mathbb{E}(Y|X) \\, h(X) \\right) = 0 &\\text{residual is uncorrelated with } h(X) \\\\\\\\\n", " &\\text{(5) } \\operatorname{Var}(Y) = \\mathbb{E}\\left( \\operatorname{Var}(Y|X) \\right) + \\operatorname{Va}r\\left( \\mathbb{E}(Y|X) \\right) &\\text{EVvE's Law} \\\\\\\n", "\\end{align}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Proof of Property 4\n", "\n", "Here is a pictorial explanation to aid your intuition.\n", "\n", "A vector could be anything (point, function, cow); as long as it follows the axioms of vector space, then anything can be treated as a vector.\n", "\n", "\n", "\n", "* The \"plane\" in the image represents all of the possible functions of $X$. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Proof of Property 4\n", "\n", "Here is a pictorial explanation to aid your intuition.\n", "\n", "A vector can be almost anything (a point, a function, a cow); as long as the axioms of a vector space are satisfied, it can be treated as a vector.\n", "\n", "\n", "\n", "* The \"plane\" in the image represents all of the possible functions of $X$. As such, it necessarily passes through the origin (the zero function is a function of $X$).\n", "* Conditional expectation is simply projecting $Y$ onto the plane of all functions of $X$.\n", "* $\\mathbb{E}(Y|X)$ is the point in that plane that is closest to $Y$.\n", "* If $Y$ is already a function of $X$, then $Y$ lies in that plane of functions of $X$.\n", "* If $Y$ is not a function of $X$, then the vector from the projection back up to $Y$ is the _residual_ $Y - \\mathbb{E}(Y|X)$.\n", "* In this picture, we implicitly assume finite variance for all functions of $X$.\n", "\n", "So let us show that the residual $Y - \\mathbb{E}(Y|X)$ is uncorrelated with any function $h(X)$:\n", "\n", "\\begin{align}\n", "  \\operatorname{Cov}\\left( Y - \\mathbb{E}(Y|X) , h(X) \\right) &= \\mathbb{E}\\left( (Y - \\mathbb{E}(Y|X)) \\, h(X) \\right) - \\mathbb{E}\\left(Y-\\mathbb{E}(Y|X)\\right) \\, \\mathbb{E}\\left(h(X)\\right) \\\\\n", "  &= \\mathbb{E}\\left( (Y - \\mathbb{E}(Y|X)) \\, h(X) \\right) - \\left[\\mathbb{E}(Y) - \\mathbb{E}(Y) \\right] \\, \\mathbb{E}\\left(h(X)\\right) &\\text{linearity, Adam's Law} \\\\\n", "  &= \\mathbb{E}\\left( (Y - \\mathbb{E}(Y|X)) \\, h(X) \\right) - 0 \\\\\n", "  &= \\mathbb{E}\\left( (Y - \\mathbb{E}(Y|X)) \\, h(X) \\right) \\\\\n", "  &= \\mathbb{E}\\left( Y \\, h(X) \\right) - \\mathbb{E}\\left( \\mathbb{E}(Y|X) \\, h(X) \\right) \\\\\n", "  &= \\mathbb{E}\\left( Y \\, h(X) \\right) - \\mathbb{E}\\left( \\mathbb{E}(Y \\, h(X)|X) \\right) &\\text{if we can take out, we can put back} \\\\\n", "  &= \\mathbb{E}\\left( Y \\, h(X) \\right) - \\mathbb{E}\\left( Y \\, h(X) \\right) &\\text{Adam's Law} \\\\\n", "  &= 0 &\\quad \\blacksquare\n", "\\end{align}\n", "\n", "And so the residual $Y - \\mathbb{E}(Y|X)$ is indeed uncorrelated with any function $h(X)$." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Proof of Property 3\n", "\n", "Returning to Property 3, let's do the discrete case (the continuous case is analogous).\n", "\n", "Since $\\mathbb{E}(Y|X)$ is just a function of $X$, we can call it by another name, say $g(X)$.\n", "\n", "\\begin{align}\n", "  \\mathbb{E}\\left( \\mathbb{E}(Y|X) \\right) &= \\mathbb{E}\\left( g(X) \\right) \\\\\n", "  &= \\sum_x g(x) \\, P(X=x) &\\text{by LOTUS} \\\\\n", "  &= \\sum_x \\mathbb{E}(Y|X=x) \\, P(X=x) \\\\\n", "  &= \\sum_x \\left[ \\sum_y y \\, P(Y=y|X=x) \\right] P(X=x) \\\\\n", "  &= \\sum_y \\sum_x y \\, P(Y=y, X=x) \\\\\n", "  &= \\sum_y y \\sum_x P(Y=y, X=x) \\\\\n", "  &= \\sum_y y \\, P(Y=y) \\\\\n", "  &= \\mathbb{E}(Y) &\\quad \\blacksquare\n", "\\end{align}" ] },
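{ "cell_type": "markdown", "metadata": {}, "source": [ "Property 4 can also be checked numerically. The sketch below (again assuming NumPy) reuses the stick-breaking example, where $\\mathbb{E}(Y|X) = \\frac{X}{2}$, and shows that the residual $Y - \\frac{X}{2}$ has essentially zero correlation with a few arbitrary choices of $h(X)$.\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Property 4: the residual Y - E(Y|X) is uncorrelated with any function h(X).\n", "# Stick-breaking example, where E(Y|X) = X/2 (a sketch; assumes NumPy is installed).\n", "import numpy as np\n", "\n", "np.random.seed(110)\n", "n = 10**6\n", "x = np.random.uniform(0, 1, size=n)   # X ~ Unif(0, 1)\n", "y = np.random.uniform(0, x)           # Y | X = x  ~  Unif(0, x)\n", "\n", "residual = y - x / 2                  # Y - E(Y|X)\n", "\n", "for name, h in [('h(X) = X', x), ('h(X) = X^2', x**2), ('h(X) = cos(X)', np.cos(x))]:\n", "    corr = np.corrcoef(residual, h)[0, 1]\n", "    print(name, '-> correlation with residual ~', round(corr, 4))" ] },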
{ "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Conditional Variance\n", "\n", "Conditional variance is defined as follows:\n", "\n", "\\begin{align}\n", "  \\operatorname{Var}(Y|X) &= \\mathbb{E}(Y^2|X) - \\left( \\mathbb{E}(Y|X) \\right)^2 &\\text{or, equivalently} \\\\\\\\\n", "  &= \\mathbb{E}\\left[ (Y - \\mathbb{E}(Y|X))^2 | X \\right]\n", "\\end{align}\n", "\n", "#### Proof\n", "\n", "Let $g(X) = \\mathbb{E}(Y|X)$; this will make things a bit clearer.\n", "\n", "\\begin{align}\n", "  \\operatorname{Var}(Y|X) &= \\mathbb{E}\\left[ (Y - g(X))^2 | X \\right] \\\\\n", "  &= \\mathbb{E}\\left[ Y^2 - 2Y \\, g(X) + g(X)^2 | X \\right] \\\\\n", "  &= \\mathbb{E}(Y^2|X) - 2 \\, \\mathbb{E}(Y \\, g(X)|X) + \\mathbb{E}(g(X)^2 | X) \\\\\n", "  &= \\mathbb{E}(Y^2|X) - 2 \\, g(X) \\, \\mathbb{E}(Y|X) + \\mathbb{E}(g(X)^2 | X) &\\text{taking out what is known} \\\\\n", "  &= \\mathbb{E}(Y^2|X) - 2 \\, g(X) \\, g(X) + g(X)^2 \\\\\n", "  &= \\mathbb{E}(Y^2|X) - 2 \\, g(X)^2 + g(X)^2 \\\\\n", "  &= \\mathbb{E}(Y^2|X) - g(X)^2 \\\\\n", "  &= \\mathbb{E}(Y^2|X) - \\left( \\mathbb{E}(Y|X) \\right)^2 &\\quad \\blacksquare\n", "\\end{align}" ] },
{ "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Proof of Property 5\n", "\n", "Eve's Law states that $\\operatorname{Var}(Y) = \\mathbb{E}\\left( \\operatorname{Var}(Y|X) \\right) + \\operatorname{Var}\\left( \\mathbb{E}(Y|X) \\right)$.\n", "\n", "\n", "\n", "Graphically, Eve's Law splits the variance of $Y$ into the variance _within_ the sub-groups, $\\mathbb{E}\\left( \\operatorname{Var}(Y|X) \\right)$, and the variance _between_ the groups, $\\operatorname{Var}\\left( \\mathbb{E}(Y|X) \\right)$.\n", "\n", "In order to prove Eve's Law, we will do the following to make things simpler:\n", "\n", "* let $g(X) = \\mathbb{E}(Y|X)$\n", "* by Adam's Law, $\\mathbb{E}(g(X))=\\mathbb{E}(Y)$\n", "\n", "Then:\n", "\n", "\\begin{align}\n", "  \\mathbb{E}\\left( \\operatorname{Var}(Y|X) \\right) &= \\mathbb{E}\\left[ \\mathbb{E}(Y^2|X) - (\\mathbb{E}(Y|X))^2 \\right] &\\text{for the first part} \\\\\n", "  &= \\mathbb{E}(Y^2) - \\mathbb{E}\\left( g(X)^2 \\right) &\\text{Adam's Law} \\\\\n", "  \\\\\n", "  \\operatorname{Var}\\left( \\mathbb{E}(Y|X) \\right) &= \\operatorname{Var}(g(X)) &\\text{for the second part} \\\\\n", "  &= \\mathbb{E}\\left( g(X)^2 \\right) - \\left( \\mathbb{E}(g(X)) \\right)^2 \\\\\n", "  \\\\\n", "  \\mathbb{E}\\left( \\operatorname{Var}(Y|X) \\right) + \\operatorname{Var}\\left( \\mathbb{E}(Y|X) \\right) &= \\mathbb{E}(Y^2) - \\mathbb{E}\\left( g(X)^2 \\right) + \\mathbb{E}\\left( g(X)^2 \\right) - \\left( \\mathbb{E}(g(X)) \\right)^2 &\\text{adding the two parts} \\\\\n", "  &= \\mathbb{E}(Y^2) - \\left( \\mathbb{E}(g(X)) \\right)^2 \\\\\n", "  &= \\mathbb{E}(Y^2) - \\left( \\mathbb{E}(Y) \\right)^2 &\\text{since } \\mathbb{E}(g(X)) = \\mathbb{E}(Y) \\\\\n", "  &= \\operatorname{Var}(Y) &\\quad \\blacksquare\n", "\\end{align}" ] },
{ "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Example: Epidemiology and Conditional Variance\n", "\n", "Suppose we are studying an infectious disease in a certain state. Due to circumstances (lack of resources and/or time), rather than sampling across the whole state, we will randomly select a city and study a random sample of $n$ people there.\n", "\n", "Let $X$ be the number of infected people in the sample.\n", "\n", "Let $Q$ be the proportion of infected people in the randomly selected city. Keep in mind that different cities will have different proportions, hence $Q$ is a random variable.\n", "\n", "Find $\\mathbb{E}(X)$ and $\\operatorname{Var}(X)$.\n", "\n", "To do this, we need to make an assumption about the distribution of $Q$. Given its flexibility, its computational convenience, and the fact that it is the conjugate prior of the binomial distribution, we will assume $Q \\sim \\operatorname{Beta}(a,b)$.\n", "\n", "We are also assuming that $X|Q \\sim \\operatorname{Bin}(n, Q)$. Strictly speaking, since we are sampling without replacement, a Hypergeometric would be exact, but because $n$ is presumably small compared to the city's population, the Binomial is an excellent approximation, and it pairs conveniently with the Beta distribution.\n", "\n", "_Remember that conditioning is the soul of statistics, and so we will condition on the infection proportion $Q$ of our randomly selected city._" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## $\\mathbb{E}(X)$ via Adam's Law\n", "\n", "Thinking conditionally, we have:\n", "\n", "\\begin{align}\n", "  \\mathbb{E}(X) &= \\mathbb{E}\\left( \\mathbb{E}(X|Q) \\right) \\\\\n", "  &= \\mathbb{E}(nQ) &\\text{expected value of } \\operatorname{Bin}(n,Q) \\\\\n", "  &= n \\, \\mathbb{E}(Q) \\\\\n", "  &= n \\, \\frac{a}{a+b} &\\text{expected value of } \\operatorname{Beta}(a,b)\n", "\\end{align}" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## $\\operatorname{Var}(X)$ via Eve's Law\n", "\n", "Again thinking conditionally, we have:\n", "\n", "\\begin{align}\n", "  \\operatorname{Var}(X) &= \\mathbb{E}\\left( \\operatorname{Var}(X|Q) \\right) + \\operatorname{Var}\\left( \\mathbb{E}(X|Q) \\right) &\\text{by Eve's Law} \\\\\n", "  \\\\\n", "  \\mathbb{E}\\left( \\operatorname{Var}(X|Q) \\right) &= \\mathbb{E}\\left( n \\, Q \\, (1-Q) \\right) &\\text{for the first part; variance of } \\operatorname{Bin}(n,Q) \\\\\n", "  &= n \\, \\mathbb{E}\\left( Q \\, (1-Q) \\right) \\\\\n", "  &= n \\, \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} \\, \\int_{0}^{1} q \\, (1-q) \\, q^{a-1} \\, (1-q)^{b-1} \\, dq &\\text{LOTUS} \\\\\n", "  &= n \\, \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} \\, \\int_{0}^{1} q^{a} \\, (1-q)^{b} \\, dq \\\\\n", "  &= n \\, \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} \\, \\frac{\\Gamma(a+1)\\Gamma(b+1)}{\\Gamma(a+b+2)} &\\text{a } \\operatorname{Beta}(a+1,b+1) \\text{ integral} \\\\\n", "  &= n \\, \\frac{\\Gamma(a+b)}{\\Gamma(a)\\Gamma(b)} \\, \\frac{a\\Gamma(a) \\, b\\Gamma(b)}{(a+b+1)(a+b)\\Gamma(a+b)} \\\\\n", "  &= \\frac{n \\, a \\, b}{(a+b+1)(a+b)} \\\\\n", "  \\\\\n", "  \\operatorname{Var}\\left( \\mathbb{E}(X|Q) \\right) &= \\operatorname{Var}(n \\, Q) &\\text{for the second part} \\\\\n", "  &= n^2 \\, \\operatorname{Var}(Q) \\\\\n", "  &= n^2 \\, \\frac{\\mu(1-\\mu)}{a+b+1} &\\text{where } \\mu = \\frac{a}{a+b} \\\\\n", "  \\\\\n", "  \\Rightarrow \\operatorname{Var}(X) &= \\frac{n \\, a \\, b}{(a+b+1)(a+b)} + n^2 \\, \\frac{\\mu(1-\\mu)}{a+b+1}\n", "\\end{align}\n", "\n", "----" ] },
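{ "cell_type": "markdown", "metadata": {}, "source": [ "As a check on these formulas, here is a small simulation of the two-stage model $Q \\sim \\operatorname{Beta}(a,b)$, $X|Q \\sim \\operatorname{Bin}(n,Q)$. It is only a sketch: it assumes NumPy, and the values of $n$, $a$, $b$ are arbitrary choices for illustration.\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Simulation check of E(X) and Var(X) for the Beta-Binomial set-up\n", "# (a sketch; assumes NumPy, and n, a, b below are arbitrary illustrative values).\n", "import numpy as np\n", "\n", "np.random.seed(110)\n", "reps = 10**6\n", "n, a, b = 20, 2.0, 5.0\n", "\n", "q = np.random.beta(a, b, size=reps)   # Q ~ Beta(a, b): infection proportion of the chosen city\n", "x = np.random.binomial(n, q)          # X | Q ~ Bin(n, Q): number infected in the sample\n", "\n", "mu = a / (a + b)\n", "mean_formula = n * mu                                   # Adam's Law\n", "var_formula = (n * a * b / ((a + b + 1) * (a + b))      # E(Var(X|Q))\n", "               + n**2 * mu * (1 - mu) / (a + b + 1))    # + Var(E(X|Q)), by Eve's Law\n", "\n", "print('simulated E(X)   ~', x.mean(), '  formula:', mean_formula)\n", "print('simulated Var(X) ~', x.var(), '  formula:', var_formula)" ] },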
{ "cell_type": "markdown", "metadata": {}, "source": [ "View [Lecture 27: Conditional Expectation given an R.V. | Statistics 110](http://bit.ly/2NTiQXk) on YouTube." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }