{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Probability Inequalities and Identities\n", "\n", "This follows Lugosi (2006) rather closely; it also a repeat of the results in the chapter on \n", "[Mathematical Preliminaries](mathPrelim.ipynb), which we largely omitted.\n", "\n", "#### The tail-integral formula for expectation\n", "If $X$ is a nonnegative real-valued random variable,\n", "$$\n", "{\\mathbb E} X = \\int_0^\\infty {\\mathbb P} \\{X \\ge x\\} dx\n", "$$\n", "\n", "#### Jensen's Inequality\n", "If $\\phi$ is a convex function from ${\\mathcal X}$ to $\\Re$, then $\\phi({\\mathbb E} X) \\le {\\mathbb E} \\phi(X)$.\n", "\n", "#### Markov's, Chebychev's, and related inequalities\n", "From the tail-integral formula,\n", "\n", "$$\n", "{\\mathbb E} X \\ge \\int_0^t {\\mathbb P} \\{X \\ge x\\} dx \\ge t {\\mathbb P} \\{X \\ge t \\},\n", "$$\n", "\n", "which yields _Markov's Inequality_:\n", "\n", "$$\n", "{\\mathbb P} \\{ X \\ge t \\} \\le \\frac{{\\mathbb E} X}{t}.\n", "$$\n", "\n", "Moreover, for any strictly monotonic function $f$ and nonnegative $X$,\n", "we have the _Generalized Markov Inequality_:\n", "\n", "$$\n", "{\\mathbb P} \\{ X \\ge t \\} = {\\mathbb P} \\{ f(X) \\ge f(t) \\} \\le \\frac{{\\mathbb E} f(X)}{f(t)}.\n", "$$\n", "\n", "In particular, for any real-valued $X$ and real $q > 0$,\n", "$| X - {\\mathbb E} X |$ is a nonnegative\n", "random variable and $f(x) = x^q$ is strictly monotonic, so\n", "\n", "$$\n", "{\\mathbb P} \\{| X - {\\mathbb E} X | \\ge t \\} = {\\mathbb P} \\{ | X - {\\mathbb E} X |^q \\ge t^q \\} \\le\n", "\\frac{{\\mathbb E} | X - {\\mathbb E} X |^q}{t^q}.\n", "$$\n", "\n", "_Chebychev's inequality_ is a special case:\n", "\n", "$$\n", "{\\mathbb P} \\{ | X - {\\mathbb E} X |^2 \\ge t^2 \\} \\le \\frac{{\\mathbb E} | X - {\\mathbb E} X |^2}{t^2}\n", "= \\frac{{\\mbox{Var}} X}{t^2}.\n", "$$\n", "\n", "#### The Chernoff Bound\n", "Apply the Generalized Markov Inequality with $f(x) = e^{sx}$ for positive $s$\n", "to obtain the _Chernoff Bound_:\n", "$$\n", "{\\mathbb P} \\{ X \\ge t \\} = {\\mathbb P} \\{ e^{sX} \\ge e^{st} \\} \\le \\frac{{\\mathbb E} e^{sX}}{e^{st}}\n", "$$\n", "for all $s$.\n", "For particular $X$, one can optimize over $s$.\n", "\n", "\n", "#### Hoeffding's Inequality\n", "Let $\\{ X_j \\}_{j=1}^n$ be independent, and define\n", "$S_n \\equiv \\sum_{j=1}^n X_j$.\n", "Applying the Chernoff Bound yields\n", "$$\n", "{\\mathbb P} \\{ S_n - {\\mathbb E} S_n \\ge t \\} \\le e^{-st} {\\mathbb E} e^{sS_n} =\n", "e^{-st} \\prod_{j=1}^n e^{s(X_j - E X_j)}.\n", "$$\n", "Hoeffding bounds the moment generating function for a bounded random variable\n", "$X$:\n", "If $a \\le X \\le b$ with probability 1, then\n", "$$\n", "{\\mathbb E} e^{sX} \\le e^{s^2 (b-a)^2/8},\n", "$$\n", "from which follows _Hoeffding's tail bound_.\n", "\n", "If $\\{X_j\\}_{j=1}^n$ are independent and ${\\mathbb P} \\{a_j \\le X_j \\le b_j\\} = 1$,\n", "then\n", "$$ \n", "{\\mathbb P} \\{ S_n - {\\mathbb E} S_n \\ge t \\} \\le e^{-2t^2/\\sum_{j=1}^n (b_j - a_j)^2}\n", "$$\n", "and\n", "$$ \n", "{\\mathbb P} \\{ S_n - {\\mathbb E} S_n \\le -t \\} \\le e^{-2t^2/\\sum_{j=1}^n (b_j - a_j)^2}\n", "$$\n", "\n", "#### Hoeffding's Other Inequality\n", "Suppose $f$ is a convex, real function and ${\\mathcal X}$ is a finite set.\n", "Let $\\{X_j \\}_{j=1}^n$ be a simple random sample from ${\\mathcal X}$ and\n", "let $\\{Y_j \\}_{j=1}^n$ be an iid uniform random sample (with replacement) from ${\\mathcal X}$.\n", "Then\n", "$$ \n", "{\\mathbb E} f \\left ( \\sum_{j=1}^n X_j \\right ) \\le {\\mathbb E} f \\left ( \\sum_{j=1}^n Y_j \\right ).\n", "$$\n", "\n", "#### Bernstein's Inequality\n", "Suppose $\\{X_j \\}_{j=1}^n$ are independent with ${\\mathbb E} X_j = 0$ for all $j$,\n", "${\\mathbb P} \\{ | X_j | \\le c\\} = 1$,\n", "$\\sigma_j^2 = {\\mathbb E} X_j^2$ and $\\sigma = \\frac{1}{n} \\sum_{j=1}^n \\sigma_j^2$.\n", "Then for any $\\epsilon > 0$,\n", "$$\n", "{\\mathbb P} \\{ S_n/n > \\epsilon \\} \\le e^{-n \\epsilon^2 / 2(\\sigma^2 + c\\epsilon/3)}.\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Previous: [Random Variables, Expectation, & Random Vectors](rv.ipynb)\n", "Next: [Simulation](simulate.ipynb)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "\n", "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%run talktools.py" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }