{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Chapter 10: Inequalities and limit theorems\n", " \n", "This Jupyter notebook is the Python equivalent of the R code in section 10.6 R, pp. 447 - 450, [Introduction to Probability, 1st Edition](https://www.crcpress.com/Introduction-to-Probability/Blitzstein-Hwang/p/book/9781466575578), Blitzstein & Hwang.\n", "\n", "----" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Jensen's inequality\n", "\n", "Python/NumPy/SciPy make it easy to compare the expectations of $X$ and $g(X)$ for a given choice of $g$, and this allows us to verify some special cases of Jensen's inequality. For example, suppose we simulate $10^4$ times from the $Expo(1)$ distribution:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "np.random.seed(24157817)\n", "\n", "from scipy.stats import expon\n", "\n", "x = expon.rvs(size=10**4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "According to Jensen's inequality, $\\mathbb{E}(\\log \\, X) \\leq \\log \\, \\mathbb{E} \\, X$. 
The former can be approximated by numpy.mean(numpy.log(x)) and the latter can be approximated by numpy.log(numpy.mean(x)), so we compute both:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "numpy.mean(numpy.log(x)) = -0.5600958563379892\n", "numpy.log(numpy.mean(x)) = 0.00800014338803644\n" ] } ], "source": [ "meanlog = np.mean(np.log(x))\n", "print('numpy.mean(numpy.log(x)) = {}'.format(meanlog))\n", "\n", "logmean = np.log(np.mean(x))\n", "print('numpy.log(numpy.mean(x)) = {}'.format(logmean))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the $Expo(1)$ distribution, we find that numpy.mean(numpy.log(x)) is approximately −0.56 (the true value is around −0.577), while numpy.log(numpy.mean(x)) is approximately 0 (the true value is 0). This is consistent with $\\mathbb{E}(\\log \\, X) \\leq \\log \\, \\mathbb{E} \\, X$. We could also compare numpy.mean(x**3) to numpy.mean(x)**3, or numpy.mean(numpy.sqrt(x)) to numpy.sqrt(numpy.mean(x)); the possibilities are endless." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization of the law of large numbers\n", "\n", "To plot the running proportion of Heads in a sequence of independent fair coin tosses, we first generate the coin tosses themselves:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "np.random.seed(39088169)\n", "\n", "from scipy.stats import binom\n", "\n", "nsim = 300\n", "p = 1/2\n", "x = binom.rvs(1, p, size=nsim)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we compute $\\bar{X}_n$ for each value of $n$ and store the results in xbar:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# divide by sequence from 1 to nsim, inclusive\n", "xbar = np.cumsum(x) / np.arange(1, nsim+1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above line of code performs element-wise division of the two arrays numpy.cumsum(x) and numpy.arange(1, nsim+1), so that the $n$th entry of xbar is the proportion of Heads in the first $n$ tosses. Finally, we plot xbar against the number of coin tosses:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "text/plain": [