{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.6" }, "colab": { "name": "confidence-intervals.ipynb", "provenance": [], "include_colab_link": true } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "JLUFXFfZMIEv" }, "source": [ "# Illustration of Confidence Intervals" ] }, { "cell_type": "markdown", "metadata": { "id": "Fakq773QMIEw" }, "source": [ "**Example 4.1. Illustration of Confidence Intervals.** \n", "Assume a sample of N = 31 independent observations are collected \n", "from a normally distributed random variable x with the following results:\n", "\n", "    60, 61, 47, 56, 61, 63, \n", "    65, 69, 54, 59, 43, 61, \n", "    55, 61, 56, 48, 67, 65, \n", "    60, 58, 57, 62, 57, 58, \n", "    53, 59, 58, 61, 67, 62, 54\n", "\n", "Determine a 90% confidence interval for the mean value and variance of the random variable x." ] }, { "cell_type": "markdown", "metadata": { "id": "yMYYmlojMIEw" }, "source": [ "## Solution\n", "The solution follows Chapter 4 (Statistical Principles), Section 4.4 (**Confidence Intervals**) of J. Bendat and A. Piersol, *Random Data: Analysis and Measurement Procedures*." 
] }, { "cell_type": "code", "metadata": { "scrolled": true, "id": "qMULZFmzMIEx", "outputId": "42068872-fd08-41ec-a43c-d67e1314b986" }, "source": [ "from IPython.display import IFrame\n", "IFrame('pdfs/J-Bendat--A-Piersol-random-data-analysis-and-measurement-procedures-section-4-4.pdf',\n", "       width='100%', height=400)" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": { "tags": [] }, "execution_count": 5 } ] },
{ "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 336 }, "id": "Z977f6_GMIEy", "outputId": "d3c7b638-0e58-4cc1-eb01-dc1c29149ce8" }, "source": [ "import numpy as np\n", "from scipy import stats\n", "import seaborn as sns\n", "%matplotlib inline\n", "\n", "# The data\n", "X = np.array([60, 61, 47, 56, 61, 63, 65, 69, 54, 59, 43, 61, 55, 61,\n", "              56, 48, 67, 65, 60, 58, 57, 62, 57, 58, 53, 59, 58, 61, 67, 62, 54])\n", "\n", "# Visualize the distribution: histogram plus kernel density estimate\n", "# (histplot replaces the deprecated distplot)\n", "sns.histplot(X, kde=True, stat='density')" ], "execution_count": 1, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "" ] }, "metadata": { "tags": [] }, "execution_count": 1 }, { "output_type": "display_data", "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": { "tags": [], "needs_background": "light" } } ] },
{ "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "a9MLchnTMIEz", "outputId": "b1ce9490-a7b3-4e2b-d82f-4beba28526d2" }, "source": [ "# Number of observations\n", "N = len(X)\n", "\n", "# Sample mean\n", "mu = X.mean()\n", "\n", "# Unbiased sample variance and standard deviation (N-1 degrees of freedom)\n", "s2 = X.var(ddof=1)\n", "s = X.std(ddof=1)\n", "\n", "# Print the estimates\n", "print('mean = %4.2f' % mu)\n", "print('s2 = %4.2f' % s2)\n", "print('s = %4.2f' % s)\n", "\n", "\n", "# A 90% confidence level corresponds to alpha = 0.1\n", "alpha = 0.1\n", "# The confidence interval for μ_x is given by Eq. (4.46b)\n", "\n", "# Student's t quantile t(1-alpha/2, N-1)\n", "t = stats.t.ppf(1-alpha/2, N-1) # ppf: percent point function (inverse of cdf)\n", "\n", "# The confidence interval: mean ± t*s/sqrt(N)\n", "conf_int_mu = (mu-t*s/np.sqrt(N), mu+t*s/np.sqrt(N))\n", "\n", "print('confidence interval for μ_x: (%2.2f < %2.2f < %2.2f)' % (conf_int_mu[0],mu,conf_int_mu[1]))\n", "# Equivalent computation with stats.t.interval; the scale is the standard error s/sqrt(N)\n", "print('confidence interval for μ_x: (%2.2f, %2.2f)' % stats.t.interval(1-alpha, N-1, loc=mu, scale=s/np.sqrt(N)))" ], "execution_count": 2, "outputs": [ { "output_type": "stream", "text": [ "mean = 58.61\n", "s2 = 33.45\n", "s = 5.78\n", "confidence interval for μ_x: (56.85 < 58.61 < 60.38)\n", "confidence interval for μ_x: (56.85, 60.38)\n" ], "name": "stdout" } ] },
{ "cell_type": "markdown", "metadata": { "id": "kTxJrBmEMIE0" }, "source": [ "## Additional work \n", "1. Compute the 90% confidence interval for the variance $\\sigma^2_x$. Use Eq. (4.47) and the function [``stats.chi2.ppf()``](http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2.html#scipy.stats.chi2 \"scipy.stats.chi2 — SciPy v0.17.0 Reference Guide\").\n", "2. Compute the 95% confidence intervals for the mean $\\mu_x$ and variance $\\sigma^2_x$." ] } ] }