{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#Statistical Inference for Everyone: Technical Supplement\n",
    "\n",
    "\n",
    "\n",
    "This document is the technical supplement, for instructors, for [Statistical Inference for Everyone], the introductory statistical inference textbook from the perspective of \"probability theory as logic\".\n",
    "\n",
    "<img  src=\"http://web.bryant.edu/~bblais/images/Saturn_with_Dice.png\" align=center width = 250px />\n",
    "\n",
    "[Statistical Inference for Everyone]: http://web.bryant.edu/~bblais/statistical-inference-for-everyone-sie.html\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Estimating the Paired-Data Difference Between Means, $\\delta_k \\equiv x_k-y_k$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$\\newcommand{\\bvec}[1]{\\mathbf{#1}}$\n",
    "\n",
    "We want\n",
    "\\begin{eqnarray}\n",
    "p(\\mu_\\delta|\\bvec{x},\\bvec{y},\\sigma_x,\\sigma_y,I)\n",
    "\\end{eqnarray}\n",
    "where  $\\delta_k \\equiv x_k-y_k$.\n",
    "\n",
    "We have from the Normal model the following likelihoods for $x_k$ and\n",
    "$y_k$:\n",
    "\\begin{eqnarray}\n",
    "p(x_k|\\mu,\\sigma_x,I)&=&\\frac{1}{\\sqrt{2\\pi\\sigma_x^2}}e^{-(x_k -\n",
    "\\mu_x)^2/2\\sigma_x^2}\\\\\\\\\n",
    "p(y_k|\\mu,\\sigma_y,I)&=&\\frac{1}{\\sqrt{2\\pi\\sigma_y^2}}e^{-(y_k -\n",
    "\\mu_y)^2/2\\sigma_y^2}\n",
    "\\end{eqnarray}\n",
    "\n",
    "Now we need to find the likelihood function for $\\delta_k \\equiv x_k-y_k$. \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Changing Variables\n",
    "\n",
    "If we have $Z=f(X,Y)$, and we know about $X$ and $Y$, we can learn about $Z$.\n",
    "\\begin{eqnarray}\n",
    "p(Z|I)&=&\\int \\int p(Z|X,Y,I) \\times p(X,Y|I) dXdY \\\\\\\\\n",
    "&=&\\int \\int \\delta(Z-f(X,Y)) \\times p(X,Y|I) dXdY\n",
    "\\end{eqnarray}\n",
    "\n",
    "Say, $Z=X-Y$, and $X$ and $Y$ are independent, then $p(X,Y|I)=p(X|I)p(Y|I)$\n",
    "and we have \n",
    "\\begin{eqnarray}\n",
    "p(Z|I) &=& \\int dX p(X,I) \\int dY p(Y|I)\\delta(Z-X+Y) \\\\\\\\\n",
    "&=&  \\int dX p(X,I)p(Y=X-Z|I)\n",
    "\\end{eqnarray}\n",
    "\n",
    "Further, if the probabilities are Gaussian, then we have\n",
    "\n",
    "\\begin{eqnarray}\n",
    "p(Z|I) &=& \\frac{1}{2\\pi\\sigma_x\\sigma_y}\\int_{-\\infty}^{\\infty} dX\n",
    "e^{-(X-\\mu_x)^2/2\\sigma_x^2}\\times e^{-(X-Z-\\mu_y)^2/2\\sigma_y^2} \n",
    "\\end{eqnarray}\n",
    "One can do some pretty boring algebra at this point (factoring the exponents),\n",
    "or use a program like *xmaxima*:\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "    (C1) ASSUME_POS:TRUE;\n",
    "    (D1)                                 TRUE\n",
    "    (C2) 1/(2*%PI)/sx/sy*integrate(exp(-(x-xo)^2/(2*sx^2))*\n",
    "               exp(-(x-z-yo)^2/(2*sy^2)),x,-inf,inf);\n",
    "\n",
    "                             2                       2               2\n",
    "                            z  + (2 yo - 2 xo) z + yo  - 2 xo yo + xo\n",
    "                          - ------------------------------------------\n",
    "                                              2       2\n",
    "                                          2 sy  + 2 sx\n",
    "                SQRT(2) %E\n",
    "    (D2)        ------------------------------------------------------\n",
    "                                                2     2\n",
    "                             2 SQRT(%PI) SQRT(sy  + sx )\n",
    "\n",
    "    (C3) factor(z^2+(2*yo-2*xo)*z+yo^2-2*xo*yo+xo^2);\n",
    "                                                 2\n",
    "    (D3)                            (z + yo - xo)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "So we get\n",
    "\n",
    "\\begin{eqnarray}\n",
    "p(Z|I) &=&\n",
    "\\frac{1}{\\sqrt{2\\pi(\\sigma_x^2+\\sigma_y^2)}}\n",
    "e^{-(z-(\\mu_x-\\mu_y))^2/2(\\sigma_x^2+\\sigma_y^2)} \\\\\\\\\n",
    "&=&\\frac{1}{\\sqrt{2\\pi\\sigma_z}}\n",
    "e^{-(z-\\mu_z)^2/2\\sigma_z} \\\\\\\\ \\mbox{ where }\n",
    "\\mu_z&\\equiv& \\mu_x-\\mu_y \\\\\\\\\n",
    "\\sigma_z^2&\\equiv&\\sigma_x^2+\\sigma_y^2\n",
    "\\end{eqnarray}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Continuing with Paired Data\n",
    "\n",
    "Changing variables to $\\delta_k$, it is clear that the likelihood for\n",
    "$\\delta_k$ is the same form as $\\delta_x$ and $\\delta_y$.  Thus we have the\n",
    "*exact same* results on the paired difference, both for known and unknown\n",
    "$\\sigma$, quoted in z-test and t-test sections.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Difference of Means, $\\delta\\equiv \\mu_x - \\mu_y$, known $\\sigma_x$ and $\\sigma_y$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Again, the change of variables trick works, but since we are given the means\n",
    "($\\mu_x$ and $\\mu_y$) we need to use the posterior distributions,\n",
    "$p(\\mu_x|\\bvec{x},\\sigma_x,I)$ and $p(\\mu_y|\\bvec{y},\\sigma_y,I)$.\n",
    "\n",
    "\\begin{eqnarray}\n",
    "p(\\mu_x|\\bvec{x},\\sigma_x,I)&=& \n",
    "\\sqrt{\\frac{n}{2\\pi \\sigma_x^2}}e^{-n(\\bar{x}-\\mu_x)^2/2\\sigma_x^2} \\\\\\\\\n",
    "p(\\mu_y|\\bvec{y},\\sigma_y,I)&=& \n",
    "\\sqrt{\\frac{m}{2\\pi \\sigma_y^2}}e^{-n(\\bar{y}-\\mu_y)^2/2\\sigma_y^2}\n",
    "\\end{eqnarray}\n",
    "\n",
    "Performing the change of variables to $\\delta \\equiv \\mu_x-\\mu_y$ we get\n",
    "\n",
    "\\begin{eqnarray}\n",
    "p(\\delta|\\bvec{x},\\bvec{y},\\sigma_x,\\sigma_y,I)&=&\n",
    "\\frac{\\sqrt{nm}}{2\\pi\\sigma_x\\sigma_y}\\int d\\mu_y\n",
    "e^{-n(\\bar{x}-\\delta-\\mu_y)^2/2\\sigma_x^2}\n",
    "e^{-m(\\bar{y}-\\mu_y)^2/2\\sigma_y^2} \n",
    "\\end{eqnarray}\n",
    "\n",
    "Again, using *xmaxima*,\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "    (C1) ASSUME_POS:TRUE;\n",
    "\n",
    "    (D1)                                 TRUE\n",
    "    (C2) f(d):=sqrt(n*m)/(2*%PI*sx*sy)*integrate(exp(-n*(xbar-d-my)^2/(2*sx^2))*\n",
    "                      exp(-m*(ybar-my)^2/(2*sy^2)),my,-inf,inf);\n",
    "    f(d);\n",
    "\n",
    "                                                                2\n",
    "                  SQRT(n m)                (- n) (xbar - d - my)\n",
    "    (D2) f(d) := ----------- INTEGRATE(EXP(----------------------)\n",
    "                 2 %PI sx sy                           2\n",
    "                                                   2 sx\n",
    "\n",
    "                                                                2\n",
    "                                               (- m) (ybar - my)\n",
    "                                           EXP(------------------), my, - INF, INF)\n",
    "                                                         2\n",
    "                                                     2 sy\n",
    "    (C3) \n",
    "                                                     2\n",
    "    (D3) SQRT(2) SQRT(m) SQRT(n) EXPT(%E, - (m n ybar\n",
    "\n",
    "                                               2                   2\n",
    "     + (- 2 m n xbar + 2 d m n) ybar + m n xbar  - 2 d m n xbar + d  m n)\n",
    "\n",
    "            2         2                         2       2\n",
    "    /(2 n sy  + 2 m sx ))/(2 SQRT(%PI) SQRT(n sy  + m sx ))\n",
    "    (C4) factor((m*n)*ybar^2+(-2*m*n*xbar+2*d*m*n)*ybar+m*n*xbar^2-2*d*m*n*xbar+m*n*d^2);\n",
    "\n",
    "                                                     2\n",
    "    (D4)                        m n (ybar - xbar + d)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Rewritten, this is\n",
    "\n",
    "\\begin{eqnarray}\n",
    "p(\\delta|\\bvec{x},\\bvec{y},\\sigma_x,\\sigma_y,I)&=&\n",
    "\\sqrt{\\frac{nm}{2\\pi(n\\sigma_x^2+m\\sigma_y^2)}}\n",
    "    e^{-mn(\\delta-(\\bar{x}-\\bar{y}))^2/2(n\\sigma_x^2+m\\sigma_y^2)}\n",
    "\\end{eqnarray}\n",
    "\n",
    "or\n",
    "\n",
    "\\begin{eqnarray}\n",
    "\\mu_\\delta &\\equiv& \\mu_x-\\mu_y \\\\\n",
    "\\sigma_\\delta &\\equiv& \\frac{\\sigma_x^2}{n}+\\frac{\\sigma_y^2}{m}\\\\\n",
    "p(\\delta|\\bvec{x},\\bvec{y},\\sigma_x,\\sigma_y,I)&=&\n",
    "\\frac{1}{\\sqrt{2\\pi\\sigma_\\delta^2}}\n",
    "    e^{-(\\delta-\\mu_\\delta)^2/2\\sigma_\\delta^2}\n",
    "\\end{eqnarray}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Difference of Means, $\\delta\\equiv \\mu_x - \\mu_y$, unknown $\\sigma_x$ and $\\sigma_y$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Making definitions as before for the $t$ distribution for each variable\n",
    "\n",
    "\\begin{eqnarray}\n",
    "t_x&\\equiv&\\frac{\\mu_x-\\bar{x}}{S_x/\\sqrt{n}} \\\\\\\\\n",
    "t_y&\\equiv&\\frac{\\mu_y-\\bar{y}}{S_y/\\sqrt{n}} \\\\\\\\\n",
    "S_x^2&\\equiv&\\frac{1}{(n-1)}\\sum_{k=1}^{n} (x_k-\\mu_x)^2 \\\\\\\\\n",
    "S_y^2&\\equiv&\\frac{1}{(m-1)}\\sum_{k=1}^{m} (y_k-\\mu_y)^2\n",
    "\\end{eqnarray}\n",
    "From the addition of variables we get\n",
    "\\begin{eqnarray}\n",
    "t&\\equiv&\\frac{\\delta-(\\bar{x}-\\bar{y})}{\\sqrt{S_x^2/m+S_y^2/n}} \\\\\\\\\n",
    "\\tan \\theta &\\equiv& \\frac{S_x/\\sqrt{n}}{S_y/\\sqrt{m}}\n",
    "\\end{eqnarray}\n",
    "\n",
    "$\\tan \\theta$ depends on the data, and $t_x$, and $t_y$ are known, so the\n",
    "distribution for $t$ should be known.  It is named the Behren's distribution.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---------------------"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>\n",
       "  @font-face {\n",
       "    font-family: \"Computer Modern\";\n",
       "    src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');\n",
       "  }\n",
       "  @font-face {\n",
       "    font-family: \"Computer Modern\";\n",
       "    src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunsx.otf');\n",
       "    font-weight: bold;\n",
       "  }\n",
       "  @font-face {\n",
       "    font-family: \"Computer Modern\";\n",
       "    src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunsi.otf');\n",
       "    font-style: italic, oblique;\n",
       "  }\n",
       "  @font-face {\n",
       "    font-family: \"Computer Modern\";\n",
       "    src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunbxo.otf');\n",
       "    font-weight: bold;\n",
       "    font-style: italic, oblique;\n",
       "  }\n",
       "\n",
       "    div.cell{\n",
       "        width:800px;\n",
       "        margin-left:16% !important;\n",
       "        margin-right:auto;\n",
       "    }\n",
       "    h1 {\n",
       "        font-family: Garamond, Times, serif;\n",
       "    }\n",
       "    h4{\n",
       "        margin-top:12px;\n",
       "        margin-bottom: 3px;\n",
       "       }\n",
       "    div.text_cell_render{\n",
       "        font-family: Garamond, Times, serif;\n",
       "        line-height: 145%;\n",
       "        font-size: 130%;\n",
       "        width:800px;\n",
       "        margin-left:auto;\n",
       "        margin-right:auto;\n",
       "    }\n",
       "    .CodeMirror{\n",
       "            font-family: \"Source Code Pro\", source-code-pro,Consolas, monospace;\n",
       "    }\n",
       "    .prompt{\n",
       "        display: None;\n",
       "    }\n",
       "    .text_cell_render h5 {\n",
       "        font-weight: 300;\n",
       "        font-size: 22pt;\n",
       "        color: #4057A1;\n",
       "        font-style: italic;\n",
       "        margin-bottom: .5em;\n",
       "        margin-top: 0.5em;\n",
       "        display: block;\n",
       "    }\n",
       "    \n",
       "    .warning{\n",
       "        color: rgb( 240, 20, 20 )\n",
       "        }  \n",
       "</style>\n",
       "<script>\n",
       "    MathJax.Hub.Config({\n",
       "                TeX: {\n",
       "                    extensions: [\"AMSmath.js\"]\n",
       "                },\n",
       "                tex2jax: {\n",
       "                    inlineMath: [ ['$','$'], [\"\\\\(\",\"\\\\)\"] ],\n",
       "                    displayMath: [ ['$$','$$'], [\"\\\\[\",\"\\\\]\"] ]\n",
       "                },\n",
       "                displayAlign: 'center', // Change this to 'center' to center equations.\n",
       "                \"HTML-CSS\": {\n",
       "                    styles: {'.MathJax_Display': {\"margin\": 4}}\n",
       "                }\n",
       "        });\n",
       "</script>\n"
      ],
      "text/plain": [
       "<IPython.core.display.HTML at 0x105c39e10>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from IPython.core.display import HTML\n",
    "\n",
    "\n",
    "def css_styling():\n",
    "    styles = open(\"../styles/custom.css\", \"r\").read()\n",
    "    return HTML(styles)\n",
    "css_styling()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}