{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CHAPTER 16. Metric-Predicted Variable on One or Two Groups" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Psygrammer / Bayesian R study group [1]\n", "* Moosung Kim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Contents" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 16.1. Estimating the Mean and Standard Deviation of a Normal Distribution\n", "* 16.2. Outliers and Robust Estimation: The t Distribution\n", "* 16.3. Two Groups\n", "* 16.4. Other Noise Distributions and Transforming Data\n", "* 16.5. EXERCISES" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this chapter, we consider a situation in which we have a metric-predicted variable that is observed for items from one or two groups." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* For example, we could measure the blood pressure (i.e., a metric variable) of people randomly sampled from first-year university students (i.e., a single group).\n", " - In this case, we might be interested in how much the group’s typical blood pressure differs from the recommended value for people of that age as published by a federal agency." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* As another example, we could measure the IQ (i.e., a metric variable) of people randomly sampled from everyone self-described as vegetarian (i.e., a single group).\n", " - In this case, we could be interested in how much this group’s IQ differs from the general population’s average IQ of 100."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* In the context of the generalized linear model (GLM) introduced in the previous chapter, this chapter’s situation involves the most trivial cases of the linear core of the GLM, as indicated in the left cells of Table 15.1 (p. 434), with a link function that is the identity, along with a normal distribution for describing noise in the data, as indicated in the first row of Table 15.2 (p. 443)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* We will explore options for the prior distribution on parameters of the normal distribution, and methods for Bayesian estimation of the parameters.\n", "* We will also consider alternative noise distributions for describing data that have outliers." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 16.1. Estimating the Mean and Standard Deviation of a Normal Distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 16.1.1 Solution by mathematical analysis\n", "* 16.1.2 Approximation by MCMC in JAGS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The normal distribution specifies the probability density of a value y, given the values of two\n", "parameters, the mean μ and standard deviation σ:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ p(y|\\mu,\\sigma) = \\frac{1}{\\sigma\\sqrt{2\\pi}} \\exp\\left( -\\frac{(y-\\mu)^2}{2\\sigma^2} \\right) \\tag{16.1} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* To get an intuition for the normal distribution as a likelihood function, consider three data values y1 = 85, y2 = 100, and y3 = 115, which are plotted as large dots in Figure 16.1." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Figure 16.1 shows p(D|μ, σ) for different values of μ and σ. 
As you can see, there are values of μ and σ that make the data most probable, but other nearby values also accommodate the data reasonably well." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The question is, given the data, how should we allocate credibility to combinations of μ and σ?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* likelihood\n", " - Figure 16.1 shows examples of p(D|μ, σ) for a particular data set at different values of μ and σ.\n", "* prior\n", " - The prior, p(μ, σ), specifies the credibility of each combination of μ, σ values in the two-dimensional joint parameter space, without the data.\n", "* posterior\n", " - Bayes’ rule says that the posterior credibility of each combination of μ, σ values is the prior credibility times the likelihood, normalized by the marginal likelihood:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ p(\\mu,\\sigma|D) = \\frac{p(D|\\mu,\\sigma)\\, p(\\mu,\\sigma)}{\\iint d\\mu\\, d\\sigma\\, p(D|\\mu,\\sigma)\\, p(\\mu,\\sigma)} \\tag{16.2} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our goal now is to evaluate Equation 16.2 for reasonable choices of the prior distribution, p(μ, σ)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 16.1.1 Solution by mathematical analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* We take a short algebraic tour before moving on to MCMC implementations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### When σ is fixed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Reference: Conjugate prior\n", " - https://en.wikipedia.org/wiki/Conjugate_prior" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "* It is convenient first to consider the case in which the standard deviation of the likelihood function is fixed at a specific value. In other words, the prior distribution on σ is a spike over that specific value. We’ll denote that fixed value as σ = Sy. 
\n", "* With this simplifying assumption, we are only estimating μ because we are assuming perfectly certain prior knowledge about σ.\n", "* prior: When σ is fixed, the prior distribution on μ in Equation 16.2 can easily be chosen to be conjugate to the normal likelihood.\n", " - The term “conjugate prior” was defined in Section 6.2, p. 126.\n", " - It turns out that the product of normal distributions is again a normal distribution; \n", " - in other words, if the prior on μ is normal, then the posterior on μ is normal.\n", " - Let the prior distribution on μ be normal with mean Mμ and standard deviation Sμ.\n", "* likelihood × prior:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ \\mathrm{normal}(y|\\mu,\\sigma) \\times \\mathrm{normal}(\\mu|M_\\mu,S_\\mu) \\propto \\mathrm{normal}(\\mu|\\mu_{\\mathrm{post}},\\sigma_{\\mathrm{post}}) $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### precision & posterior precision" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ \\frac{1}{\\sigma_{\\mathrm{post}}^2} = \\frac{1}{S_\\mu^2} + \\frac{1}{\\sigma^2} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Thus, the posterior precision is the sum of the prior precision and the likelihood precision." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### posterior mean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ \\mu_{\\mathrm{post}} = \\frac{1/S_\\mu^2}{1/S_\\mu^2 + 1/\\sigma^2}\\, M_\\mu + \\frac{1/\\sigma^2}{1/S_\\mu^2 + 1/\\sigma^2}\\, y $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* In other words, the posterior mean is a weighted average of the prior mean and the datum, with the weighting corresponding to the relative precisions of the prior and the likelihood."
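] }, { "cell_type": "markdown", "metadata": {}, "source": [
"A quick numerical check of the single-datum update above (not from the book, whose code is in R/JAGS; the values Mμ = 100, Sμ = 15, σ = 15, y = 115 are made up for illustration):\n",
"\n",
"```python\n",
"import math\n",
"\n",
"# Prior on mu is normal(M_mu, S_mu); likelihood sd sigma is fixed; one datum y.\n",
"M_mu, S_mu = 100.0, 15.0\n",
"sigma, y = 15.0, 115.0\n",
"\n",
"# Posterior precision = prior precision + likelihood precision.\n",
"post_prec = 1 / S_mu**2 + 1 / sigma**2\n",
"post_sd = math.sqrt(1 / post_prec)\n",
"\n",
"# Posterior mean = precision-weighted average of prior mean and datum.\n",
"post_mean = ((1 / S_mu**2) * M_mu + (1 / sigma**2) * y) / post_prec\n",
"\n",
"print(post_mean, post_sd)  # 107.5, about 10.61\n",
"```\n",
"\n",
"Because the prior and the likelihood have equal precision here, the posterior mean lands exactly halfway between Mμ and y."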
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### The case of N values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ p(D|\\mu,\\sigma) = \\prod_{i=1}^{N} \\mathrm{normal}(y_i|\\mu,\\sigma) $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### posterior mean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ \\mu_{\\mathrm{post}} = \\frac{1/S_\\mu^2}{1/S_\\mu^2 + N/\\sigma^2}\\, M_\\mu + \\frac{N/\\sigma^2}{1/S_\\mu^2 + N/\\sigma^2}\\, \\bar{y} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### posterior precision" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ \\frac{1}{\\sigma_{\\mathrm{post}}^2} = \\frac{1}{S_\\mu^2} + \\frac{N}{\\sigma^2} $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Notice that as the sample size N increases, the posterior mean is dominated by the data mean.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### When μ is fixed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* We can instead estimate the σ parameter when μ is fixed.\n", "* It turns out that when μ is fixed, a conjugate prior for the precision is the gamma distribution (e.g., Gelman et al., 2013, p. 43).\n", " - It is important to understand the meaning of a gamma prior on precision.\n", " - Consider a gamma distribution that is loaded heavily over very small values, but has a long, shallow tail extending over large values.\n", " - This sort of gamma distribution on precision indicates that we believe most strongly in small precisions, but we admit that large precisions are possible.\n", " - If this is a belief about the precision of a normal likelihood function, then this sort of gamma distribution expresses a belief that the data will be more spread out, because small precisions imply large standard deviations.\n", " - If the gamma distribution is instead loaded over large values of precision, it expresses a belief that the data will be tightly clustered."
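] }, { "cell_type": "markdown", "metadata": {}, "source": [
"The N-value update can be probed numerically. This Python sketch (not from the book; the prior and data values are made up) shows the posterior mean moving from the prior mean toward the data mean as N grows:\n",
"\n",
"```python\n",
"# Conjugate-normal posterior mean for N observations with fixed likelihood sd sigma.\n",
"def posterior_mean(M_mu, S_mu, sigma, ybar, N):\n",
"    prior_prec = 1 / S_mu**2\n",
"    data_prec = N / sigma**2  # precision contributed by N observations\n",
"    return (prior_prec * M_mu + data_prec * ybar) / (prior_prec + data_prec)\n",
"\n",
"# Prior centered at 100; suppose the data mean is 115.\n",
"for N in [1, 10, 100, 1000]:\n",
"    print(N, posterior_mean(100.0, 15.0, 15.0, 115.0, N))\n",
"# N = 1 gives 107.5; by N = 1000 the posterior mean is within 0.1 of the data mean 115.\n",
"```"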
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Reference: the gamma distribution\n", "\n", "* https://en.wikipedia.org/wiki/Gamma_distribution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### conjugate priors & gamma distribution & precision" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Because of its role in conjugate priors for the normal likelihood function, the gamma distribution is routinely used as a prior on precision.\n", "* But there is no logical necessity to do so, and modern MCMC methods permit more flexible specification of priors.\n", "* Indeed, because precision is less intuitive than standard deviation, it can be more useful instead to give the standard deviation a uniform prior that spans a wide range." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Summary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We have assumed that the data are generated by a normal likelihood function,\n", "parameterized by a mean μ and standard deviation σ, and denoted y ∼ normal(y|μ, σ).\n", "For purposes of mathematical derivation, we made the unrealistic assumption that the prior\n", "distribution is either a spike on σ or a spike on μ, in order to make three main\n", "points:\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. A natural way to express a prior on μ is with a normal distribution, because this is conjugate with the normal likelihood when its standard deviation is fixed.\n", "2. A way to express a prior on the precision 1/σ² is with a gamma distribution, because this is conjugate with the normal likelihood when its mean is fixed. However, in practice the standard deviation can instead be given a uniform prior (or anything else that reflects prior beliefs, of course).\n", "3. 
The formulas for Bayesian updating of the parameter distribution are more conveniently expressed in terms of precision than standard deviation. Normal distributions are described sometimes in terms of standard deviation and sometimes in terms of precision, so it is important to glean from context which is being referred to. In R and Stan, the normal distribution is parameterized by mean and standard deviation. In JAGS and BUGS, the normal distribution is parameterized by mean and precision." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 16.1.2 Approximation by MCMC in JAGS" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is easy to estimate the mean and standard deviation in JAGS." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The right panel of Figure 16.2 instead puts a broad uniform distribution directly\n", "on σ . The low and high values of the uniform distribution are set to be far outside any realistic\n", "value for the data, so that the prior has minimal influence on the posterior. \n", "* The uniform prior on σ is easier to intuit than a gamma prior on precision, but the priors are not equivalent." 
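] }, { "cell_type": "markdown", "metadata": {}, "source": [
"Before looking at the JAGS model, it is worth verifying that the sd and precision parameterizations of the normal density describe the same distribution. A small Python check (not from the book; the values are arbitrary):\n",
"\n",
"```python\n",
"import math\n",
"\n",
"# Normal density parameterized by standard deviation (as in R and Stan).\n",
"def dnorm_sd(y, mu, sd):\n",
"    return math.exp(-0.5 * ((y - mu) / sd)**2) / (sd * math.sqrt(2 * math.pi))\n",
"\n",
"# Normal density parameterized by precision tau = 1/sd^2 (as in JAGS and BUGS).\n",
"def dnorm_prec(y, mu, tau):\n",
"    return math.sqrt(tau / (2 * math.pi)) * math.exp(-0.5 * tau * (y - mu)**2)\n",
"\n",
"sd = 15.0\n",
"print(math.isclose(dnorm_sd(110.0, 100.0, sd), dnorm_prec(110.0, 100.0, 1 / sd**2)))  # True\n",
"```\n",
"\n",
"This is why the model block below writes 1/sigma^2 where R code would pass sigma."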
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "dataList = list(\n", "  y = y ,\n", "  Ntotal = length(y) ,\n", "  meanY = mean(y) ,\n", "  sdY = sd(y)\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "model {\n", "  for ( i in 1:Ntotal ) {\n", "    y[i] ~ dnorm( mu , 1/sigma^2 )  # JAGS uses precision\n", "  }\n", "  mu ~ dnorm( meanY , 1/(100*sdY)^2 )  # JAGS uses precision\n", "  sigma ~ dunif( sdY/1000 , sdY*1000 )\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* For purposes of illustration, we use fictitious data.\n", "* The data are\n", " - IQ (intelligence quotient) scores\n", " - from a group of people who have consumed a “smart drug.”\n", " - We know that IQ tests have been normed to the general population so that they have an average score of 100 and a standard deviation of 15.\n", "* Therefore, we would like to know how differently the smart-drug group has performed relative to the general-population average." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Jags-Ymet-Xnom1grp-Mnormal-Example.R" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Not in the book: switch the working directory to the data folder\n", "# so the scripts placed there for the exercise can be run.\n", "cur_dir = getwd()\n", "setwd(sprintf(\"%s/%s\", cur_dir, 'data'))" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Example for Jags-Ymet-Xnom1grp-Mnormal.R \n", "#------------------------------------------------------------------------------- \n", "# Optional generic preliminaries:\n", "#graphics.off() # This closes all of R's graphics windows.\n", "#rm(list=ls()) # Careful! This clears all of R's memory!"
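] }, { "cell_type": "markdown", "metadata": {}, "source": [
"As a sketch of what the dataList feeds into the model, the Python snippet below (not from the book) mirrors meanY and sdY and the vague-prior constants derived from them; the scores are a hypothetical stand-in for the y vector read from the CSV:\n",
"\n",
"```python\n",
"import statistics\n",
"\n",
"# Hypothetical stand-in for the y vector read from TwoGroupIQ.csv.\n",
"y = [102, 107, 92, 101, 110, 68, 119]\n",
"\n",
"meanY = statistics.mean(y)\n",
"sdY = statistics.stdev(y)  # sample sd with n-1 denominator, like R's sd()\n",
"\n",
"# Constants the model statement turns into broad, minimally informative priors:\n",
"mu_prior_sd = 100 * sdY                      # mu ~ dnorm(meanY, 1/(100*sdY)^2)\n",
"sigma_lo, sigma_hi = sdY / 1000, sdY * 1000  # sigma ~ dunif(sdY/1000, sdY*1000)\n",
"\n",
"# The uniform bounds bracket any realistic sigma by three orders of magnitude.\n",
"print(meanY, sdY, mu_prior_sd, (sigma_lo, sigma_hi))\n",
"```"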
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#------------------------------------------------------------------------------- \n", "# Load the data file \n", "myDataFrame = read.csv( file=\"TwoGroupIQ.csv\" )" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
Score | Group | |
---|---|---|
1 | 102 | Smart Drug |
2 | 107 | Smart Drug |
3 | 92 | Smart Drug |
4 | 101 | Smart Drug |
5 | 110 | Smart Drug |
6 | 68 | Smart Drug |
7 | 119 | Smart Drug |
8 | 106 | Smart Drug |
9 | 99 | Smart Drug |
10 | 103 | Smart Drug |
11 | 90 | Smart Drug |
12 | 93 | Smart Drug |
13 | 79 | Smart Drug |
14 | 89 | Smart Drug |
15 | 137 | Smart Drug |
16 | 119 | Smart Drug |
17 | 126 | Smart Drug |
18 | 110 | Smart Drug |
19 | 71 | Smart Drug |
20 | 114 | Smart Drug |
21 | 100 | Smart Drug |
22 | 95 | Smart Drug |
23 | 91 | Smart Drug |
24 | 99 | Smart Drug |
25 | 97 | Smart Drug |
26 | 106 | Smart Drug |
27 | 106 | Smart Drug |
28 | 129 | Smart Drug |
29 | 115 | Smart Drug |
30 | 124 | Smart Drug |
31 | 137 | Smart Drug |
32 | 73 | Smart Drug |
33 | 69 | Smart Drug |
34 | 95 | Smart Drug |
35 | 102 | Smart Drug |
36 | 116 | Smart Drug |
37 | 111 | Smart Drug |
38 | 134 | Smart Drug |
39 | 102 | Smart Drug |
40 | 110 | Smart Drug |
41 | 139 | Smart Drug |
42 | 112 | Smart Drug |
43 | 122 | Smart Drug |
44 | 84 | Smart Drug |
45 | 129 | Smart Drug |
46 | 112 | Smart Drug |
47 | 127 | Smart Drug |
48 | 106 | Smart Drug |
49 | 113 | Smart Drug |
50 | 109 | Smart Drug |
51 | 208 | Smart Drug |
52 | 114 | Smart Drug |
53 | 107 | Smart Drug |
54 | 50 | Smart Drug |
55 | 169 | Smart Drug |
56 | 133 | Smart Drug |
57 | 50 | Smart Drug |
58 | 97 | Smart Drug |
59 | 139 | Smart Drug |
60 | 72 | Smart Drug |
61 | 100 | Smart Drug |
62 | 144 | Smart Drug |
63 | 112 | Smart Drug |
64 | 109 | Placebo |
65 | 98 | Placebo |
66 | 106 | Placebo |
67 | 101 | Placebo |
68 | 100 | Placebo |
69 | 111 | Placebo |
70 | 117 | Placebo |
71 | 104 | Placebo |
72 | 106 | Placebo |
73 | 89 | Placebo |
74 | 84 | Placebo |
75 | 88 | Placebo |
76 | 94 | Placebo |
77 | 78 | Placebo |
78 | 108 | Placebo |
79 | 102 | Placebo |
80 | 95 | Placebo |
81 | 99 | Placebo |
82 | 90 | Placebo |
83 | 116 | Placebo |
84 | 97 | Placebo |
85 | 107 | Placebo |
86 | 102 | Placebo |
87 | 91 | Placebo |
88 | 94 | Placebo |
89 | 95 | Placebo |
90 | 86 | Placebo |
91 | 108 | Placebo |
92 | 115 | Placebo |
93 | 108 | Placebo |
94 | 88 | Placebo |
95 | 102 | Placebo |
96 | 102 | Placebo |
97 | 120 | Placebo |
98 | 112 | Placebo |
99 | 100 | Placebo |
100 | 105 | Placebo |
101 | 105 | Placebo |
102 | 88 | Placebo |
103 | 82 | Placebo |
104 | 111 | Placebo |
105 | 96 | Placebo |
106 | 92 | Placebo |
107 | 109 | Placebo |
108 | 91 | Placebo |
109 | 92 | Placebo |
110 | 123 | Placebo |
111 | 61 | Placebo |
112 | 59 | Placebo |
113 | 105 | Placebo |
114 | 184 | Placebo |
115 | 82 | Placebo |
116 | 138 | Placebo |
117 | 99 | Placebo |
118 | 93 | Placebo |
119 | 93 | Placebo |
120 | 72 | Placebo |