--- title: "Hierarchical Models" output: pdf_document: default --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = F) library(tidyverse) library(arm) library(knitr) library(kableExtra) library(lme4) library(rstan) library(rstanarm) options(mc.cores=parallel::detectCores()) ``` ### Motivating Dataset Recall the housing dataset from King County, WA that contains sales prices of homes across the Seattle area. \vfill ```{r, message = F} seattle <- read_csv("http://math.montana.edu/ahoegh/teaching/stat408/datasets/SeattleHousing.csv") seattle <- seattle %>% mutate(zipcode = factor(zipcode), sqft_living_sq = sqft_living ^2, sqft1000 = sqft_living / 1000, price100000 = price / 100000, scale_sqft = scale(sqft_living)) ``` ```{r, echo = F} seattle %>% ggplot(aes(y = price100000, x = sqft1000, color = zipcode)) + geom_point() + geom_smooth(method = 'lm', formula = 'y~x') + theme_bw() + facet_wrap(.~zipcode) + theme(legend.position = "none") + ggtitle('Housing price vs. Living Square Feet in King County, WA') + ylab('Sales Price ($100,000)') + xlab('Living Space (1000 sqft)') + theme(axis.text.x = element_text(angle = 270, hjust = 1)) ``` \vfill ### Multilevel models While we will initially just look at a model with the varying intercepts, \vfill There are several different, but equivalent specifications in GH 12.5, but here is one way to look at the model. ### lmer One common approach for hierarchical models is to use the `lmer` function in the `lme4` package. Note that the hierarchical structure we have detailed can also be applied to GLMs using `glmer`. Note that most of this code (and the textbook) is "pre-rstanarm", so it might be more intuitive to use `stan_glmer`, which we will also look at a Bayesian version in a little bit using `stan_glmer`. \vfill ```{r, echo = T} lmer1 <- lmer(price ~ (1 | zipcode) , data = seattle) display(lmer1) coef(lmer1) ``` Note the coefficients for a specific group are defined as the fixed effect + the random effect. ```{r, echo = T} fixef(lmer1) ``` The fixed effect here corresponds to $\mu_\alpha$. The standard component associated with the random effect can also be extracted. ```{r, echo = T} sigma.hat(lmer1)$sigma$zipcode ``` \newpage ```{r, echo = T} ranef(lmer1) ``` #### Summarizing the model ```{r, echo = T} fixed_ci <- round(fixef(lmer1)['(Intercept)'] + c(-2,2) * se.fixef(lmer1)['(Intercept)']) ``` The 95% interval for the fixed effects intercept is (`r prettyNum(fixed_ci[1],big.mark=",",scientific=FALSE)`, `r prettyNum(fixed_ci[2],big.mark=",",scientific=FALSE)`). This can be interpreted as the overall mean price of a house. Formally, this is more the mean of the group means. \vfill The 95% intervals for the group effects (or deviations from the mean price) are: ```{r} tibble(zipcode = rownames(ranef(lmer1)$zipcode), lower = round(ranef(lmer1)$zipcode + -2 * se.ranef(lmer1)$zipcode), upper = round(ranef(lmer1)$zipcode + 2 * se.ranef(lmer1)$zipcode)) %>% kable(format.args = list(big.mark = ",")) ``` \vfill A more useful way to summarize the data would be to create 95% intervals for the overall intercept \vfill ```{r, echo = T} samples <- arm::sim(lmer1, n.sims = 1000) overall <- fixef(samples) group <- matrix(ranef(samples)$zipcode[,,1], nrow = 1000, ncol = ngrps(lmer1), byrow = F) group_totals <- group + matrix(overall, nrow = 1000, ncol = ngrps(lmer1)) ``` \newpage ```{r} group_int <- apply(group_totals, 2, quantile, probs = c(.025,.975) ) tibble(zipcode = rownames(ranef(lmer1)$zipcode), lower =group_int[1,], upper = group_int[2,]) %>% ggplot(aes(x = lower, xend = upper, y = zipcode, yend = zipcode, color = zipcode)) + theme_bw() + ggtitle('Mean housing price from multilevel model') + xlab('Closing Price (USD)') + scale_x_continuous(breaks = c(500000, 1000000, 2000000), label = c("500k", "1 million", "2 million"), limits = c(0, 3000000)) + geom_segment() + annotate('text', 2180000, 4.5, label ="Medina, WA") + geom_point(inherit.aes = F, aes(y = zipcode, x = price, color = zipcode), data = seattle, size = .2) + geom_segment(color = 'black') + labs(caption = "note: black bars represent confidence interval for mean price \n dots represent individual houses, where those more expensive than $3 million are excluded ") + theme(legend.position = "none") ``` ### Prediction Note the previous figure contains uncertainty for the mean price within a particular zipcode. Similar to before you could also make predictions for a new home in an existing dataset. \vfill ```{r, echo = T} sigma_alpha <- sigma.hat(lmer1)$sigma$zipcode mu_alpha <- fixef(lmer1)["(Intercept)"] rnorm(10, mu_alpha, sigma_alpha) alpha_samples <- rnorm(1000, mu_alpha, sigma_alpha) ``` \vfill ```{r, echo = T} sigma_y <- sigma.hat(lmer1)$sigma$data new_zip <- rnorm(1000, mean = alpha_samples, sd = sigma_y) ``` \newpage ```{r} tibble(price = new_zip) %>% ggplot(aes(x = price)) + geom_histogram(bins = 50) + theme_bw() + ggtitle("Estimated price distribution for a new zipcode in King County, WA") ``` #### Adding Coefficients The model we have just outlined does not include any additional covariates. \vfill ```{r, echo = T} lmer2 <- lmer(price ~ scale_sqft + (1 |zipcode), data = seattle) display(lmer2) ``` \newpage \vfill \vfill Note: you may have to adjust the REML and optimizer options to achieve convergence ```{r, echo = T, warning = T} lmer_nonconverge <- lmer(price ~ scale_sqft + (1 + scale_sqft|zipcode), data = seattle) ``` ```{r, echo = T} lmer3 <- lmer(price ~ scale_sqft + (1 + scale_sqft|zipcode), data = seattle, REML = FALSE) display(lmer3) ``` \vfill The fixed-effects or means of the group-level effects can be extracted. ```{r, echo = T} fixef(lmer3) ``` Similarly, the variance of those group-level effects can also be obtained from the model. ```{r, echo = T} sigma.hat(lmer3)$sigma ``` \newpage ## stan_glmer Similar to how we have used `stan_glm()`, we can also use `stan_glmer()` to fit these models. ```{r, echo = T} stan_lmer1 <- stan_glmer(price ~ (1 | zipcode) , data = seattle) ``` \vfill ```{r, echo = T} print(stan_lmer1) display(lmer1) ``` \newpage ```{r, echo = T} coef(stan_lmer1) coef(lmer1) ``` \newpage ```{r} summary(stan_lmer1, probs = c(0.025, 0.975)) ``` \newpage We can also directly extract the simulations from the stan object. ```{r} sims <- as.matrix(stan_lmer1) head(sims) ``` This can be used for generating predictions and credible intervals. \newpage ##### Final Connections __Group-level covariates__: \vfill __Interactions:__ \vfill __Shrinkage:__ \vfill $$\hat{\alpha}_j \approx \frac{\frac{n_j}{\sigma_y^2}\bar{y}_j + \frac{1}{\sigma^2_\alpha }\bar{y}_{all}}{\frac{n_j}{\sigma_y^2} + \frac{1}{\sigma^2_\alpha }}$$ where $\sigma^2_y$ is the variance of the data and $\sigma^2_\alpha$ is the variance of the group-level averages. \vfill __Selection of Random Effects:__ these varying effect models necessarily impose additional complexity on our modeling framework; however, GH suggest embracing the complexity (as it often helps directly answer research questions), moreover, they don't recommend using evidence statements to select specific random effects.