--- title: "Analysis of variance" editor: markdown: wrap: 72 --- ## Packages ```{r} library(tidyverse) library(smmr) library(PMCMRplus) ``` ## Jumping rats - Link between exercise and healthy bones (many studies). - Exercise stresses bones and causes them to get stronger. - Study (Purdue): effect of jumping on bone density of growing rats. - 30 rats, randomly assigned to 1 of 3 treatments: - No jumping (control) - Low-jump treatment (30 cm) - High-jump treatment (60 cm) - 8 weeks, 10 jumps/day, 5 days/week. - Bone density of rats (mg/cm$^3$) measured at end. ## Jumping rats 2/2 - See whether larger amount of exercise (jumping) went with higher bone density. - Random assignment: rats in each group similar in all important ways. - So entitled to draw conclusions about cause and effect. ## Reading the data Values separated by spaces: ```{r inference-5-R-9} my_url <- "http://ritsokiguess.site/datafiles/jumping.txt" rats <- read_delim(my_url," ") ``` ## The data (some random rows) ```{r inference-5-R-10} rats %>% slice_sample(n=10) rats ``` ## Boxplots ```{r inference-5-R-11, fig.height=3.7} ggplot(rats, aes(y=density, x=group)) + geom_boxplot() ``` ## Or, arranging groups in data (logical) order ```{r inference-5-R-12, fig.height=3.5} ggplot(rats, aes(y=density, x=fct_inorder(group))) + geom_boxplot() ``` ## Analysis of Variance - Comparing \> 2 groups of independent observations (each rat only does one amount of jumping). - Standard procedure: analysis of variance (ANOVA). - Null hypothesis: all groups have same mean. - Alternative: "not all means the same", at least one is different from others. ## Testing: ANOVA in R ```{r inference-5-R-13} rats.aov <- aov(density~group,data=rats) summary(rats.aov) ``` - Usual ANOVA table, small P-value: significant result. - Conclude that the mean bone densities are not all equal. - Reject null, but not very useful finding. ## Which groups are different from which? - ANOVA really only answers half our questions: it says "there are differences", but doesn't tell us which groups different. - One possibility (not the best): compare all possible pairs of groups, via two-sample t. - First pick out each group: ```{r inference-5-R-14} rats %>% filter(group=="Control") -> controls rats %>% filter(group=="Lowjump") -> lows rats %>% filter(group=="Highjump") -> highs ``` ## Control vs. low ```{r inference-5-R-15} t.test(controls$density, lows$density) ``` No sig. difference here. ## Control vs. high ```{r inference-5-R-16} t.test(controls$density, highs$density) ``` These are different. ## Low vs. high ```{r inference-5-R-17} t.test(lows$density, highs$density) ``` These are different too. ## But... - We just did 3 tests instead of 1. - So we have given ourselves 3 chances to reject $H_0:$ all means equal, instead of 1. - Thus $\alpha$ for this combined test is not 0.05. ## John W. Tukey ::: columns ::: {.column width="40%"} ![](John_Tukey.jpg){width="400"} ::: ::: {.column width="60%"} - American statistician, 1915--2000 - Big fan of exploratory data analysis - Popularized boxplot - Invented "honestly significant differences" - Invented jackknife estimation - Coined computing term "bit" - Co-inventor of Fast Fourier Transform ::: ::: ## Honestly Significant Differences - Compare several groups with one test, telling you which groups differ from which. - Idea: if all population means equal, find distribution of highest sample mean minus lowest sample mean. - Any means unusually different compared to that declared significantly different. ## Tukey on rat data ```{r inference-5-R-18, echo=F} width <- getOption("width") options(width = 60) ``` ```{r inference-5-R-19} rats.aov <- aov(density~group, data = rats) summary(rats.aov) TukeyHSD(rats.aov) ``` ```{r inference-5-R-20, echo=F} options(width=width) ``` - Again conclude that bone density for highjump group significantly higher than for other two groups. ## Why Tukey's procedure better than all t-tests Look at P-values for the two tests: ``` Comparison Tukey t-tests ---------------------------------- Highjump-Control 0.0016 0.0021 Lowjump-Control 0.4744 0.2977 Lowjump-Highjump 0.0298 0.0045 ``` - Tukey P-values (mostly) higher. - Proper adjustment for doing three t-tests at once, not just one in isolation. ## Checking assumptions ```{r inference-5-R-21} #| fig.height = 4 ggplot(rats,aes(y = density, x = fct_inorder(group)))+ geom_boxplot() ``` Assumptions: - Normally distributed data within each group - with equal group SDs. ## Normal quantile plots by group ```{r inference-5-R-22, fig.height=3.5} ggplot(rats, aes(sample = density)) + stat_qq() + stat_qq_line() + facet_wrap( ~ group) ``` ## The assumptions - Normally-distributed data within each group - Equal group SDs. - These are shaky here because: - control group has outliers - highjump group appears to have less spread than others. - Possible remedies (in general): - Transformation of response (usually works best when SD increases with mean) - If normality OK but equal spreads not, can use Welch ANOVA. (Regular ANOVA like pooled t-test; Welch ANOVA like Welch-Satterthwaite t-test.) - Can also use Mood's Median Test (see over). This works for any number of groups. ## Mood's median test here - Find median of all bone densities, regardless of group - Count up how many observations in each group above or below overall median - Test association between group and being above/below overall median, using chi-squared test. - Actually do this using `median_test`: ```{r inference-5-R-27} median_test(rats, density, group) ``` ## Comments - No doubt that medians differ between groups (not all same). - This test is equivalent of $F$-test, not of Tukey. - To determine which groups differ from which, can compare all possible pairs of groups via (2-sample) Mood's median tests, then adjust P-values by multiplying by number of 2-sample Mood tests done (Bonferroni): ```{r inference-5-R-28} pairwise_median_test(rats, density, group) ``` - Now, lowjump-highjump difference no longer significant. ## Welch ANOVA - For these data, Mood's median test probably best because we doubt both normality and equal spreads. - When normality OK but spreads differ, Welch ANOVA way to go. - Welch ANOVA done by `oneway.test` as shown (for illustration): ```{r inference-5-R-29} oneway.test(density~group, data=rats) ``` - P-value very similar, as expected. - Appropriate Tukey-equivalent here called Games-Howell. ## Games-Howell - Lives in package `PMCMRplus`. Install first. ```{r games-howell, warning=F} gamesHowellTest(density ~ factor(group), data = rats) ``` ## Deciding which test to do For two or more samples: ![](testflow.png)