14.4 Getting additional information from ANOVA objects

You can get a lot of interesting information from ANOVA objects. To see everything that’s stored in one, run the names() command on an ANOVA object. For example, here’s what’s in our last ANOVA object:

# Show me what's in my aov object
names(cleaner.type.int.aov)
##  [1] "coefficients"  "residuals"     "effects"       "rank"          "fitted.values" "assign"        "qr"           
##  [8] "df.residual"   "contrasts"     "xlevels"       "call"          "terms"         "model"

For example, the "fitted.values" element contains the model fits for the dependent variable (time) for every observation in our dataset. We can add these fits back to the dataset with the $ operator and assignment. Let’s get the model fitted values from both the interaction model (cleaner.type.int.aov) and the non-interaction model (cleaner.type.aov) and assign them to new columns in the dataframe:

# Add the fits for the interaction model to the dataframe as int.fit

poopdeck$int.fit <- cleaner.type.int.aov$fitted.values

# Add the fits for the main effects model to the dataframe as me.fit

poopdeck$me.fit <- cleaner.type.aov$fitted.values

Now let’s look at the first few rows in the table to see the fits for the first few observations.

head(poopdeck)
##   day cleaner   type time int.fit me.fit
## 1   1       a parrot   47      46     54
## 2   1       b parrot   55      54     54
## 3   1       c parrot   64      56     47
## 4   1       a  shark  101      86     78
## 5   1       b  shark   76      77     77
## 6   1       c  shark   63      62     71
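The "residuals" element listed by names() is closely tied to these fits: each residual is the observed value minus the fitted value. The following is a minimal sketch of that relationship. It assumes R’s built-in warpbreaks dataset as a stand-in for poopdeck, and the names wb, wb.int.aov, int.fit and int.resid are made up for this illustration.

```r
# Stand-in data: R's built-in warpbreaks dataset (NOT the poopdeck data)
wb <- warpbreaks

# An interaction model, analogous to cleaner.type.int.aov
wb.int.aov <- aov(breaks ~ wool * tension, data = wb)

# Fitted values and residuals are both stored in the aov object
wb$int.fit <- wb.int.aov$fitted.values
wb$int.resid <- wb.int.aov$residuals

# Each residual should equal the observed value minus the model fit
all.equal(unname(wb$int.resid), unname(wb$breaks - wb$int.fit))

# Look at the first few rows with the new columns
head(wb)
```

The same pattern should carry over to the poopdeck models by swapping in cleaner.type.int.aov and the time column.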

You can use these fits to see how well (or poorly) the model(s) were able to fit the data. For example, we can calculate how far each model’s fits were from the true data as follows:

# How far were the interaction model fits from the data on average?

mean(abs(poopdeck$int.fit - poopdeck$time))
## [1] 15


# How far were the main effect model fits from the data on average?

mean(abs(poopdeck$me.fit - poopdeck$time))
## [1] 17

As you can see, the interaction model was off from the data by 15.35 minutes on average, while the main effects model was off by 16.54 minutes on average (shown rounded as 15 and 17 in the output above). This is not surprising: the interaction model contains every term in the main effects model plus the interaction term, so it has more flexibility to fit the data. However, just because the interaction model is better at fitting the data doesn’t necessarily mean that the interaction is either meaningful or reliable.
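To see why a closer fit from the more complex model is expected rather than informative, here is a minimal sketch that compares a main effects model and an interaction model on the same data. It again assumes the built-in warpbreaks dataset as a stand-in for poopdeck, and all object names (wb, wb.me.aov, wb.int.aov and so on) are hypothetical.

```r
# Stand-in data: R's built-in warpbreaks dataset (NOT the poopdeck data)
wb <- warpbreaks

# Main effects only model and interaction model
wb.me.aov <- aov(breaks ~ wool + tension, data = wb)
wb.int.aov <- aov(breaks ~ wool * tension, data = wb)

# Average absolute distance between each model's fits and the data
me.mae <- mean(abs(wb.me.aov$fitted.values - wb$breaks))
int.mae <- mean(abs(wb.int.aov$fitted.values - wb$breaks))

# Residual sum of squares for each model
me.rss <- sum(wb.me.aov$residuals ^ 2)
int.rss <- sum(wb.int.aov$residuals ^ 2)

# The model with the extra terms cannot have a larger residual sum of squares
int.rss <= me.rss

# The price of the extra terms: fewer residual degrees of freedom
wb.me.aov$df.residual
wb.int.aov$df.residual
```

Because these models are fit by least squares, the guarantee applies to the residual sum of squares. The mean absolute difference usually shrinks too, but that is not guaranteed. The drop in df.residual shows that the closer fit was bought with extra parameters, which is why a closer fit alone says little about whether the interaction is reliable.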