This page was last updated on September 11, 2019.


Getting started

Load the required packages

Load the required tigerstats package:
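
A minimal sketch of this step follows. The install check is an addition for convenience, not part of the tutorial, and it assumes the package is available from your configured CRAN mirror:

```r
# Install tigerstats once if it is not already available
# (assumption: a CRAN mirror is reachable from this session)
if (!requireNamespace("tigerstats", quietly = TRUE)) {
  install.packages("tigerstats")
}

# Attach the package; this also attaches mosaic, which supplies favstats()
library(tigerstats)
```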


Describing a numerical variable grouped by categories

When we visualize how a numerical response variable is associated with a categorical explanatory variable, it is advisable to also calculate descriptive statistics for the response variable grouped by category. For instance, what are the mean and standard deviation of height for each sex?

We can do this using the favstats function:

##   sex   min     Q1 median  Q3   max     mean       sd  n missing
## 1   f 150.0 161.25  165.0 170 186.0 165.8932 6.704743 90       0
## 2   m 166.4 176.50  180.3 183 210.8 180.5184 7.327292 64       0

TIP: Note the syntax, with the response variable on the left of the ~ symbol, and the explanatory (categorical) variable on the right.
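
The formula syntax described in the tip can be sketched as follows. The data frame `students` and its values are hypothetical stand-ins, since the tutorial's own data set is not shown here; only the column names `height` and `sex` follow the output above:

```r
library(tigerstats)  # attaches mosaic, which supplies favstats()

# Hypothetical stand-in data (not the tutorial's data set)
students <- data.frame(
  height = c(160, 165, 170, 175, 180, 185),
  sex    = factor(c("f", "f", "f", "m", "m", "m"))
)

# Response variable on the left of ~, categorical explanatory variable on the right
height_stats <- favstats(height ~ sex, data = students)
height_stats
```

The result is a data frame with one row per category, so individual columns such as `height_stats$mean` can be pulled out for reporting.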

The favstats function provides all of the descriptive statistics you’d typically want to report, and often more than you need. If you want only one statistic, you can use the other functions we learned, such as mean, with the same formula syntax:

##        f        m 
## 165.8932 180.5184
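
A call of this kind might look like the sketch below. It again uses a hypothetical `students` data frame rather than the tutorial's data, and relies on the formula-aware versions of `mean` and `sd` that become available once tigerstats (and with it mosaic) is loaded:

```r
library(tigerstats)  # attaches mosaic, which adds formula support to mean() and sd()

# Hypothetical stand-in data (not the tutorial's data set)
students <- data.frame(
  height = c(160, 165, 170, 175, 180, 185),
  sex    = factor(c("f", "f", "f", "m", "m", "m"))
)

# One value per category, returned as a named numeric vector
group_means <- mean(height ~ sex, data = students)
group_sds   <- sd(height ~ sex, data = students)

group_means
group_sds
```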

List of functions (and the source packages) used in this tutorial

Getting started:

  • read.csv
  • url
  • library

Descriptive stats:

  • favstats
  • mean