In each chapter, the last section contains some questions for you to complete to make sure you understood the material. You can download the code to answer questions 1.1 to 1.5 below at http://www.math.montana.edu/courses/s217/documents/Ch1.Rmd. But to practice learning R, it would be most useful for you to try to accomplish the requested tasks yourself and then only refer to the provided R code if/when you struggle. These questions provide a great venue to check your learning, often to see the methods applied to another data set, and for something to discuss in study groups, with your instructor, and at the Math Learning Center.
1.1. Read in the treadmill data set
discussed previously and find the mean and SD of the Ages (Age
variable) and Body
Weights (BodyWeight
variable). In studies involving human subjects, it is
common to report a
summary of characteristics of the subjects. Why does this matter? Think about
how your interpretation of any study of the fitness of subjects would change if
the mean age (same spread) had been 20 years older or 35 years younger.
1.2. How does knowing about the distribution of results for Age and BodyWeight help you understand the results for the Run Times discussed previously?
1.3. The mean and SD are most useful as summary statistics only if the distribution is relatively symmetric. Make a histogram of Age responses and discuss the shape of the distribution (is it skewed right, skewed left, approximately symmetric?; are there outliers?). Approximately what range of ages does this study pertain to?
1.4. The weight responses are in
kilograms and you might prefer to see them in pounds. The conversion is
lbs=2.205*
kgs. Create a new variable in the treadmill
tibble called BWlb using this code:
and find the mean and SD of the new variable (BWlb).
1.5. Make histograms and boxplots of the original BodyWeight and new BWlb variables. Discuss aspects of the distributions that changed and those that remained the same with the transformation from kilograms to pounds. What does this tell you about changing the units of a variable in terms of its distribution?