This page was last updated on October 04, 2019.

It is continually being updated, so be sure to refresh the page in your browser each time you visit!

You should download copies of these helpful cheatsheets and have them on hand:

These cheat sheets deal with packages that will increasingly be used in the tutorials:


Starter Tutorials

These tutorials teach you the fundamentals of R and R Markdown, and should be completed prior to attempting any subsequent tutorials.

Introduction to R & RStudio

  • What is R and RStudio?
  • Installing R and RStudio
  • How do I code in R?
  • What are “packages”?
  • Additional resources for learning R and RStudio

Reproducible R with R Markdown (Updated Friday Sept 7, 3pm: clarified instructions on creating new projects in RStudio)

  • Creating a reproducible lab report
  • What is R Markdown?
  • Workflow: create a project and R Markdown document
  • More R Markdown information

Importing data into R (work in progress)


Tutorial 00: Preparing and formatting assignments for submission

This tutorial provides instructions on how to prepare your assignments for submission.


Tutorial 01: Visualizing and describing a single variable

Before conducting any analyses, it is crucial to visualize your data with effective graphs. This can help identify potential problems with the data, including data entry errors or otherwise unusual observations to be flagged. Good quality graphs can help you describe your data effectively to your audience, and are thus crucial to effective science communication.

  • Background
    • With a single variable, we describe a frequency distribution
    • What kind of data? Categorical or Numeric?
  • Getting Started
    • Import the data
    • Load the required packages
    • Get an overview of the data
  • Frequency distributions - what are they?

  • Visualizing and describing a categorical variable
    • Creating a frequency table for one categorical variable
    • Sorting your frequency table
    • Creating a bar chart
    • Calculating descriptive statistics for one categorical variable
  • Visualizing and describing a numeric variable
    • Creating a histogram
    • Interpreting and describing histograms
    • Calculating descriptive statistics for one numeric variable
  • List of functions


Tutorial 02: Visualizing associations between two variables

  • Background
    • With two variables, we are typically interested in their association
    • What kind of data? Both variables categorical? Numeric? One of each?
  • Getting Started
    • Import the data
    • Load the required packages
    • Get an overview of the data
  • Visualizing association between two categorical variables
    • Construct a contingency table
    • Construct a grouped bar chart
    • Construct a mosaic plot
    • Interpreting grouped bar charts and mosaic plots
  • Visualizing association between two numeric variables
    • Creating a scatterplot
    • Interpreting and describing a scatterplot
  • Visualizing association between one numeric and one categorical variable
    • Construct a stripchart
    • Construct a boxplot
    • Interpreting stripcharts and boxplots
  • List of functions

Tutorial 03: Calculating descriptive statistics for a numeric variable grouped by a categorical variable

  • Getting Started
    • Import the data
    • Load the required packages
  • Describe a numeric variable grouped by categories

Tutorial 04: Sampling, Estimation, and Uncertainty

To draw reliable inferences about properties of a population of interest (e.g. What is the average height of trees in the city of Kelowna?), one requires an unbiased, representative sample to work with.

In this tutorial you will learn how to:

  • simulate random sampling from a population, to explore concepts such as sampling error
  • quantify uncertainty around parameter estimates, using the standard error of the mean and confidence intervals
  • create and visualize sampling distributions of the mean
  • generate a vector of random numbers drawn from a normal distribution
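The ideas listed above can be sketched in a few lines of base R. This is only an illustration, not the tutorial's own code: the population (10,000 tree heights with mean 12 m and SD 3 m), the sample size, and all variable names are made up, and the tutorial may use different packages or functions.

```r
# Hypothetical population: 10,000 tree heights (m), drawn from a normal
# distribution with mean 12 and SD 3 (values invented for illustration).
set.seed(42)
population <- rnorm(10000, mean = 12, sd = 3)

# Draw one random sample of n = 25 trees and estimate the mean height.
n <- 25
one_sample <- sample(population, size = n)
sample_mean <- mean(one_sample)

# Standard error of the mean: sample SD divided by sqrt(n).
sem <- sd(one_sample) / sqrt(n)

# 95% confidence interval based on the t distribution with n - 1 df.
t_crit <- qt(0.975, df = n - 1)
ci <- c(lower = sample_mean - t_crit * sem,
        upper = sample_mean + t_crit * sem)

# Sampling distribution of the mean: repeat the sampling 1,000 times
# and plot the resulting sample means.
sample_means <- replicate(1000, mean(sample(population, size = n)))
hist(sample_means,
     main = "Sampling distribution of the mean",
     xlab = "Sample mean height (m)")
```

Each run of `sample()` gives a slightly different mean; that variation among samples is the sampling error the tutorial explores.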

Tutorial 05: Random trials

In this tutorial we use simulations to illustrate the concept of a “random trial”.

  • Getting Started
    • Load the required package
  • Simulate random trials
    • Rolling a 6-sided die
    • Flipping a coin
  • Tutorial activity
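As a rough sketch of what simulating random trials can look like, the following uses only the base R function `sample()`. The number of rolls and flips and the variable names are arbitrary choices for illustration; the tutorial may rely on a different package.

```r
set.seed(1)

# Roll a fair 6-sided die 600 times: each roll is one random trial.
die_rolls <- sample(1:6, size = 600, replace = TRUE)

# Relative frequency of each face; these approach 1/6 as the number
# of rolls grows.
roll_freq <- table(die_rolls) / length(die_rolls)
print(roll_freq)

# Flip a fair coin 100 times and compute the proportion of heads.
coin_flips <- sample(c("heads", "tails"), size = 100, replace = TRUE)
prop_heads <- mean(coin_flips == "heads")
print(prop_heads)
```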

Tutorial 06: Hypothesis testing

In this tutorial we learn how simulated data can be used to test hypotheses.

This tutorial also illustrates how to approach hypothesis testing, and how to prepare your answer to questions that involve hypothesis tests.

  • Getting Started
    • Load the required package
  • A hypothesis test example
    • Steps to hypothesis testing
    • Simulating data to generate a “null distribution”
    • Calculating the P-value
    • Writing a concluding statement
  • Tutorial activity
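The simulation approach outlined above can be sketched as follows. The data are hypothetical (14 "successes" out of 18 trials, tested against a null proportion of 0.5), and the code uses base R rather than whatever package the tutorial loads.

```r
set.seed(123)

# Hypothetical data: 14 successes observed in 18 independent trials.
n_trials <- 18
observed <- 14
null_prob <- 0.5                      # proportion stated by the null hypothesis
expected <- n_trials * null_prob      # expected count under the null (9)

# Null distribution: simulate 10,000 experiments in which the null
# hypothesis is true, recording the number of successes in each.
null_counts <- rbinom(10000, size = n_trials, prob = null_prob)

# Two-sided P-value: the fraction of simulated counts that are at least
# as far from the expected count as the observed count is.
p_value <- mean(abs(null_counts - expected) >= abs(observed - expected))
print(p_value)
```

Because the null distribution is simulated, the P-value changes slightly with the random seed and the number of simulated experiments.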

Tutorial 07: Estimating proportions

  • Getting Started
    • Load the required packages
    • Required data
  • The Sampling Distribution for a Proportion
    • A refresher
    • Simulating the sampling distribution for a proportion
    • Visualizing the sampling distribution for a proportion
  • Properties of the sampling distribution for a proportion

  • Calculating the Standard Error for a Proportion

  • Calculating a Confidence interval for a Proportion

  • Solution to Activity 2


Tutorial 08: Binomial distribution and binomial test

NOTE: The binomial test is a type of “Goodness of Fit” test, which we learn more about in the next tutorial. In this case, we’re examining a single categorical variable that has only 2 categories, and we’re interested in whether the frequencies of observations in the 2 categories fit our expectations based on the binomial distribution.

  • Getting Started
    • Load the required packages
    • Required data
  • The Binomial Distribution
    • Simulating a random trial
  • Binomial distribution functions in R

  • Binomial hypothesis test
    • Steps to hypothesis testing
    • Ideal concluding statement for a hypothesis test
  • Confidence interval approach to hypothesis testing

  • Solutions to Activities
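For orientation, here is a minimal sketch of a binomial probability and a binomial test in base R, using `dbinom()` and `binom.test()` from the built-in stats package. The counts (14 successes in 18 trials, null proportion 0.5) are invented for illustration and are not taken from the tutorial's data.

```r
# Hypothetical data: 14 successes in 18 trials; null proportion 0.5.
n_trials <- 18
successes <- 14

# Probability of exactly 14 successes under the binomial distribution.
prob_exactly_14 <- dbinom(successes, size = n_trials, prob = 0.5)

# Exact two-sided binomial test of H0: p = 0.5.
bt <- binom.test(x = successes, n = n_trials, p = 0.5)

print(prob_exactly_14)
print(bt$p.value)    # P-value of the test
print(bt$conf.int)   # 95% confidence interval for the proportion
```

The confidence interval in `bt$conf.int` supports the confidence interval approach to hypothesis testing: if it excludes the null proportion, the two approaches usually lead to the same conclusion.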


Tutorial 09: Goodness of fit tests

NOTE: Here we’re examining a single categorical variable that has more than 2 categories, and we’re interested in whether the frequencies of observations among the categories fit our expectations based on some model, such as the proportional model.

  • Getting Started
    • Install and load the required packages
    • Import the data
    • Presenting a nice table using R Markdown
    • Greek symbols in R Markdown
  • Refresher

  • A Goodness of Fit hypothesis test example
    • The research hypothesis
    • Steps to hypothesis testing
    • Stating the null and alternative statistical hypotheses
    • Visualizing the data
    • The \(\chi^2\) test statistic
    • The sampling distribution of the \(\chi^2\) test statistic
    • Calculating the expected proportions
    • Calculating the expected frequencies
    • Assumptions of the \(\chi^2\) GOF test
    • Finding the critical value of \(\chi^2\)
    • Conducting the \(\chi^2\) test
    • Concluding statement
  • List of functions

  • Solutions to Activities
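The main calculations in the outline above (expected frequencies, the critical value, and the test itself) can be sketched with base R's `chisq.test()` and `qchisq()`. The observed counts and the equal expected proportions below are hypothetical, chosen only to keep the arithmetic easy to follow.

```r
# Hypothetical observed frequencies for a variable with 4 categories.
observed <- c(A = 30, B = 20, C = 25, D = 25)

# Expected proportions under a proportional model in which all four
# categories are equally likely (an assumption made for this sketch).
expected_props <- rep(1 / 4, 4)

# Expected frequencies: total count times each expected proportion.
expected_freq <- sum(observed) * expected_props

# Critical value of chi-squared at alpha = 0.05 with (categories - 1) df.
crit_value <- qchisq(0.95, df = length(observed) - 1)

# Chi-squared goodness of fit test.
gof <- chisq.test(x = observed, p = expected_props)

print(expected_freq)
print(crit_value)
print(gof)
```

The test's assumptions concern the expected frequencies, which are also returned in `gof$expected`.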


Tutorial 10: Odds Ratio

NOTE: Here we’re examining associations between two categorical variables that each have 2 categories. The odds ratio method is typically used only when analyzing health-related data.

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • Visualize the data
    • Contingency table
    • Mosaic plot
  • Estimate the odds of an outcome

  • Estimate the odds ratio

  • List of functions
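As an illustration of the two estimation steps above, the odds and the odds ratio can be computed by hand from a 2 x 2 contingency table. The counts, group labels, and the Wald-type confidence interval below are assumptions made for this sketch; the tutorial may use a dedicated package function instead.

```r
# Hypothetical 2 x 2 table: rows are groups, columns are outcomes.
tab <- matrix(c(20, 80,
                10, 90),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("treatment", "control"),
                              outcome = c("yes", "no")))

# Odds of the outcome in each group: count with outcome / count without.
odds_treatment <- tab["treatment", "yes"] / tab["treatment", "no"]
odds_control <- tab["control", "yes"] / tab["control", "no"]

# Odds ratio: odds in the treatment group relative to the control group.
odds_ratio <- odds_treatment / odds_control

# Approximate (Wald) 95% confidence interval, computed on the log scale
# and then back-transformed.
se_log_or <- sqrt(sum(1 / tab))
ci <- exp(log(odds_ratio) + c(-1, 1) * qnorm(0.975) * se_log_or)

print(odds_ratio)
print(ci)
```

An odds ratio of 1 means the odds are the same in both groups, so a confidence interval that includes 1 is consistent with no association.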


Tutorial 11: Contingency Analysis

NOTE: Here we’re examining associations between two categorical variables.

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • Fisher’s Exact Test on a 2 x 2 table
    • Hypothesis statement
    • Display a Contingency table
    • Display a Mosaic plot
    • Conduct the Fisher’s Exact test
    • Concluding statement
  • \(\chi^2\) Contingency Test on an m x n table
    • Hypothesis statement
    • Display a Contingency table
    • Display a Mosaic plot
    • Check assumptions
    • Get results of the test
    • Concluding statement
  • List of functions

Tutorial 12: The normal distribution

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • The Gaussian (normal) distribution

  • Generating a normal distribution

  • The central limit theorem - a simulation

  • The standard normal distribution
    • Calculating Z-scores
    • Calculating probabilities with a normal distribution
    • Calculating probabilities of sample means
    • Calculating percentiles from a normal population
  • List of functions


Tutorial 13: Comparing one mean to a hypothesized value

NOTE: Here we’re examining a single numeric variable, and comparing its mean to some expectation.

NOTE: you must also consult the Checking assumptions and data transformations tutorial.

NOTE: A tutorial covering “non-parametric” tests, which are used when the assumptions of parametric tests (like the t-test) are not met, is in preparation, but will not be complete until late 2019. In the meantime, you can explore some non-parametric tests here.

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • The t distribution for sample means
    • Calculating probabilities from a t distribution
    • Finding critical values of t
    • Import the required data
  • One-sample t-test
    • Hypothesis statement
    • Visualize the data
    • Conduct the t-test
    • Concluding statement
  • Confidence intervals for an estimate of \(\mu\)

  • Activity: Practice confidence interval

  • List of functions


Tutorial 14: Comparing two means

NOTE: Here we’re examining a single numeric variable in relation to a single categorical variable that has only 2 groups.

NOTE: you must also consult the Checking assumptions and data transformations tutorial.

NOTE: A tutorial covering “non-parametric” tests, which are used when the assumptions of parametric tests (like the t-test) are not met, is in preparation, but will not be complete until late 2019. In the meantime, you can explore some non-parametric tests here.

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • Paired t-test
    • Data structure: long versus wide format
    • Hypothesis statements
    • Assumptions of the paired t-test
    • Visualize the data
    • Check assumptions
    • Conduct the test
    • Draw a conclusion
  • Two-sample t-test
    • Hypothesis statements
    • Assumptions of the two-sample t-test
    • Visualize the data
    • Check assumptions
    • Conduct the test
    • Draw a conclusion
  • When assumptions aren’t met

  • List of functions


Tutorial 15: Comparing means among more than two groups using ANOVA

NOTE: Here we’re examining a single numeric variable in relation to a single categorical variable that has more than 2 categories (groups).

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • ANOVA
    • Steps to hypothesis testing
    • Additional steps for ANOVA
    • Hypothesis statements
    • Visualize the data
    • Stripchart with error bars
    • Check assumptions
    • Table of descriptive statistics
    • Conduct the ANOVA test
    • Generate a one-way ANOVA table
    • Diversion: create a function
    • Concluding statement (Part 1)
    • Add coefficient of determination (\(R^2\)) to conclusion
    • Tukey-Kramer post-hoc test
    • Visualizing the result of the post-hoc test
    • Concluding statement (Part 2)
  • List of functions

If you wish, you can download the R Markdown file that generated the tutorial here. For it to work, you also need to download a “css” file linked here (called “tutorial.css”), and place it in a directory one down from your working directory.


Tutorial 16: Correlation analysis

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • Pearson correlation analysis
    • Hypothesis testing
    • Hypothesis statements
    • Visualize the data
    • Interpreting a scatterplot
    • Assumptions of correlation analysis
    • Conduct the correlation analysis
    • Draw a conclusion
  • Rank correlation analysis - Spearman’s correlation
    • Hypothesis statements
    • Visualize the data
    • Assumptions of rank correlation analysis
    • Checking assumptions
    • Conduct the test
    • Draw a conclusion
  • List of functions

Tutorial 17: Regression analysis

  • Getting Started
    • Install and load the required packages
    • Import the required data
    • Data exploration
  • Regression analysis
    • Hypothesis testing - not really!
    • Steps to conducting regression analysis
    • Plant biomass example
    • Visualize the data
    • Interpreting a scatterplot
    • Assumptions of regression analysis
    • Transform the data if required
    • Conduct the regression analysis
    • Confidence interval for the slope
    • Scatterplot with regression confidence bands
    • Concluding statement
  • Model I versus Model II regression

  • Making predictions
    • Back-transforming regression predictions
  • List of functions

  • RMD file


Markdown files for the tutorials

You can download all the RMD files for tutorials 0 through 17 and for the Extra tutorials here. It is a “zip” file that contains:

  • 23 RMD files
  • a sub-directory called “more” that includes various images and other things that the RMD files refer to
  • a file “tutorial.css” that is used by the RMD files to define the formatting

NOTE: the zipped file containing RMD files has not been updated since spring 2019. I will re-zip newer RMD files in October 2019.

To successfully knit these RMD files yourself, you must maintain the directory structure that is provided when you unzip the ZIP file. That is, the “more” subdirectory must be in the same directory as the RMD files, and the “tutorial.css” file must be located in the same directory as the RMD files.


Extra tutorials

Checking assumptions and data transformations

  • Getting Started
    • Install and load the required packages
    • Import the required data
  • Checking the normality assumption
    • Histograms and normal quantile plots
    • Shapiro-Wilk test for normality
  • Data transformations
    • log-transform
    • Dealing with zeroes
    • log bases
    • back-transforming log data
    • logit transform
    • back-transforming logit data
    • when to back-transform?
  • List of functions

Tables in R Markdown

  • Getting Started
    • Load the required packages
    • Import the data
    • Change the class of a variable
    • Visualize the data
    • Calculate descriptive statistics
  • Creating and formatting a table
  • Extra help

Symbols in R Markdown

  • Additional resources
  • Greek symbols
  • Math notation
  • Statistics notation
  • Miscellaneous
  • Cheatsheets

Importing data into R (work in progress)


Master list of functions by lab

Master function list