Most of this intro material was borrowed from an excellent online and open text by Chester Ismay and Albert Kim: ModernDive. Sincere thanks to them for making their material available.
Before we install and start using R and RStudio, there are some key concepts to understand first:
It is assumed that you are using R via RStudio. First-time users often confuse the two. At its simplest:
R: Engine | RStudio: Dashboard |
---|---|
More precisely, R is a programming language that runs computations, while RStudio is an integrated development environment (IDE) that provides an interface by adding many convenient features and tools. Just as having access to a speedometer, rearview mirrors, and a navigation system makes driving much easier, using RStudio’s interface makes using R much easier as well.
NOTE: R needs to be installed successfully prior to installing RStudio (because the latter depends on the former)
Figure out what operating system (and version) you have on your computer (e.g. Windows 10; Mac OS X 10.9 “Mavericks”)
Now to install RStudio: once you have installed “R”, go to this website and click on the green “download” button underneath the “RStudio desktop FREE” column. Then, click on the appropriate link under the “Installers for Supported Platforms” heading; this again depends on what platform / operating system you’re using. It will likely be one of the first two options.
Launch RStudio on your computer to make sure it’s working (it loads R for you in the background).
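One quick way to confirm that RStudio has found your R installation is to type a command at the prompt in RStudio's Console. The lines below are a minimal sketch of such a check; the object name r_version is only an illustrative choice.

```r
# R.version.string is a built-in value describing the R installation in use
r_version <- R.version.string

# Printing it shows which version of R RStudio is running in the background
print(r_version)
```

If a line beginning with "R" and a version appears, R and RStudio are talking to each other.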
Recall our car analogy from above. Much as we don’t drive a car by interacting directly with the engine but rather by using elements on the car’s dashboard, we won’t be using R directly; instead we will use RStudio’s interface. After you install R and RStudio on your computer, you’ll have two new programs (also known as applications) that you can open. We will always work in RStudio and not R. In other words:
R: Do not open this | RStudio: Open this |
---|---|
When you open RStudio, you should see the following:
Now that you’re set up with R and RStudio, you are probably asking yourself “OK. Now how do I use R?” The first thing to note is that, unlike other software such as Excel, which provides a point-and-click interface, R is an interpreted language, meaning you have to enter commands written in R code; that is, you have to program in R (the terms “coding” and “programming” are used interchangeably).
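To make "entering commands" concrete, here is a small sketch of the kind of lines you might type into the Console, pressing Enter after each one. The object names x and y are arbitrary examples, not anything required by R.

```r
# Arithmetic is evaluated as soon as you press Enter
2 + 3

# Store a value in an object using the assignment operator <-
x <- 10

# Use the stored object in a later command
y <- x * 2

# Print the result to the Console
print(y)
```

Each line is interpreted immediately, so you can build up an analysis one command at a time.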
While it is not required to be a seasoned coder/computer programmer to use R, there is still a set of basic programming concepts that R users need to understand. You will need to learn just enough of these basic programming concepts to explore and analyze data effectively.
It is important to note that while the following materials serve as excellent introductions, a single pass through them is insufficient for long-term learning and retention. The ultimate tools for long-term learning and retention are “learning by doing” and repetition.
This is the bare minimum you need to know before you get started; the rest you can learn as you go. Remember that your knowledge of all of these concepts will build as you get better and better at “speaking R” and getting used to its syntax.
Learning to code/program is very much like learning a foreign language: it can be very daunting and frustrating at first. However, just as with learning a foreign language, if you put in the effort and are not afraid to make mistakes, anybody can learn. Lastly, there are a few useful things to keep in mind as you learn to program:
An R package is a collection of functions, data, and documentation that extends the capabilities of R. Packages are written by a worldwide community of R users. For example, among the most popular packages are:

- ggplot2: package for data visualization
- dplyr: package for data wrangling (manipulation)

However, there are two key things to remember about R packages:
Installation: Most packages are not installed by default when you install R and RStudio. You need to install a package before you can use it. Once you’ve installed it, you likely don’t need to install it again unless you want to update it to a newer version of the package.
Loading: Packages are not loaded automatically when you open RStudio. You need to load them every time you open RStudio.
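The difference between "installed" and "loaded" can be seen directly in the Console. The sketch below uses the tools package as a stand-in, on the assumption that it ships with R but is not normally loaded (attached) when a session starts; the object names are illustrative only.

```r
# "tools" comes with every R installation, so it should be installed
is_installed <- "tools" %in% rownames(installed.packages())

# search() lists what is currently loaded (attached) in this session;
# loaded packages appear as "package:<name>"
loaded_before <- "package:tools" %in% search()

# Loading the package makes its functions available
library(tools)

loaded_after <- "package:tools" %in% search()

# Show the three checks side by side
print(c(installed = is_installed,
        loaded_before = loaded_before,
        loaded_after = loaded_after))
```

A package can be installed without being loaded, which is why the library() step is needed in every new session.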
There are two ways to install an R package. One is to type the install command in the Console pane; for example, to install the ggplot2 and dplyr packages:
install.packages("ggplot2")
install.packages("dplyr")
Note: When working on your own computer, you only have to install a package once, unless you want to update an already installed package to the latest version. If you want to update a package to the latest version, then re-install it by repeating the above steps.
HOWEVER: If you’re working on a school computer (in a lab), you may need to install packages each session. Packages that are listed in the “packages” pane when you start up RStudio are pre-installed, and therefore don’t need to be re-installed each time.
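On a shared lab computer it can help to check which packages are missing before installing anything. The following is a rough sketch, not an official procedure: find_missing is a hypothetical helper name, and the install line is left commented out because it needs an internet connection.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
find_missing <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages used in this guide
to_install <- find_missing(c("ggplot2", "dplyr"))

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install (needs internet)
} else {
  message("All packages are already installed.")
}
```

This avoids re-installing packages that already appear in the “packages” pane.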
After you’ve installed a package, you can now load it using the library() command. For example, to load the ggplot2 and dplyr packages, run the following code in the Console pane:
library(ggplot2)
library(dplyr)
Note: You have to reload each package you want to use every time you open a new session of RStudio. This is a little annoying to get used to and will be your most common error as you begin. When you see an error such as
Error: could not find function
remember that this likely comes from you trying to use a function in a package that has not been loaded. Remember to run the library() function with the appropriate package to fix this error.
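The error can be reproduced safely with a made-up function. In the sketch below, not_yet_loaded_function is a deliberately hypothetical name that does not exist anywhere, and tryCatch() is used only so that the error message is captured instead of stopping the code. The second half shows the fix using file_ext(), a real function from the tools package that ships with R.

```r
# Calling a function that R cannot find raises an error;
# tryCatch() captures the message so we can look at it
result <- tryCatch(
  not_yet_loaded_function(1),  # hypothetical function, for illustration only
  error = function(e) conditionMessage(e)
)
print(result)

# The fix: load the package that provides the function you need.
# file_ext() lives in the "tools" package.
library(tools)
file_ext("lab1.Rmd")
```

The same pattern applies to ggplot2 and dplyr: run library() for the package first, then call its functions.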
AND: If you’re working on a school computer (in a lab), you may need to install packages each session, then use the “library” command to load the package. Packages that are listed in the “packages” pane when you start up RStudio are pre-installed, and therefore don’t need to be re-installed each time.
TIP: You could include R chunks at the top of your R Markdown template that load commonly used packages.
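As one possible version of such a chunk, the code below could be placed inside an R chunk near the top of an R Markdown file. It is a sketch under assumptions: the package list is just the two packages used in this guide, and the check with requireNamespace() is an optional safeguard so the document reports a missing package rather than failing outright.

```r
# Packages to load at the top of the document (edit this list as needed)
pkgs <- c("ggplot2", "dplyr")

for (pkg in pkgs) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    # character.only = TRUE lets library() accept the name stored in pkg
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  } else {
    message("Package not installed: ", pkg)
  }
}
```

If you prefer, plain library(ggplot2) and library(dplyr) lines in the chunk do the same job when both packages are already installed.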
That was a short introduction to R and RStudio, but we will provide you with more functions and a more complete sense of the language as the course progresses.
Below we provide tables with various online resources, including ones that will help you troubleshoot.
If you are googling for R code, make sure to also include package names in your search query (if you are using a specific package). For example, instead of googling “scatterplot in R”, google “scatterplot in R with ggplot2”.
RStudio provides links to several cheatsheets that will come in handy throughout the semester.
You can get nice PDF versions of the files by going to Help -> Cheatsheets inside RStudio:
The book titled “Getting used to R, RStudio, and R Markdown” by Chester Ismay, which can be freely accessed here, is also a wonderful resource for new users of R, RStudio, and R Markdown. It includes examples of working with R Markdown files in RStudio, recorded as GIFs.
URL | Purpose |
---|---|
https://r-dir.com/learn/tutorials.html | List of useful tutorials including videos |
https://www.rstudio.com/online-learning/#R | Various online learning materials at RStudio |
http://qcbs.ca/wiki/r | Great workshop material including stats |
http://cyclismo.org/tutorial/R/index.html | Good Intro R workshop |
http://r4ds.had.co.nz/index.html | Hadley Wickham’s online book |
Table: Learning R - Tutorials and workshop materials
URL | Purpose |
---|---|
http://blog.revolutionanalytics.com/beginner-tips/ | How to find packages |
Table: So many packages - how to find useful ones?
URL | Purpose |
---|---|
https://www.zoology.ubc.ca/~schluter/R/ | UBC zoology site handy stats reference |
http://statmethods.net/ | Good general reference |
http://onlinelibrary.wiley.com/book/10.1002/9781444319620 | Great biostats / R reference (UBC online access) |
http://www.simonqueenborough.info/R/statistics/index.html | Simon Queenborough tutorials |
http://ww2.coastal.edu/kingw/statistics/R-tutorials/index.html | A variety of stats material |
http://stats.idre.ucla.edu/r/dae/ | R code for stats, with comparison to other software (e.g. SAS) |
URL | Purpose |
---|---|
http://statmethods.net/ | Good general reference for graphing |
http://ggplot2.org/book/ | Hadley Wickham’s ggplot2 book online |
http://stats.idre.ucla.edu/r/seminars/ggplot2_intro/ | Great new tutorial on ggplot2 |
http://www.cookbook-r.com/Graphs/ | Graphing with ggplot2 |
URL | Purpose |
---|---|
https://www.r-bloggers.com/ | Popular blog site, lots of info |
http://rseek.org/ | Search engine for R help |
https://blog.rstudio.org/ | Main RStudio blog |
URL | Purpose |
---|---|
https://cos.io/ | Center for Open Science |
https://ropensci.org/ | Open science R resources |
http://geoscripting-wur.github.io/RProjectManagement/ | Lessons about version control |
https://nicercode.github.io/ | Helping to improve your code |
http://www.geo.uzh.ch/microsite/reproducible_research/post/rr-rstudio-git/ | How to use Git with RStudio |
URL | Purpose |
---|---|
http://spatial.ly/r/ | Example maps with R |
https://rstudio.github.io/leaflet/ | Online mapping with R |
https://www.earthdatascience.org/courses/earth-analytics/ | Excellent course |
https://geoscripting-wur.github.io/ | Amazing “geoscripting” tutorial |
https://r-arcgis.github.io/ | Tools to link ArcGIS with R |
You may feel somewhat overwhelmed at this point, but don’t worry - that’s understandable! What you have learned so far will enable you to tackle the first few labs. Remember that the best way to learn R is to use it a lot!