Launch RStudio.
Notice the default panes:
Go into the Console, where we interact with the live R process.
Make an assignment and then inspect the object you just created.
x <- 3 * 4
x
## [1] 12
All R statements where you create objects (assignments) have this form:
objectName <- value
You will make lots of assignments and the operator <- is a pain to type. Don’t be lazy and use =, although it would work, because it will just sow confusion later. Instead, use RStudio’s keyboard shortcut: Alt/Option + - (the minus sign).
The shortcut surrounds <- with spaces, which demonstrates a useful code formatting practice. Code is miserable to read on a good day. Give your eyes a break and use spaces.
Object names cannot start with a digit and cannot contain certain other characters such as a comma or a space. You will be wise to adopt a convention for demarcating words in names.
i_use_snake_case
other.people.use.periods
evenOthersUseCamelCase
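One way to check whether a candidate name obeys these rules is to compare it with the output of base R's make.names(), which rewrites a string into a syntactically valid name. The helper is_valid_name() below is a hypothetical convenience written for this sketch, not a built-in function.

```r
# Hypothetical helper: a name is syntactically valid if make.names()
# leaves it unchanged.
is_valid_name <- function(name) {
  identical(name, make.names(name))
}

is_valid_name("i_use_snake_case")         # TRUE
is_valid_name("other.people.use.periods") # TRUE
is_valid_name("evenOthersUseCamelCase")   # TRUE
is_valid_name("2fast")                    # FALSE: starts with a digit
is_valid_name("my name")                  # FALSE: contains a space
is_valid_name("a,b")                      # FALSE: contains a comma
```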
Make another assignment:
this_is_a_really_long_name <- 2.5
To inspect this, try out RStudio’s completion facility: type the first few characters, press TAB, add characters until you disambiguate, then press return.
Make another assignment:
science_rules <- 2 ^ 3
Let’s try to inspect:
sciencerules
## Error in eval(expr, envir, enclos): object 'sciencerules' not found
sceince_rules
## Error in eval(expr, envir, enclos): object 'sceince_rules' not found
Implicit contract with the computer / scripting language: Computer will do tedious computation for you. In return, you will be completely precise in your instructions. Typos matter. Case matters. Get better at typing.
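As a small illustration of how literal R is about names, the sketch below uses base R's exists() to ask whether an object with a given name can be found, and tryCatch() to capture the error from a misspelled name instead of halting. The misspellings are the ones from above, plus one hypothetical capitalized variant.

```r
science_rules <- 2 ^ 3

exists("science_rules")   # TRUE: exact match
exists("sciencerules")    # FALSE: missing underscore
exists("Science_Rules")   # FALSE: case differs

# Capture the error message rather than stopping execution
msg <- tryCatch(
  sceince_rules,
  error = function(e) conditionMessage(e)
)
msg
```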
R has a mind-blowing collection of built-in functions that are accessed like so:
functionName(arg1 = val1, arg2 = val2, and so on)
Let’s try using seq(), which makes regular sequences of numbers and, while we’re at it, demo more helpful features of RStudio.
Type se in the console and hit TAB. A pop-up shows you possible completions. Specify seq() by typing more to disambiguate or using the up/down arrows to select. Notice the floating tool-tip-type help that pops up, reminding you of a function’s arguments. If you want even more help, press F1 as directed to get the full documentation in the help tab of the lower right pane. Now open the parentheses and notice the automatic addition of the closing parenthesis and the placement of the cursor in the middle. Type the arguments 1, 10 and hit return. RStudio also exits the parenthetical expression for you. IDEs are great.
seq(1, 10)
## [1] 1 2 3 4 5 6 7 8 9 10
The above also demonstrates something about how R resolves function arguments. You can always specify in name = value form. But if you do not, R attempts to resolve by position. So above, it is assumed that we want a sequence from = 1 that goes to = 10. Since we didn’t specify step size, the default value of by in the function definition is used, which ends up being 1 in this case. Be careful about relying on this feature too often. If you use a function where you’re specifying more than 3 arguments, you might want to use name = value to avoid any later confusion.
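A short sketch of positional versus named matching with seq(). The variable names are made up for illustration, and the step size of 3 is an arbitrary choice that is not part of the example above.

```r
# Matched by position: first argument is from, second is to
by_position <- seq(1, 10)

# Matched by name: same result
by_name <- seq(from = 1, to = 10)

# With names, the order in which arguments are written no longer matters
reordered <- seq(to = 10, from = 1)

identical(by_position, by_name)   # TRUE
identical(by_name, reordered)     # TRUE

# Overriding the default step size via the by argument
with_step <- seq(from = 1, to = 10, by = 3)
with_step
```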
Make this assignment and notice similar help with quotation marks.
yo <- "hello world"
If you just make an assignment, you don’t get to see the value, so then you’re tempted to immediately inspect.
y <- seq(1, 10)
y
## [1] 1 2 3 4 5 6 7 8 9 10
This common action can be shortened by surrounding the assignment with parentheses, which causes assignment and “print to screen” to happen.
(y <- seq(1, 10))
## [1] 1 2 3 4 5 6 7 8 9 10
Not all functions have (or require) arguments:
date()
## [1] "Tue Jan 3 12:56:45 2017"
Now look at your workspace – in the upper right pane. The workspace is where user-defined objects accumulate. You can also get a listing of these objects with commands:
objects()
## [1] "science_rules" "this_is_a_really_long_name"
## [3] "x" "y"
## [5] "yo"
ls()
## [1] "science_rules" "this_is_a_really_long_name"
## [3] "x" "y"
## [5] "yo"
If you want to remove the object named y, you can do this:
rm(y)
To remove everything:
rm(list = ls())
or click the broom in RStudio’s Environment pane.
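To experiment with these commands without wiping your real workspace, one option is to practice inside a separate environment. The sketch below assumes a throwaway environment called demo_env (a made-up name); ls() and rm() both accept an environment argument, so the same commands apply.

```r
# A scratch environment standing in for the workspace
demo_env <- new.env()
assign("y", seq(1, 10), envir = demo_env)
assign("yo", "hello world", envir = demo_env)

ls(demo_env)                               # lists "y" and "yo"

# Remove a single object by name
rm("y", envir = demo_env)
ls(demo_env)                               # only "yo" remains

# Remove everything that is left
rm(list = ls(demo_env), envir = demo_env)
length(ls(demo_env))                       # 0
```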
.R file (called an R script)
For more information on R Markdown, read the R for Data Science chapter and this short series of lessons on R Markdown.
Go to R for Data Science and read chapter 8 on workflows and R projects. Do not skip this chapter. It is brief, but contains lots of important information on:
If you don’t know what those terms refer to, go back and read the chapter. Right now.
Prove you understand projects by doing the following:
Create a new RStudio project called my_project. Store it wherever you like on your computer, but you should quickly adopt some sort of directory structure for all your class-related projects.
Create a new R script called diamonds.R. Add the following code to the script, save it, and run the script:
library(tidyverse)
ggplot(diamonds, aes(carat, price)) +
  geom_hex()
ggsave("diamonds.pdf")
write_csv(diamonds, "diamonds.csv")
The project folder should now contain the .Rproj file as well as a PDF image and a csv file containing the diamonds dataset.
Re-open RStudio by double-clicking on the .Rproj file. You should immediately see your previous working space containing the script and history. However, your environment should be completely clean.
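The reason the output files land in the project folder is that relative paths are resolved against the working directory, which RStudio sets to the project folder when you open the .Rproj file. The sketch below imitates that with base R only. The temporary directory and the file name cars.csv are hypothetical, and setwd() appears only to simulate what opening a project does for you.

```r
# Hypothetical stand-in for a project folder
project_dir <- file.path(tempdir(), "my_project")
dir.create(project_dir, showWarnings = FALSE)

# Simulate opening the project: make its folder the working directory
old_wd <- setwd(project_dir)

# A relative path: no folder is mentioned, so the file is written
# into the current working directory
write.csv(head(mtcars), "cars.csv", row.names = FALSE)

# Restore the previous working directory
setwd(old_wd)

# The file ended up inside the project folder
file_in_project <- file.exists(file.path(project_dir, "cars.csv"))
file_in_project
```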
R scripts are conventionally saved with a .R or .r suffix. Follow this convention unless you have some extraordinary reason not to.
Comments start with one or more # symbols. Use them. RStudio helps you (de)comment selected lines with Ctrl+Shift+C (Windows and Linux) or Command+Shift+C (Mac).
rm(list = ls()) will clear the workspace. When troubleshooting code, it is a good idea to do this, restart R (available from the Session menu), then re-run your analysis to truly check that the code you’re saving is complete and correct (or at least rule out obvious problems!).
.R files, will lead to the relevant script). You don’t know how to do this yet, but you will learn very soon.
.gitignore
By default, Git tracks all directories and files in your repository. Sometimes you may not want it to track everything. For instance, if you store a private API key or personally-identifiable data, you won’t want these files tracked by Git. If you did, when you push your repository to GitHub your private files will be shared with the world.
You could just store all of these files outside your repository, but that’s a pain and inconvenient. Instead, you can create a .gitignore file in your repository. This is a special file Git uses to determine what files it should ignore. Any file listed in .gitignore will not be tracked by Git.
When you create a new repository in GitHub (as opposed to forking an existing one), you have the option to add a template .gitignore file depending on what programming language you will use. For example, the default .gitignore file for R is
# History files
.Rhistory
.Rapp.history
# Session Data files
.RData
# Example code in package build process
*-Ex.R
# Output files from R CMD build
/*.tar.gz
# Output files from R CMD check
/*.Rcheck/
# RStudio files
.Rproj.user/
# produced vignettes
vignettes/*.html
vignettes/*.pdf
# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
# knitr and R markdown default cache directories
/*_cache/
/cache/
# Temporary files created by R markdown
*.utf8.md
*.knit.md
.Rproj.user
Most of these files are not sensitive, but are merely temporary work files that you don’t need to save and track using version control. You can specify files and directories by their full name, a partial name, or file extension. Starting with homework 2 I will always include a .gitignore in the repository, but for your own projects you will need to create these files as you find necessary.
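Since .gitignore is just a plain text file with one pattern per line, you can create it from R. A minimal sketch, assuming a throwaway folder in place of a real repository; the ignored file and directory names (api_key.txt, data-private/, *.log) are hypothetical examples of a full name, a directory, and a file extension.

```r
# Hypothetical stand-in for the root of a Git repository
repo_dir <- file.path(tempdir(), "demo_repo")
dir.create(repo_dir, showWarnings = FALSE)

ignore_rules <- c(
  "# Private credentials, matched by full file name",
  "api_key.txt",
  "# Everything inside one directory",
  "data-private/",
  "# All files with a given extension",
  "*.log"
)

# Write one rule per line to the .gitignore file
ignore_file <- file.path(repo_dir, ".gitignore")
writeLines(ignore_rules, ignore_file)

# Read it back to confirm the contents
readLines(ignore_file)
```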
Whenever you clone a homework repository, make sure to use the URL for the forked version, not the master repository. So for the first homework, I would use https://github.com/bensoltoff/hw01 when I clone the repo, not https://github.com/uc-cfss/hw01. If you use the master repo URL, you will get an error when you try to push your changes to GitHub.
For an example, let’s say I wanted to make a contribution to ggplot2. I should fork the repo and clone the fork. Instead I goofed and cloned the original repo. When I try to push my change, I get an error message:
remote: Permission to hadley/ggplot2.git denied to bensoltoff.
fatal: unable to access 'https://github.com/hadley/ggplot2.git/': The requested URL returned error: 403
I don’t have permission to edit the master repo on Hadley Wickham’s account.
How do I fix this? I could go back and clone the correct fork, but if I’ve already made several commits then I’ll lose all my work. Instead, I can change the upstream URL: this changes the location to which Git tries to push my changes. To do this:
Open the shell and navigate to your local repository (with the cd command).
List your existing remotes in order to get the name of the remote you want to change.
git remote -v
origin https://github.com/hadley/ggplot2.git (fetch)
origin https://github.com/hadley/ggplot2.git (push)
Change your remote’s URL to the fork with the git remote set-url command.
git remote set-url origin https://github.com/bensoltoff/ggplot2
Verify that the remote URL has changed.
git remote -v
origin https://github.com/bensoltoff/ggplot2 (fetch)
origin https://github.com/bensoltoff/ggplot2 (push)
Now I can push successfully to my fork, then submit a pull request.
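If you would rather rehearse these steps before touching a real repository, the sketch below replays them from R against an empty throwaway repository. It assumes Git is installed and on your PATH (the code does nothing otherwise), and that your Git version supports git remote get-url. The helper run_git() and the folder name are made up for this illustration.

```r
# Hypothetical throwaway repository used only for this rehearsal
repo_dir <- file.path(tempdir(), "remote_demo")
dir.create(repo_dir, showWarnings = FALSE)

original_url <- "https://github.com/hadley/ggplot2.git"
fork_url <- "https://github.com/bensoltoff/ggplot2"

# Hypothetical helper: run a git command inside repo_dir and
# return whatever it prints
run_git <- function(...) {
  system2("git", c("-C", shQuote(repo_dir), ...),
          stdout = TRUE, stderr = TRUE)
}

if (nzchar(Sys.which("git"))) {
  run_git("init")
  # Start in the "goofed" state: origin points at the original repo
  run_git("remote", "add", "origin", original_url)
  # The fix: point origin at the fork instead
  run_git("remote", "set-url", "origin", fork_url)
  # Verify the change
  current_url <- run_git("remote", "get-url", "origin")
  print(current_url)
}
```

Nothing is pushed or fetched here; the commands only edit the local repository's configuration.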
Make sure to use the proper program when entering the shell. For Mac users, that is Terminal. For Windows users, that is Git Bash: if you followed the setup instructions properly, you should have this program on your computer. Look for it under Start Menu > Git > Git Bash. If you try to use the Command Prompt, you will run into errors because it uses different commands than Git Bash.
devtools::session_info()
## Session info -------------------------------------------------------------
## setting value
## version R version 3.4.3 (2017-11-30)
## system x86_64, darwin15.6.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## tz America/Chicago
## date 2018-03-29
## Packages -----------------------------------------------------------------
## package * version date source
## assertthat 0.2.0 2017-04-11 CRAN (R 3.4.0)
## backports 1.1.2 2017-12-13 CRAN (R 3.4.3)
## base * 3.4.3 2017-12-07 local
## bindr 0.1.1 2018-03-13 CRAN (R 3.4.3)
## bindrcpp 0.2 2017-06-17 CRAN (R 3.4.0)
## broom 0.4.3 2017-11-20 CRAN (R 3.4.1)
## cellranger 1.1.0 2016-07-27 CRAN (R 3.4.0)
## cli 1.0.0 2017-11-05 CRAN (R 3.4.2)
## colorspace 1.3-2 2016-12-14 CRAN (R 3.4.0)
## compiler 3.4.3 2017-12-07 local
## crayon 1.3.4 2017-10-03 Github (gaborcsardi/crayon@b5221ab)
## datasets * 3.4.3 2017-12-07 local
## devtools 1.13.5 2018-02-18 CRAN (R 3.4.3)
## digest 0.6.15 2018-01-28 CRAN (R 3.4.3)
## dplyr * 0.7.4.9000 2017-10-03 Github (tidyverse/dplyr@1a0730a)
## evaluate 0.10.1 2017-06-24 CRAN (R 3.4.1)
## forcats * 0.3.0 2018-02-19 CRAN (R 3.4.3)
## foreign 0.8-69 2017-06-22 CRAN (R 3.4.3)
## ggplot2 * 2.2.1 2016-12-30 CRAN (R 3.4.0)
## glue 1.2.0 2017-10-29 CRAN (R 3.4.2)
## graphics * 3.4.3 2017-12-07 local
## grDevices * 3.4.3 2017-12-07 local
## grid 3.4.3 2017-12-07 local
## gtable 0.2.0 2016-02-26 CRAN (R 3.4.0)
## haven 1.1.1 2018-01-18 CRAN (R 3.4.3)
## hms 0.4.2 2018-03-10 CRAN (R 3.4.3)
## htmltools 0.3.6 2017-04-28 CRAN (R 3.4.0)
## httr 1.3.1 2017-08-20 CRAN (R 3.4.1)
## jsonlite 1.5 2017-06-01 CRAN (R 3.4.0)
## knitr 1.20 2018-02-20 CRAN (R 3.4.3)
## lattice 0.20-35 2017-03-25 CRAN (R 3.4.3)
## lazyeval 0.2.1 2017-10-29 CRAN (R 3.4.2)
## lubridate 1.7.3 2018-02-27 CRAN (R 3.4.3)
## magrittr 1.5 2014-11-22 CRAN (R 3.4.0)
## memoise 1.1.0 2017-04-21 CRAN (R 3.4.0)
## methods * 3.4.3 2017-12-07 local
## mnormt 1.5-5 2016-10-15 CRAN (R 3.4.0)
## modelr 0.1.1 2017-08-10 local
## munsell 0.4.3 2016-02-13 CRAN (R 3.4.0)
## nlme 3.1-131.1 2018-02-16 CRAN (R 3.4.3)
## parallel 3.4.3 2017-12-07 local
## pillar 1.2.1 2018-02-27 CRAN (R 3.4.3)
## pkgconfig 2.0.1 2017-03-21 CRAN (R 3.4.0)
## plyr 1.8.4 2016-06-08 CRAN (R 3.4.0)
## psych 1.7.8 2017-09-09 CRAN (R 3.4.1)
## purrr * 0.2.4 2017-10-18 CRAN (R 3.4.2)
## R6 2.2.2 2017-06-17 CRAN (R 3.4.0)
## Rcpp 0.12.15 2018-01-20 CRAN (R 3.4.3)
## readr * 1.1.1 2017-05-16 CRAN (R 3.4.0)
## readxl 1.0.0 2017-04-18 CRAN (R 3.4.0)
## reshape2 1.4.3 2017-12-11 CRAN (R 3.4.3)
## rlang 0.2.0 2018-02-20 cran (@0.2.0)
## rmarkdown 1.9 2018-03-01 CRAN (R 3.4.3)
## rprojroot 1.3-2 2018-01-03 CRAN (R 3.4.3)
## rstudioapi 0.7 2017-09-07 CRAN (R 3.4.1)
## rvest 0.3.2 2016-06-17 CRAN (R 3.4.0)
## scales 0.5.0 2017-08-24 cran (@0.5.0)
## stats * 3.4.3 2017-12-07 local
## stringi 1.1.7 2018-03-12 CRAN (R 3.4.3)
## stringr * 1.3.0 2018-02-19 CRAN (R 3.4.3)
## tibble * 1.4.2 2018-01-22 CRAN (R 3.4.3)
## tidyr * 0.8.0 2018-01-29 CRAN (R 3.4.3)
## tidyverse * 1.2.1 2017-11-14 CRAN (R 3.4.2)
## tools 3.4.3 2017-12-07 local
## utils * 3.4.3 2017-12-07 local
## withr 2.1.1 2017-12-19 CRAN (R 3.4.3)
## xml2 1.2.0 2018-01-24 CRAN (R 3.4.3)
## yaml 2.1.18 2018-03-08 CRAN (R 3.4.4)
This work is licensed under the CC BY-NC 4.0 Creative Commons License.