--- title: "STAT 302, Lecture Slides 1" subtitle: "Introduction and Getting Started with R" author: "Bryan Martin" date: "" output: xaringan::moon_reader: css: ["default", "metropolis", "metropolis-fonts", "my-theme.css"] lib_dir: libs nature: highlightStyle: tomorrow-night-bright highlightLines: true countIncrementalSlides: false titleSlideClass: ["center","top"] --- ```{r setup, include=FALSE, purl=FALSE} options(htmltools.dir.version = FALSE) knitr::opts_chunk$set(comment = "##") library(kableExtra) ``` # Outline 1. Course Overview 2. Introduction to R, RStudio, and R Markdown 3. Getting started with data .middler[**Goal:** Create a functional R Markdown document utilizing basic R functionality (Short Lab 1)] --- # Syllabus .middler[.large[ [Link to syllabus](https://bryandmartin.github.io/STAT302/syllabus.html) ]] --- # Expectations What you should expect from me... * your learning will be my priority * you will be treated like an adult and with respect * your feedback will be valued * timely feedback on assignments * understandable and well-paced lectures (tell me if they are not!) * an attempt at making statistical computation as fun for you as it is for me! --- # Expectations What I will expect from you... * regular attendance * timely assignments that represent your best work * a respectful and engaged classroom * a desire and effort to learn challenging material --- # Canvas Discussion * Worth up to 2% extra credit on your final grade! * Substantive and helpful questions and answers -- .pull-left[ ### Bad questions: * How do you do problem 2? * Here's my code and it's broken. How do I fix it? ] .pull-right[ ### Good questions: * Here's a snippet of code I used for problem 2: <br/>`formatted code snippet` <br/>It returned the following error: <br/>`formatted error message` <br/>Does anyone know why? I already tried... * I don't understand the concept from Slide 18 today. Could anyone elaborate on why... ] --- # Canvas Discussion * Worth up to 2% extra credit on your final grade! * Substantive and helpful questions and answers .pull-left[ ### Bad answers: * Here's my solution * Fool! You should already know the answer to this! Your trivial question is no match for my superior intellect! ] .pull-right[ ### Good answers: * This error message occurs because your variable is a string instead of a numeric. Have you tried checking... * I think you have a bug in line 3 of the code you posted. You have more left parentheses than right parentheses so the line is incomplete. ] --- # Why R? R is a programming language designed for statistical analysis. -- * open-source * free * large and active community of developers and users * great analysis tools * great visualization tools -- * great user interface... --- # Why RStudio? RStudio is an integrated development environment (IDE) designed to make your life easier. -- * Organizes scripts, files, plots, code console, ... * Highlights syntax * Helpful interactive graphical interface * Will make an efficient, reproducible workflow *much* easier -- * R Markdown integration... --- # Why R Markdown? * Interface between code, output, and writing * Self-contained analyses * Creates HTML, PDF, slides (like these!), webpages, ... -- * Required for your labs! --- class: inverse .middler[.huge[Part 1: Introduction to R Utilities]] --- # Operators ```{r} # Addition 6 + 3 ``` ```{r} # Subtraction 6 - 3 ``` ```{r} # Multiplication 6 * 3 ``` ```{r} # Division 6 / 3 ``` --- # Comparison Operators ```{r} # Greater than 6 > 3 ``` ```{r} # Less than 6 < 3 ``` ```{r} # Equal to 6 == 3 6 == 3 + 3 ``` --- # Comparison Operators ```{r} # Not equal to 6 != 3 ``` ```{r} 6 < 6 # Less than or equal to 6 <= 6 ``` --- # Logical Operators ```{r, eval = FALSE} # and (6 < 3) & (1 < 3) ``` -- ```{r, echo = FALSE} # and (6 < 3) & (1 < 3) ``` -- ```{r, eval = FALSE} # and (2 < 3) & (1 < 3) ``` -- ```{r, echo = FALSE} # and (2 < 3) & (1 < 3) ``` -- ```{r, eval = FALSE} # or (6 < 3) | (1 < 3) ``` -- ```{r, echo = FALSE} # or (6 < 3) | (1 < 3) ``` -- ```{r, eval = FALSE} # a bit harder... (6 < 3) | (1 < 3) & (6 < 3) ``` -- ```{r, echo = FALSE} # a bit harder... (6 < 3) | (1 < 3) & (6 < 3) ``` --- # Object Types ```{r} class(7) class("7") is.numeric(7) is.numeric("7") ``` --- # Object Types ```{r} is.character(7) is.character("7") is.na(7) is.na(0/0) ``` --- # Object Types ```{r, error = TRUE} as.character(7) as.numeric("7") as.numeric("7") + 3 == 10 "7" + 3 == 10 ``` --- # Assigning Variables ```{r} x <- 7 x x + 3 x == 7 as.character(x) y <- 3 x + y ``` --- # Workspaces ```{r, error = TRUE} # List all defined objects ls() # Remove an object rm("x") ls() x ``` --- # Workspaces ```{r} x <- 7 ls() # Use with caution! This erases everything! rm(list = ls()) ls() ``` --- layout:true # Commenting Code --- ## What is a comment? * Computers completely ignore comments (in R, any line preceded by `#`) * Comments do not impact the functionality of your code at all. -- ### So why do them... -- * Commenting a code allows you to write notes for readers of your code only * Usually, that reader is you! * Coding without comments is ill-advised, bordering on impossible -- * Sneak peak at functions... --- ```{r} #' Wald-type t test #' @param mod an object of class \code{bbdml} #' @return Matrix with wald test statistics and p-values. Univariate tests only. waldt <- function(mod) { # Covariance matrix covMat <- try(chol2inv(chol(hessian(mod))), silent = TRUE) if (class(covMat) == "try-error") { warning("Singular Hessian! Cannot calculate p-values in this setting.") np <- length(mod$param) se <- tvalue <- pvalue <- rep(NA, np) } else { # Standard errors se <- sqrt(diag(covMat)) # test statistic tvalue <- mod$param/se # P-value pvalue <- 2*stats::pt(-abs(tvalue), mod$df.residual) } # make table coef.table <- cbind(mod$param, se, tvalue, pvalue) dimnames(coef.table) <- list(names(mod$param), c("Estimate", "Std. Error", "t value", "Pr(>|t|)")) return(coef.table) } ``` --- ## Comment Style Guide * When starting out, you should comment most lines * Frequent use of comments should allow most comments to be restricted to one line for readability * A comment should go above its corresponding line, be indented equally with the next line, and use a single `#` to mark a comment * Use a string of `-` or `=` to break your code into easily noticeable chunks * Example: `# Data Manipulation -----------` * RStudio allows you to collapse chunks marked like this to help with clutter -- * There are exceptions to every rule! Usually, comments are to help **you**! --- ## Example of when I break my own rules * Here's a snippet of a long mathematical function I wrote (lots code emitted with ellipses for space). * In order to help myself read through it later, I divided the function into major steps marked by easily visible comments I can see when scanning through ```{r, eval = FALSE} objfun <- function(theta, W, M, X, X_star, np, npstar, link, phi.link) { ### STEP 1 - Negative Log-likelihood # extract matrix of betas (np x 1), first np entries b <- utils::head(theta, np) # extract matrix of beta stars (npstar x 1), last npstar entries b_star <- utils::tail(theta, npstar) ... ### STEP 2 - Gradient # define gam gam <- phi/(1 - phi) ``` --- ## A final plea * Being a successful programmer *requires* commenting your code * Want to understand code you wrote >24 hours ago without comments? -- .center[] .center[.small[I just learned you can add gifs to R Markdown slides. Expect a lot of these]] -- * If you still aren't convinced... -- * Clear commenting is required for this course --- layout:false class: inverse .middler[.huge[Part 2: Using RStudio and R Markdown]] --- # RStudio Interface By default... * *Top left*: Editor pane. Browse and edit scripts and data with tabs * *Top right*: List of objects in your Environment (recall `ls()`), code History * *Bottom left*: Console for running R code line-by-line (`>` prompt) * *Bottom right*: Files, plots, packages, help files --- # Editor * Your workflow should be contained here (**not** your console) * Primarily used for writing and editing .R scripts -- * Try opening a file now using *File > New File > R Script*, write two lines of simple code * Click `Run` in the bar above your script. What happens? * Click on one of the lines of code. Press `Ctrl`/`⌘` + `Enter`. What happens? -- .center[**Important:** Every part of your R workflow belongs in this window!] --- layout: true # Environment & History * If you didn't already, define a variable in your R Script and run it * What happens in your Environment tab? -- * Type `install.packages("palmerpenguins")` in your Console. * Now add `library(palmerpenguins)` and `data(penguins)` to your script and run it. * What happens if you click on this in your Environment tab? * Note: We will delve deeper into data later! -- * Remove one of your variables and see what happens. --- * Click on the History tab to see what it contains. Try searching! -- * Select a line from your history and click `To Source`. What happens? -- * Useful for adding lines that you tested in your Console to your scripts -- .pushdown[.center[**Summary:** Useful to quickly browse what you have defined in your environment]] --- layout: false layout: true # Console --- * The quick and easy way to run individual lines of code * Nothing you do here is saved as part of your workflow! -- * Useful for debugging, testing code, iterating a plot until you like it ... * Once you get what you were looking for, add it to your script files! * **Never** manipulate your data in the console. Your workflow should always be **reproducible!** --- ## Incomplete Code What if we start a command, but do not finish it? ```{r, eval = FALSE} > 5 - + ``` Two options: * Press `Esc` to exit and *not* execute the line * Complete the command --- layout: false # Files, Plots, Packages, Help * We will explore this tab more as we get into functions and visualization * Files is used to browse the files on your computer * Useful for opening files/data, moving files you are working with * *Use caution!* Changing files here is the same as changing them on your computer. If you delete something, it's gone! * Plots are used to display plots you create in R * Help is used to browse help files of functions. You can explore these by preceding a function name with `?`. Try `?sqrt` to see. * Packages shows all the packages you currently have installed (we will get more into this later!) --- class: inverse .middler[.huge[Brief Intermission: File Organization]] --- layout:true # File Names Matter --- .middler[] --- .pull-left[ ## Bad * `newfinal2actualFINALnew.docx` * `asdfasdf.R` * `analy$i$ functions!.R` * `stuff.R` * Cluttered * Uninformative * Spaces * Special characters other than `_` and `-` ] .pull-right[ ## Good * `stat302_lab1.Rmd` * `analysis_functions.R` * `analysisFunctions.R` * `2020-01-08_labWriteup.Rmd` * Meaningful * Concise * camelCase or using `_` to distinguish words * Machine sortable ] --- ## Summary * Machine readable * Human readable * Plays well with default ordering -- * `01_draft.Rmd`, `02_draft.Rmd` , ... , `11_draft.Rmd` * `2018-05-05_resume.docx`, `2019-02-17_resume.docx`, `2020-01-08_resume.docx` --- layout: false layout: true # File Organization Matters --- .middler[] --- Easier to start with best practice rather than fix things later! -- 1. Somewhere on your computer, create the folder `STAT302` 2. Within that folder, create the subfolders `short_labs`<sup>1</sup>, `labs`, `projects` 3. Within your Short Labs folder, create a subfolder `short_lab_1`<sup>2</sup> 4. Put your both of short lab files from Monday into that folder 5. Within your Labs folder, create a folder for Lab 1 that follows the filename guide .footnote[[1] or `shortLabs`, `ShortLabs`, `Short_Labs`, ... (just follow the rules for file names!) [2] or `shortLab1`, `short_lab1`, ... ] -- .pushdown[May seem excessive for now, but this will come in handy when labs start including extra files such as data and figures!] --- layout: false # All done! For now... .middler[] --- layout: true # R Markdown --- Let's try making an R Markdown file: 1. Choose *File > New File > R Markdown...* 2. Make sure *HTML Output* is selected and click OK 3. Save the file in your new folder, call it `stat302_Lab1.Rmd` * *Hint:* Follow along, because this will become your Lab 1 submission! 4. Click the *Knit HTML* button * After it is done, browse to the file location using the `Files` tab. What do you notice? * Click *Open in Browser* to view the full HTML --- ## R Markdown Headers The header of .Rmd files is YAML (YAML Ain't Markup Language) code 5. Change `title` to "Lab 1" 6. Change `author` to your name in quotes 7. Change `date` to the due date in quotes -- Congrats! You have a functional .Rmd that will soon be your Lab 1 submissions! --- ## R Markdown Syntax (Thanks to Charles Lanfear, UW Sociology, for this very concise summary) --- .pull-left[ ## Output **bold/strong emphasis** *italic/normal emphasis* .forcehead[Header] ## Subheader ### Subsubheader ] .pull-right[ ## Syntax <pre> **bold/strong emphasis** *italic/normal emphasis* # Header ## Subheader ### Subsubheader </pre> ] --- .pull-left[ ## Output 1. Ordered lists 1. Are real easy 1. Even with sublists 1. Or when lazy with numbering * Unordered lists * Are also real easy + Also even with sublists [URLs are trivial](http://www.uw.edu)  ] .pull-right[ ## Syntax <div style="width:400px;overflow:auto"> <pre> 1. Ordered lists 1. Are real easy 1. Even with sublists 1. Or when lazy with numbering * Unordered lists * Are also real easy + Also even with sublists [URLs are trivial](http://www.uw.edu)  </div> </pre> ] --- .pull-left[ ## Output You can put some math $y= \left( \frac{2}{3} \right)^2$ right up in there. $$\frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x}_n$$ Or a sentence with `code-looking font`. Or a block of code: ``` y <- 1:5 z <- y^2 ``` ] .pull-right[ ## Syntax <div style="width:400px;overflow:auto"> <pre> You can put some math $y= \left(\frac{2}{3} \right)^2$ right up in there $$\frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x}_n$$ Or a sentence with `code-looking font`. Or a block of code: ``` y <- 1:5 z <- y^2 ``` </pre> ] </div> --- ## Helpful Links * [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/) * [R Markdown Cheatsheet](https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf) --- ## R Code within R Markdown As you saw in Short Lab 1, we can run and execute R code within R Markdown. To do so encase your code as follows. `r ''````{r, eval = TRUE, echo = TRUE} # Your code goes here! ``` You can click the green triangle in the corner to evaluate that code chunk to preview the results without compiling the entire document --- ## Useful Code Chunk Parameters Parameters go into the opening brackets `{r}` and are separated by commas. Here are some you might find useful (checkout the guide links above for more): * `echo=FALSE`: Hide R code but keep results * `eval=FALSE`: Do not execute the R code * `include=FALSE`: Hides all output (useful to load packages at the beginning of your document) * `cache=TRUE`: Stores the results of the chunk, and only re-runs if the chunk is changed. Useful for files that take a while to compile * `fig.height=5, fig.width=5`: modify the dimensions of any plots that are generated in the chunk (units are in inches) --- ## In-Line R Code ```{r, echo=FALSE} library(knitr) ``` You can also include and execute R code directly in the text of your .Rmd! For example, say we define a variable ```{r} x <- 7 ``` If I want to reference this variable in text, I can do so directly by writing using ticks and starting with r. So if I type: The variable I want to reference is `r inline_expr("x", "md")`. what will appear is: The variable I want to reference is `r x`. --- ## In-Line R Code * This allows you to easily see where your values came from! * This prevents any typos in translating coding results to text! * This allows you to modify your analysis without needing to copy and paste updated results into your text! --- layout: false class: inverse .middler[.huge[Part 3: Data Types]] --- layout: true # Vectors --- * A **vector** is a set of values of the same type * We create vectors using the function `c()` ```{r} c(16, 3, 0, 7, -2) ``` * We can shorthand vectors counting up (or down) using `:` ```{r} 1:5 ``` --- * We can also generate vectors using functions such as `rep()` and `seq()` ```{r} # Sequence from 1 to 20, incrementing by 5 seq(1, 20, by = 5) # Repeat each element of a vector 3 times each rep(c(1, 2), each = 3) # Repeat an entire vector 3 times rep(c(1, 2), 3) ``` --- * We index vectors using `[index]` after the vector name ```{r} x <- 1:5 x[3] ``` * If we use a negative index, we return the vector with that element removed ```{r} x[-4] ``` --- ## Vector Arithmetic **Vectorization**, or applying functions across vectors/arrays, is one of R's most powerful capabilities ```{r} y <- -5:-1 y x + y x * y ``` --- Be careful! R **recycles**, repeating elements of shorter vectors to match longer vectors. This is incredibly useful when done on purpose, but can also easily lead to hard-to-catch bugs in your code! ```{r} 2 * x c(1, -1) * x c(1, -1) + x ``` --- We can apply many functions component-wise to vectors, including comparison operators. ```{r} x >= 3 y < -2 (x >= 3) & (y < -2) x == c(1, 3, 2, 4, 5) ``` --- ## Boolean Vectors In code, entries that are `TRUE` or `FALSE` are called **booleans**. These are incredibly important, because they can be used to give your computer conditions. What will the following code do? ```{r, eval = FALSE} x[x > 3] <- 3 x ``` -- ```{r, echo = FALSE} x[x > 3] <- 3 x ``` --- ## Boolean Vectors We can also do basic arithmetic with booleans. `TRUE` is encoded as `1` and `FALSE` is encoded as `0`. ```{r, eval = FALSE} # First reset x x <- 1:5 sum(x >= 3) ``` -- ```{r, echo = FALSE} # First reset x x <- 1:5 sum(x >= 3) ``` -- ```{r, eval = FALSE} mean(x >= 3) ``` -- ```{r, echo = FALSE} mean(x >= 3) ``` -- What is this last quantity telling us? -- By taking the mean, we are looking at the **proportion** of our vector that is `TRUE`! --- We can also get more complicated with our indexing. ```{r} # Return the second and third elements of y[c(2, 3)] # Return the values of x greater than 3 x[x >= 3] # Values of x that match the index of the values of y that are less than -2 x[y < -2] # which() returns the index of entries that are TRUE which(y < -2) ``` --- We can compare entire vectors using `identical()` ```{r} identical(x, -rev(y)) ``` What do you think the function `rev()` is doing in the code above? *Hint:* Use `?rev` to read the help files for the function --- ## Vector Data Types Note that vectors can only have one type of data. So we can do ```{r} c(1, 2, 3) c("a", "b", "c") ``` but when we try ```{r} c(1, "b", 3) ``` R will force the entries vector to be of the same type! This is a common source of bugs. --- ## Names We can assign names to the entries of our vectors using `names()`. This can be useful to label our data. Note that arithmetic doesn't change the names of our elements. ```{r} my_vec <- c(1, 2, 3) names(my_vec) <- c("a", "b", "c") my_vec my_vec + 1 ``` We can then access the names as their own vector by calling `names()` again. ```{r} names(my_vec) ``` --- ## Useful functions for vectors * `max()`, `min()`, `mean()`, `median()`, `sum()`, `sd()`, `var()` * `length()` returns the number of elements in the vector * `head()` and `tail()` return the beginning and end vectors * `sort()` will sort * `summary()` returns a 5-number summary * `any()` and `all()` to check conditions on Boolean vectors * `hist()` will return a crude histogram (we'll learn how to make this nicer later) You will need some of these for Lab 1! If you are unclear about what any of them do, use `?` before the function name to read the documentation. You should get in the habit of checking function documentation a lot! --- layout: false layout: true # Matrices --- * **Matrices** are two-dimensional extension of vectors, they have **rows** and **columns** * We can create a matrix using the function `matrix()` ```{r} my_matrix <- matrix(c(x, y), nrow = 2, ncol = 5, byrow = TRUE) my_matrix # Note: byrow = FALSE is the default my_matrix2 <- matrix(c(x, y), nrow = 2, ncol = 5) my_matrix2 ``` .center[*Warning:* be careful not to call your matrix `matrix`! Why not?] --- We can also generate matrices by column binding (`cbind()`) and row binding (`rbind()`) vectors ```{r} cbind(x, y) rbind(x, y) ``` --- ## Indexing and Subsetting Matrices Indexing a matrix is similar to indexing a vector, except we must index both the row and column, in that order. ```{r} my_matrix ``` ```{r, eval = FALSE} my_matrix[2, 3] ``` -- ```{r, echo = FALSE} my_matrix[2, 3] ``` -- ```{r, eval = FALSE} my_matrix[2, c(1, 3, 5)] ``` -- ```{r, echo = FALSE} my_matrix[2, c(1, 3, 5)] ``` --- Also similarly to vectors, we can subset using a negative index. ```{r} my_matrix my_matrix[-2, -4] # Note: Leaving an index blank includes all indices my_matrix[, -c(1, 3, 4, 5)] ``` --- ```{r} my_matrix[, -c(1, 3, 4, 5)] is.matrix(my_matrix[, -c(1, 3, 4, 5)]) ``` What happened here? When subsetting a matrix reduces one dimension to length 1, R automatically coerces it into a vector. We can prevent this by including `drop = FALSE`. ```{r} my_matrix[, -c(1, 3, 4, 5), drop = FALSE] is.matrix(my_matrix[, -c(1, 3, 4, 5), drop = FALSE]) ``` --- ## Filling in a Matrix We can fill in a matrix using indices. This is commonly done in statistical computing. In R, you should always start by initializing an empty matrix of the right size. ```{r} my_results <- matrix(NA, nrow = 3, ncol = 3) my_results ``` --- Then I can replace a single row (or column) using indices as follows. ```{r} my_results[2, ] <- c(2, 4, 3) my_results ``` We can also fill in multiple rows (or columns) at once. (Likewise, we can also do subsets of rows/columns, or unique entries). Note that **recycling** applies here. ```{r} my_results[c(1, 3), ] <- 7 my_results ``` --- ## Matrix Entry Types Matrices, like vectors, can only have entries of one type. ```{r} rbind(c(1, 2, 3), c("a", "b", "c")) ``` --- ## Functions on Matrices Let's create 3 matrices for the purposes of demonstrating matrix functions. ```{r} mat1 <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE) mat1 mat2 <- matrix(1:6, nrow = 3, ncol = 2) mat2 ``` --- ```{r} mat3 <- matrix(5:10, nrow = 2, ncol = 3, byrow = TRUE) mat3 ``` --- ### Matrix Sums `+` ```{r} mat1 + mat3 ``` ### Element-wise Matrix Multiplication `*` ```{r} mat1 * mat3 ``` --- ### Matrix Multiplication `%*%` ```{r} mat_square <- mat1 %*% mat2 mat_square ``` ### Column Bind Matrices `cbind()` ```{r} cbind(mat1, mat3) ``` --- ### Transpose `t()` ```{r} t(mat1) ``` ### Column Sums `colSums()` ```{r} colSums(mat1) ``` ### Row Sums `rowSums()` ```{r} rowSums(mat1) ``` --- ### Column Means `colMeans()` ```{r} colMeans(mat1) ``` ### Row Means `rowMeans()` ```{r} rowMeans(mat1) ``` ### Dimensions `dim()` ```{r} dim(mat1) ``` --- ### Determinent `det()` ```{r} det(mat_square) ``` ### Matrix Inverse `solve()` ```{r} solve(mat_square) ``` ### Matrix Diagonal `diag()` ```{r} diag(mat_square) ``` --- ## A note on `diag()` `diag()` can also be used to generate diagonal matrices by supplying a vector ```{r} diag(c(1, 2, 3)) ``` Supplying an integer will produce an identity matrix of that dimension ```{r} diag(3) ``` --- ## Names We can assign names to the rows and columns, using `rownames()` and `colnames()`, respectively. Similarly to `names()` for vectors, we then access them by calling the function again. ```{r} colnames(mat1) <- c("var1", "var2", "var3") rownames(mat1) <- c("sample1", "sample2") mat1 mat1 * 2 ``` --- ## Names We can assign names to the rows and columns, using `rownames()` and `colnames()`, respectively. Similarly to `names()` for vectors, we then access them by calling the function again. ```{r} rownames(mat1) colnames(mat1) ``` --- ## Tables in R Markdown It is easy to go from matrices to tables using R Markdown. There are several methods (check the cheatsheet link and Google for alternatives). I will present one easy method here, but what you use is up to you! ```{r, eval = FALSE} # We need to load the knitr and kableExtra package library(knitr) library(kableExtra) ``` ```{r} my_tab <- data.frame(mat1) kable_styling(kable(mat1)) ``` What happened with the row and column names? --- layout: false class: inverse .middler[.huge[Part 4: R Coding Style Guide]] --- layout: true # R Coding Style Guide --- ## Who are you to tell me how to type? We will be using a mix of the [Tidyverse Style Guide](https://style.tidyverse.org/) by Hadley Wickham and the [Google Style Guide](https://google.github.io/styleguide/Rguide.html). Please see the links for details, but I will summarize some main points here and throughout the class as we learn more functionality, such as functions and packages. You will be graded on following good code style! --- ## Object Names Use either underscores (`_`) or big camel case (`BigCamelCase`) to separate words within an object name. Do not use dots `.` to separate words in R functions! ```{r, eval = FALSE} # Good day_one day_1 DayOne # Bad dayone ``` --- ## Object Names Names should be concise, meaningful, and (generally) nouns. ```{r, eval = FALSE} # Good day_one # Bad first_day_of_the_month djm1 ``` --- ## Object Names It is *very* important that object names do not write over common functions! ```{r, eval = FALSE} # Very extra super bad c <- 7 t <- 23 T <- FALSE mean <- "something" ``` Note: `T` and `F` are R shorthand for `TRUE` and `FALSE`, respectively. In general, spell them out to be as clear as possible. --- ## Spacing Put a space after every comma, just like in English writing. ```{r, eval = FALSE} # Good x[, 1] # Bad x[,1] x[ ,1] x[ , 1] ``` Do not put spaces inside or outside parentheses for regular function calls. ```{r, eval = FALSE} # Good mean(x, na.rm = TRUE) # Bad mean (x, na.rm = TRUE) mean( x, na.rm = TRUE ) ``` --- ## Spacing with Operators Most of the time when you are doing math, conditionals, logicals, or assignment, your operators should be surrounded by spaces. (e.g. for `==`, `+`, `-`, `<-`, etc.) ```{r, eval = FALSE} # Good height <- (feet * 12) + inches mean(x, na.rm = 10) # Bad height<-feet*12+inches mean(x, na.rm=10) ``` There are some exceptions we will learn more about later, such as the power symbol `^`. See the [Tidyverse Style Guide](https://style.tidyverse.org/) for more details! --- ## Extra Spacing Adding extra spaces ok if it improves alignment of `=` or `<-`. ```{r, eval = FALSE} # Good list( total = a + b + c, mean = (a + b + c) / n ) # Also fine list( total = a + b + c, mean = (a + b + c) / n ) ``` --- ## Long Lines of Code Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font. If a function call is too long to fit on a single line, use one line each for the function name, each argument, and the closing `)`. This makes the code easier to read and to change later. ```{r, eval = FALSE} # Good do_something_very_complicated( something = "that", requires = many, arguments = "some of which may be long" ) # Bad do_something_very_complicated("that", requires, many, arguments, "some of which may be long" ) ``` *Tip! Try RStudio > Preferences > Code > Display > Show Margin with Margin column 80 to give yourself a visual cue!* --- ## Assignment We use `<-` instead of `=` for assignment. This is moderately controversial if you find yourself in the right (wrong?) communities. ```{r, eval = FALSE} # Good x <- 5 # Bad x = 5 ``` --- ## Semicolons In R, semi-colons (`;`) are used to execute pieces of R code on a single line. In general, this is bad practice and should be avoided. Also, you never need to end lines of code with semi-colons! ```{r, eval = FALSE} # Bad a <- 2; b <- 3 # Also bad a <- 2; b <- 3; # Good a <- 2 b <- 3 ``` --- ## Quotes and Strings Use `"`, not `'`, for quoting text. The only exception is when the text already contains double quotes and no single quotes. ```{r, eval = FALSE} # Bad 'Text' 'Text with "double" and \'single\' quotes' # Good "Text" 'Text with "quotes"' '<a href="http://style.tidyverse.org">A link</a>' ``` --- Phew! All done for now. Follow these rules and your code will be looking .middler[]