---
title: "STAT 302, Lecture Slides 1"
subtitle: "Introduction and Getting Started with R"
author: "Bryan Martin"
date: ""
output:
  xaringan::moon_reader:
    css: ["default", "metropolis", "metropolis-fonts", "my-theme.css"]
    lib_dir: libs
    nature:
      highlightStyle: tomorrow-night-bright
      highlightLines: true
      countIncrementalSlides: false
      titleSlideClass: ["center","top"]
---
```{r setup, include=FALSE, purl=FALSE}
options(htmltools.dir.version = FALSE)
knitr::opts_chunk$set(comment = "##")
library(kableExtra)
```
# Outline
1. Course Overview
2. Introduction to R, RStudio, and R Markdown
3. Getting started with data
.middler[**Goal:** Create a functional R Markdown document utilizing basic R functionality (Short Lab 1)]
---
# Syllabus
.middler[.large[
[Link to syllabus](https://bryandmartin.github.io/STAT302/syllabus.html)
]]
---
# Expectations
What you should expect from me...
* your learning will be my priority
* you will be treated like an adult and with respect
* your feedback will be valued
* timely feedback on assignments
* understandable and well-paced lectures (tell me if they are not!)
* an attempt at making statistical computation as fun for you as it is for me!
---
# Expectations
What I will expect from you...
* regular attendance
* timely assignments that represent your best work
* a respectful and engaged classroom
* a desire and effort to learn challenging material
---
# Canvas Discussion
* Worth up to 2% extra credit on your final grade!
* Substantive and helpful questions and answers
--
.pull-left[
### Bad questions:
* How do you do problem 2?
* Here's my code and it's broken. How do I fix it?
]
.pull-right[
### Good questions:
* Here's a snippet of code I used for problem 2:
`formatted code snippet`
It returned the following error:
`formatted error message`
Does anyone know why? I already tried...
* I don't understand the concept from Slide 18 today. Could anyone elaborate on why...
]
---
# Canvas Discussion
* Worth up to 2% extra credit on your final grade!
* Substantive and helpful questions and answers
.pull-left[
### Bad answers:
* Here's my solution
* Fool! You should already know the answer to this! Your trivial question is no match for my superior intellect!
]
.pull-right[
### Good answers:
* This error message occurs because your variable is a string instead of a numeric.
Have you tried checking...
* I think you have a bug in line 3 of the code you posted. You have more left parentheses than right parentheses so the line is incomplete.
]
---
# Why R?
R is a programming language designed for statistical analysis.
--
* open-source
* free
* large and active community of developers and users
* great analysis tools
* great visualization tools
--
* great user interface...
---
# Why RStudio?
RStudio is an integrated development environment (IDE) designed to make your life easier.
--
* Organizes scripts, files, plots, code console, ...
* Highlights syntax
* Helpful interactive graphical interface
* Will make an efficient, reproducible workflow *much* easier
--
* R Markdown integration...
---
# Why R Markdown?
* Interface between code, output, and writing
* Self-contained analyses
* Creates HTML, PDF, slides (like these!), webpages, ...
--
* Required for your labs!
---
class: inverse
.middler[.huge[Part 1: Introduction to R Utilities]]
---
# Operators
```{r}
# Addition
6 + 3
```
```{r}
# Subtraction
6 - 3
```
```{r}
# Multiplication
6 * 3
```
```{r}
# Division
6 / 3
```
---
# Comparison Operators
```{r}
# Greater than
6 > 3
```
```{r}
# Less than
6 < 3
```
```{r}
# Equal to
6 == 3
6 == 3 + 3
```
---
# Comparison Operators
```{r}
# Not equal to
6 != 3
```
```{r}
6 < 6
# Less than or equal to
6 <= 6
```
---
# Logical Operators
```{r, eval = FALSE}
# and
(6 < 3) & (1 < 3)
```
--
```{r, echo = FALSE}
# and
(6 < 3) & (1 < 3)
```
--
```{r, eval = FALSE}
# and
(2 < 3) & (1 < 3)
```
--
```{r, echo = FALSE}
# and
(2 < 3) & (1 < 3)
```
--
```{r, eval = FALSE}
# or
(6 < 3) | (1 < 3)
```
--
```{r, echo = FALSE}
# or
(6 < 3) | (1 < 3)
```
--
```{r, eval = FALSE}
# a bit harder...
(6 < 3) | (1 < 3) & (6 < 3)
```
--
```{r, echo = FALSE}
# a bit harder...
(6 < 3) | (1 < 3) & (6 < 3)
```
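---
# Logical Operators
Why is that last result `FALSE`? R evaluates `&` before `|`, much like multiplication comes before addition. A small sketch of the rule, with extra parentheses added purely for illustration:

```{r}
# These two lines are equivalent: & is evaluated first
(6 < 3) | (1 < 3) & (6 < 3)
(6 < 3) | ((1 < 3) & (6 < 3))
# Parentheses can change the answer
TRUE | TRUE & FALSE
(TRUE | TRUE) & FALSE
```

When in doubt, add parentheses to make the order explicit.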
---
# Object Types
```{r}
class(7)
class("7")
is.numeric(7)
is.numeric("7")
```
---
# Object Types
```{r}
is.character(7)
is.character("7")
is.na(7)
is.na(0/0)
```
---
# Object Types
```{r, error = TRUE}
as.character(7)
as.numeric("7")
as.numeric("7") + 3 == 10
"7" + 3 == 10
```
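---
# Object Types
Conversion with `as.numeric()` only works when the string actually looks like a number. A small sketch, using made-up example strings:

```{r}
# Works: the string contains a valid number
as.numeric("7.5")
# Does not work: R returns NA and prints a warning
as.numeric("seven")
# is.na() detects the failed conversion
is.na(suppressWarnings(as.numeric("seven")))
```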
---
# Assigning Variables
```{r}
x <- 7
x
x + 3
x == 7
as.character(x)
y <- 3
x + y
```
---
# Workspaces
```{r, error = TRUE}
# List all defined objects
ls()
# Remove an object
rm("x")
ls()
x
```
---
# Workspaces
```{r}
x <- 7
ls()
# Use with caution! This erases everything!
rm(list = ls())
ls()
```
---
layout:true
# Commenting Code
---
## What is a comment?
* Computers completely ignore comments (in R, everything on a line after a `#`)
* Comments do not impact the functionality of your code at all.
--
### So why write them...
--
* Commenting your code lets you write notes that only readers of your code will see
* Usually, that reader is you!
* Coding without comments is ill-advised, bordering on impossible
--
* Sneak peek at functions...
---
```{r}
#' Wald-type t test
#' @param mod an object of class \code{bbdml}
#' @return Matrix with wald test statistics and p-values. Univariate tests only.
waldt <- function(mod) {
  # Covariance matrix
  covMat <- try(chol2inv(chol(hessian(mod))), silent = TRUE)
  # inherits() is safe even when class() returns more than one value
  if (inherits(covMat, "try-error")) {
    warning("Singular Hessian! Cannot calculate p-values in this setting.")
    np <- length(mod$param)
    se <- tvalue <- pvalue <- rep(NA, np)
  } else {
    # Standard errors
    se <- sqrt(diag(covMat))
    # test statistic
    tvalue <- mod$param/se
    # P-value
    pvalue <- 2*stats::pt(-abs(tvalue), mod$df.residual)
  }
  # make table
  coef.table <- cbind(mod$param, se, tvalue, pvalue)
  dimnames(coef.table) <- list(names(mod$param),
                               c("Estimate", "Std. Error", "t value", "Pr(>|t|)"))
  return(coef.table)
}
```
---
## Comment Style Guide
* When starting out, you should comment most lines
* Frequent use of comments should allow most comments to be restricted to one line for readability
* A comment should go above its corresponding line, be indented equally with the next line, and start with a single `#`
* Use a string of `-` or `=` to break your code into easily noticeable chunks
    * Example: `# Data Manipulation -----------`
    * RStudio allows you to collapse chunks marked like this to help with clutter
--
* There are exceptions to every rule! Usually, comments are to help **you**!
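---
## Comment Style Guide
A short sketch of these guidelines in action. The `scores` data are made up for illustration:

```{r, eval = FALSE}
# Data Setup -----------
# Hypothetical exam scores, invented for this example
scores <- c(82, 91, 77)

# Summary -----------
# Average score across all exams
mean(scores)
```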
---
## Example of when I break my own rules
* Here's a snippet of a long mathematical function I wrote (lots of code omitted with ellipses for space).
* In order to help myself read through it later, I divided the function into major steps marked by easily visible comments I can see when scanning through
```{r, eval = FALSE}
objfun <- function(theta, W, M, X, X_star, np, npstar, link, phi.link) {
  ### STEP 1 - Negative Log-likelihood
  # extract matrix of betas (np x 1), first np entries
  b <- utils::head(theta, np)
  # extract matrix of beta stars (npstar x 1), last npstar entries
  b_star <- utils::tail(theta, npstar)
  ...
  ### STEP 2 - Gradient
  # define gam
  gam <- phi/(1 - phi)
```
---
## A final plea
* Being a successful programmer *requires* commenting your code
* Want to understand code you wrote >24 hours ago without comments?
--
.center[]
.center[.small[I just learned you can add gifs to R Markdown slides. Expect a lot of these]]
--
* If you still aren't convinced...
--
* Clear commenting is required for this course
---
layout:false
class: inverse
.middler[.huge[Part 2: Using RStudio and R Markdown]]
---
# RStudio Interface
By default...
* *Top left*: Editor pane. Browse and edit scripts and data with tabs
* *Top right*: List of objects in your Environment (recall `ls()`), code History
* *Bottom left*: Console for running R code line-by-line (`>` prompt)
* *Bottom right*: Files, plots, packages, help files
---
# Editor
* Your workflow should be contained here (**not** your console)
* Primarily used for writing and editing .R scripts
--
* Try opening a file now using *File > New File > R Script*, write two lines of simple code
* Click `Run` in the bar above your script. What happens?
* Click on one of the lines of code. Press `Ctrl`/`⌘` + `Enter`. What happens?
--
.center[**Important:** Every part of your R workflow belongs in this window!]
---
layout: true
# Environment & History
---
* If you didn't already, define a variable in your R Script and run it
* What happens in your Environment tab?
--
* Type `install.packages("palmerpenguins")` in your Console.
* Now add `library(palmerpenguins)` and `data(penguins)` to your script and run it.
* What happens if you click on `penguins` in your Environment tab?
* Note: We will delve deeper into data later!
--
* Remove one of your variables and see what happens.
---
* Click on the History tab to see what it contains. Try searching!
--
* Select a line from your history and click `To Source`. What happens?
--
* Useful for adding lines that you tested in your Console to your scripts
--
.pushdown[.center[**Summary:** Useful to quickly browse what you have defined in your environment]]
---
layout: false
layout: true
# Console
---
* The quick and easy way to run individual lines of code
* Nothing you do here is saved as part of your workflow!
--
* Useful for debugging, testing code, iterating a plot until you like it ...
* Once you get what you were looking for, add it to your script files!
* **Never** manipulate your data in the console.
Your workflow should always be **reproducible!**
---
## Incomplete Code
What if we start a command, but do not finish it?
```{r, eval = FALSE}
> 5 -
+
```
Two options:
* Press `Esc` to exit and *not* execute the line
* Complete the command
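---
## Incomplete Code
Completing the command is the second option above. A small sketch: R keeps reading until the expression is complete, so it treats these two lines as the single command `5 - 3`.

```{r}
# The first line ends with an operator, so R waits for the rest
5 -
  3
```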
---
layout: false
# Files, Plots, Packages, Help
* We will explore these tabs more as we get into functions and visualization
* Files is used to browse the files on your computer
    * Useful for opening files/data, moving files you are working with
    * *Use caution!* Changing files here is the same as changing them on your computer. If you delete something, it's gone!
* Plots displays the plots you create in R
* Help is used to browse help files of functions. You can explore these by preceding a function name with `?`. Try `?sqrt` to see.
* Packages shows all the packages you currently have installed (we will get more into this later!)
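---
# Files, Plots, Packages, Help
A quick sketch of two ways to ask for the same help file. Try these in your Console and watch the Help tab:

```{r, eval = FALSE}
# Open the help file for sqrt
?sqrt
# The same request written as a function call
help("sqrt")
# Then try the function itself
sqrt(16)
```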
---
class: inverse
.middler[.huge[Brief Intermission: File Organization]]
---
layout:true
# File Names Matter
---
.middler[]
---
.pull-left[
## Bad
* `newfinal2actualFINALnew.docx`
* `asdfasdf.R`
* `analy$i$ functions!.R`
* `stuff.R`
* Cluttered
* Uninformative
* Spaces
* Special characters other than `_` and `-`
]
.pull-right[
## Good
* `stat302_lab1.Rmd`
* `analysis_functions.R`
* `analysisFunctions.R`
* `2020-01-08_labWriteup.Rmd`
* Meaningful
* Concise
* camelCase or using `_` to distinguish words
* Machine sortable
]
---
## Summary
* Machine readable
* Human readable
* Plays well with default ordering
--
* `01_draft.Rmd`, `02_draft.Rmd`, ... , `11_draft.Rmd`
* `2018-05-05_resume.docx`, `2019-02-17_resume.docx`, `2020-01-08_resume.docx`
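---
## Summary
To see why zero-padding matters, here is a small sketch that sorts made-up file names character by character, the way a computer compares text. `method = "radix"` is used only so the ordering does not depend on your system's language settings:

```{r}
# Without zero-padding, 11 sorts before 2
sort(c("1_draft.Rmd", "2_draft.Rmd", "11_draft.Rmd"), method = "radix")
# With zero-padding, the order matches the numbering
sort(c("01_draft.Rmd", "02_draft.Rmd", "11_draft.Rmd"), method = "radix")
```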
---
layout: false
layout: true
# File Organization Matters
---
.middler[]
---
Easier to start with best practice rather than fix things later!
--
1. Somewhere on your computer, create the folder `STAT302`
2. Within that folder, create the subfolders `short_labs`<sup>1</sup>, `labs`, `projects`
3. Within your Short Labs folder, create a subfolder `short_lab_1`<sup>2</sup>
4. Put both of your short lab files from Monday into that folder
5. Within your Labs folder, create a folder for Lab 1 that follows the filename guide
.footnote[[1] or `shortLabs`, `ShortLabs`, `Short_Labs`, ... (just follow the rules for file names!)
[2] or `shortLab1`, `short_lab1`, ...
]
--
.pushdown[May seem excessive for now, but this will come in handy when labs start
including extra files such as data and figures!]
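---
The same folders can also be created from R instead of your file browser. A minimal sketch, assuming you want `STAT302` inside your current working directory (check it with `getwd()`):

```{r, eval = FALSE}
# Create the course folder and its three subfolders
# (paths are relative to your current working directory)
dir.create("STAT302")
dir.create(file.path("STAT302", "short_labs"))
dir.create(file.path("STAT302", "labs"))
dir.create(file.path("STAT302", "projects"))
# Create the folder for the first short lab
dir.create(file.path("STAT302", "short_labs", "short_lab_1"))
```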
---
layout: false
# All done! For now...
.middler[]
---
layout: true
# R Markdown
---
Let's try making an R Markdown file:
1. Choose *File > New File > R Markdown...*
2. Make sure *HTML Output* is selected and click OK
3. Save the file in your new folder, call it `stat302_Lab1.Rmd`
    * *Hint:* Follow along, because this will become your Lab 1 submission!
4. Click the *Knit HTML* button
    * After it is done, browse to the file location using the `Files` tab. What do you notice?
    * Click *Open in Browser* to view the full HTML
---
## R Markdown Headers
The header of .Rmd files is YAML (YAML Ain't Markup Language) code
5. Change `title` to "Lab 1"
6. Change `author` to your name in quotes
7. Change `date` to the due date in quotes
--
Congrats! You have a functional .Rmd that will soon be your Lab 1 submission!
---
## R Markdown Syntax
(Thanks to Charles Lanfear, UW Sociology, for this very concise summary)
---
.pull-left[
## Output
**bold/strong emphasis**

*italic/normal emphasis*

.forcehead[Header]
## Subheader
### Subsubheader
]
.pull-right[
## Syntax
<pre>
**bold/strong emphasis**

*italic/normal emphasis*

# Header

## Subheader

### Subsubheader
</pre>
]
---
.pull-left[
## Output
1. Ordered lists
1. Are real easy
    1. Even with sublists
    1. Or when lazy with numbering

* Unordered lists
* Are also real easy
    + Also even with sublists

[URLs are trivial](http://www.uw.edu)
]
.pull-right[
## Syntax
<pre>
1. Ordered lists
1. Are real easy
    1. Even with sublists
    1. Or when lazy with numbering

* Unordered lists
* Are also real easy
    + Also even with sublists

[URLs are trivial](http://www.uw.edu)
</pre>
]
---
.pull-left[
## Output
You can put some math $y= \left(\frac{2}{3}
\right)^2$ right up in there
$$\frac{1}{n} \sum_{i=1}^{n}
x_i = \bar{x}_n$$
Or a sentence with `code-looking font`.
Or a block of code:
```
y <- 1:5
z <- y^2
```
]