---
title: "STAT 302: Lecture 0"
subtitle: "Course Overview"
author: "Peter Gao (adapted from slides by Bryan Martin)"
date: ""
output:
xaringan::moon_reader:
css: ["default", "gao-theme.css", "gao-theme-fonts.css"]
nature:
highlightStyle: default
highlightLines: true
countIncrementalSlides: false
titleSlideClass: ["center"]
---
```{r setup, include=FALSE, purl=FALSE}
options(htmltools.dir.version = FALSE)
knitr::opts_chunk$set(comment = "##")
library(kableExtra)
```
# Outline
1. Course Overview
2. Introductions
3. What is R? What is statistical computing?
4. R Basics
5. Short Lab 0
---
# Syllabus
.middler[.large[
[Link to syllabus](https://peteragao.github.io/STAT302-AUT2021/syllabus.html)
]]
---
# Introductions
* What should we call you?
* Why are you taking this class?
* One goal for the school year
---
# Collaboration
* You may discuss problems, approaches, and solutions with your **classmates**.
* You must credit anyone with whom you worked on each assignment.
* All submitted work must be your own; you should not submit code or answers copied from any resource including your classmates.
---
# Piazza Discussion
* Worth up to 2% extra credit on your final grade!
* Substantive and helpful questions and answers
--
.pull-left[
### Bad questions:
* How do you do problem 2?
* Here's my code and it's broken. How do I fix it?
]
.pull-right[
### Good questions:
* Here's a snippet of code I used for problem 2:
`formatted code snippet`
It returned the following error:
`formatted error message`
Does anyone know why? I already tried...
* I don't understand the concept from Slide 18 today. Could anyone elaborate on why...
]
---
# Piazza Discussion
* Worth up to 2% extra credit on your final grade!
* Substantive and helpful questions and answers
.pull-left[
### Bad answers:
* this is sooooo easy, here's my solution
]
.pull-right[
### Good answers:
* This error message occurs because your variable is a string instead of a numeric.
Have you tried checking...
* I think you have a bug in line 3 of the code you posted. You have more left parentheses than right parentheses so the line is incomplete.
]
---
# What is statistical computing?
--
.middler[.large[statistics + computing]]
---
# What is statistical computing?
In this class we will discuss some basic computer science concepts, but we will emphasize skills for utilizing computers to aid in data analysis.
--
Advances in computation have enabled advances at every step of the data analysis pipeline:
* Data collection, storage, and sharing
* Exploratory data analysis and visualization
* Statistical inference and prediction
* Simulation
* Communication and distribution of results
---
# Why R?
R is a programming language designed for statistical analysis.
--
* open-source
* free
* large and active community of developers and users
* great analysis tools
* great visualization tools
--
* great user interface...
---
# Why RStudio?
RStudio is an integrated development environment (IDE) designed to make your life easier.
--
* Organizes scripts, files, plots, code console, ...
* Highlights syntax
* Helpful interactive graphical interface
* Will make an efficient, reproducible workflow *much* easier
--
* R Markdown integration...
---
# Why R Markdown?
* Combine code, output, and writing
* Self-contained analyses
* Creates HTML, PDF, slides (like these!), webpages, ...
--
* Required for your labs!
---
class: inverse
.middler[.huge[Part 1: Introduction to R Utilities]]
---
# Operators
```{r}
# Addition
6 + 3
```
```{r}
# Subtraction
6 - 3
```
```{r}
# Multiplication
6 * 3
```
```{r}
# Division
6 / 3
```
---
# Comparison Operators
```{r}
# Greater than
6 > 3
```
```{r}
# Less than
6 < 3
```
```{r}
# Equal to
6 == 3
6 == 3 + 3
```
---
# Comparison Operators
```{r}
# Not equal to
6 != 3
```
```{r}
6 < 6
# Less than or equal to
6 <= 6
```
---
# Logical Operators
```{r, eval = FALSE}
# and
(6 < 3) & (1 < 3)
```
--
```{r, echo = FALSE}
# and
(6 < 3) & (1 < 3)
```
--
```{r, eval = FALSE}
# and
(2 < 3) & (1 < 3)
```
--
```{r, echo = FALSE}
# and
(2 < 3) & (1 < 3)
```
--
```{r, eval = FALSE}
# or
(6 < 3) | (1 < 3)
```
--
```{r, echo = FALSE}
# or
(6 < 3) | (1 < 3)
```
--
```{r, eval = FALSE}
# a bit harder...
(6 < 3) | (1 < 3) & (6 < 3)
```
--
```{r, echo = FALSE}
# a bit harder...
(6 < 3) | (1 < 3) & (6 < 3)
```
---
# Object Types
```{r}
class(7)
class("7")
is.numeric(7)
is.numeric("7")
```
---
# Object Types
```{r}
is.character(7)
is.character("7")
is.na(7)
is.na(0/0)
```
---
# Object Types
```{r, error = TRUE}
as.character(7)
as.numeric("7")
as.numeric("7") + 3 == 10
"7" + 3 == 10
```
---
# Assigning Variables
```{r}
x <- 7
x
x + 3
x == 7
as.character(x)
y <- 3
x + y
```
---
# Workspaces
```{r, error = TRUE}
# List all defined objects
ls()
# Remove an object
rm("x")
ls()
x
```
---
# Workspaces
```{r}
x <- 7
ls()
# Use with caution! This erases everything!
rm(list = ls())
ls()
```
---
layout:false
class: inverse
.middler[.huge[Part 2: Using RStudio and R Markdown]]
---
# RStudio Interface
By default...
* *Top left*: Editor pane. Browse and edit scripts and data with tabs
* *Top right*: List of objects in your Environment (recall `ls()`), code History
* *Bottom left*: Console for running R code line-by-line (`>` prompt)
* *Bottom right*: Files, plots, packages, help files
---
# Editor
* Your workflow should be contained here (**not** your console)
* Primarily used for writing and editing .R scripts
--
* Try opening a file now using *File > New File > R Script*, write two lines of simple code
* Click `Run` in the bar above your script. What happens?
* Click on one of the lines of code. Press `Ctrl`/`⌘` + `Enter`. What happens?
--
.center[**Important:** Every part of your R workflow belongs in this window!]
---
layout: true
# Environment & History
* If you didn't already, define a variable in your R Script and run it
* What happens in your Environment tab?
--
* Type `install.packages("palmerpenguins")` in your Console.
* Now add `library(palmerpenguins)` and `data(penguins)` to your script and run it.
* What happens if you click on this in your Environment tab?
* Note: We will delve deeper into data later!
--
* Remove one of your variables and see what happens.
---
* Click on the History tab to see what it contains. Try searching!
--
* Select a line from your history and click `To Source`. What happens?
--
* Useful for adding lines that you tested in your Console to your scripts
--
.pushdown[.center[**Summary:** Useful to quickly browse what you have defined in your environment]]
---
layout: false
layout: true
# Console
---
* The quick and easy way to run individual lines of code
* Nothing you do here is saved as part of your workflow!
--
* Useful for debugging, testing code, iterating a plot until you like it ...
* Once you get what you were looking for, add it to your script files!
* **Never** manipulate your data in the console.
Your workflow should always be **reproducible!**
---
## Incomplete Code
What if we start a command, but do not finish it?
```{r, eval = FALSE}
> 5 -
+
```
Two options:
* Press `Esc` to exit and *not* execute the line
* Complete the command
---
layout: false
# Files, Plots, Packages, Help
* We will explore this tab more as we get into functions and visualization
* Files is used to browse the files on your computer
* Useful for opening files/data, moving files you are working with
* *Use caution!* Changing files here is the same as changing them on your computer. If you delete something, it's gone!
* Plots are used to display plots you create in R
* Help is used to browse help files of functions. You can explore these by preceding a function name with `?`. Try `?sqrt` to see.
* Packages shows all the packages you currently have installed (we will get more into this later!)
---
class: inverse
.middler[.huge[Brief Intermission: File Organization]]
---
layout:true
# File Names Matter
---
.pull-left[
## Bad
* `newfinal2actualFINALnew.docx`
* `asdfasdf.R`
* `analy$i$ functions!.R`
* `stuff.R`
* Cluttered
* Uninformative
* Spaces
* Special characters other than `_` and `-`
]
.pull-right[
## Good
* `stat302_lab1.Rmd`
* `analysis_functions.R`
* `analysisFunctions.R`
* `2020-01-08_labWriteup.Rmd`
* Meaningful
* Concise
* camelCase or using `_` to distinguish words
* Machine sortable
]
---
## Summary
* Machine readable
* Human readable
* Plays well with default ordering
--
* `01_draft.Rmd`, `02_draft.Rmd` , ... , `11_draft.Rmd`
* `2018-05-05_resume.docx`, `2019-02-17_resume.docx`, `2020-01-08_resume.docx`
---
layout: false
layout: true
# File Organization Matters
---
Easier to start with best practice rather than fix things later!
.middler[]
---
1. Somewhere on your computer, create the folder `STAT302`
2. Within that folder, create the subfolders `short_labs`1, `labs`, `projects`
3. Within your Short Labs folder, create a subfolder `short_lab_1`2
4. Put your both of short lab files from Monday into that folder
5. Within your Labs folder, create a folder for Lab 1 that follows the filename guide
.footnote[[1] or `shortLabs`, `ShortLabs`, `Short_Labs`, ... (just follow the rules for file names!)
[2] or `shortLab1`, `short_lab1`, ...
]
--
.pushdown[May seem excessive for now, but this will come in handy when labs start
including extra files such as data and figures!]
---
layout: false
# All done! For now...
.middler[]
---
layout: true
# R Markdown
---
Let's try making an R Markdown file:
1. Choose *File > New File > R Markdown...*
2. Make sure *HTML Output* is selected and click OK
3. Save the file in your new folder, call it `stat302_Lab1.Rmd`
* *Hint:* Follow along, because this will become your Lab 1 submission!
4. Click the *Knit HTML* button
* After it is done, browse to the file location using the `Files` tab. What do you notice?
* Click *Open in Browser* to view the full HTML
---
## R Markdown Headers
The header of .Rmd files is YAML (YAML Ain't Markup Language) code
5. Change `title` to "Lab 1"
6. Change `author` to your name in quotes
7. Change `date` to the due date in quotes
--
Congrats! You have a functional .Rmd that will soon be your Lab 1 submissions!
---
## R Markdown Syntax
(Thanks to Charles Lanfear, UW Sociology, for this very concise summary)
---
.pull-left[
## Output
**bold/strong emphasis**
*italic/normal emphasis*
.forcehead[Header]
## Subheader
### Subsubheader
]
.pull-right[
## Syntax
**bold/strong emphasis** *italic/normal emphasis* # Header ## Subheader ### Subsubheader] --- .pull-left[ ## Output 1. Ordered lists 1. Are real easy 1. Even with sublists 1. Or when lazy with numbering * Unordered lists * Are also real easy + Also even with sublists [URLs are trivial](http://www.uw.edu)  ] .pull-right[ ## Syntax
1. Ordered lists 1. Are real easy 1. Even with sublists 1. Or when lazy with numbering * Unordered lists * Are also real easy + Also even with sublists [URLs are trivial](http://www.uw.edu) 
You can put some math $y= \left(\frac{2}{3}
\right)^2$ right up in there
$$\frac{1}{n} \sum_{i=1}^{n}
x_i = \bar{x}_n$$
Or a sentence with `code-looking font`.
Or a block of code:
```
y <- 1:5
z <- y^2
```
]