---
title: "Week One"
subtitle: "Data strutures in R"
output:
powerpoint_presentation
---
```{r echo=FALSE}
library(tidyverse)
```
# Orientation
## Package stats for Bioconductor
```{r echo=TRUE}
read_tsv("https://bioconductor.org/packages/stats/bioc/bioc_pkg_stats.tab") %>%
filter(Year != 2020) %>%
select(Year, Nb_of_downloads) %>%
group_by(Year) %>%
summarise(sum = sum(Nb_of_downloads)) %>%
ggplot(mapping = aes(x = Year, y = sum)) +
geom_path() +
scale_x_continuous(breaks = 2009:2019) +
theme_classic(base_size = 18) +
labs(x = NULL, y = "Downloads", title = "Bioconductor package downloads over time")
```
## Slide with Plot
```{r pressure, fig.asp=0.618}
par(mar=c(4,4,0.1,0.1))
plot(pressure)
```
# Simple Data Structures
## Characters
Characters hold "string" data. For example:
```{r}
"Hello world"
```
## Numbers
Numbers can be integers or numerics (i.e., floats) in R. For example:
```{r}
1
2.2
3E100
```
## Logicals
Logicals are either TRUE or FALSE. For example:
```{r}
TRUE
FALSE
```
## Data Structure Classes
We can identify the "type" or "class" of an object in R with the `class` function:
```{r}
class(1)
class("Hello World!")
class(TRUE)
```
## Data Structure Classes cont.
We can also use the `str` function to see the "structure" of any R object:
```{r}
str(1)
str("Hello World!")
str(TRUE)
```
## Comments
An important tool for writing R code is comments. These are preceded by `#` and are ignored by R.
```{r}
# This code will say "Hello World!"
"Hello world!"
```
Comments are a helpful tool for conveying information about your code to others.
# Simple Data Structure Methods
## The print Method
The `print` method works on any data type in R:
```{r}
print(1)
print("Hello World!")
print(TRUE)
```
\* Notice that this is the same as simply typing the data into the R console and hitting \
## Arithmetic Methods
Arithmetic Methods for numeric data types:
```{r}
# Addition
1 + 1
# Subtraction
1 - 1
# Multiplication
2 * 2
# Division
10 / 2
```
See the cheat sheet for the full list.
## Equivalence Comparisons
Equivalence comparisons are a way to check if any two objects are the same.
```{r}
# Does 1 equal 1?
1 == 1
# Does "Hello" equal "World"?
"Hello" == "World"
```
## Equivalence Comparisons cont.
Equivalence comparisons can be inverted to check if two objects are not equal.
```{r}
# Does 1 not equal 1?
1 != 1
# Does "Hello" not equal "World"?
"Hello" != "World"
```
## Mathematical Comparisons
For numeric data types, mathematical comparisons can also be made.
```{r}
# Is 1 less than 100?
1 < 100
# Is 2 + 2 greater than 2 ^ 2?
2 + 2 > 2 ^ 2
# Is 2 + 2 greater than or equal to 2 ^ 2?
2 + 2 >= 2 ^ 2
```
## Variables
Variables hold objects which are assigned to them.
```{r}
a <- 1
a
```
## Variables cont.
Variables are identical to the object assigned to them.
```{r}
a <- 1
b <- a
# Does b equal 1?
b == 1
```
## Variables cont.
Variables enable complex operations on data.
```{r}
h <- 2 ^ 100
i <- h / 3E100
j <- 1E5
k <- j ^ (-1 * i)
k < 1
```
# Complex Data Structures
## Vectors
A vector is an ordered collection of either numerics, characters, or logicals.
```{r}
num_vec <- c(1, 2, 3)
num_vec
char_vec <- c("Hello", "World", "!")
char_vec
log_vec <- c(TRUE, FALSE, FALSE)
log_vec
```
## Vectors cont.
Vectors can also have a vector of names which describe each element.
```{r}
grades <- c(98, 95, 82)
names(grades) <- c("Jimmy", "Alice", "Susan")
grades
```
## Vectors cont.
Elements from a vector can be accessed using the index of the desired data.
```{r}
fruits <- c("apple", "banana", "orange")
fruits[2]
```
## Vectors cont.
Elements from a vector can be accessed using the name of the desired element.
```{r}
grades <- c(98, 95, 82)
names(grades) <- c("Jimmy", "Alice", "Susan")
grades["Alice"]
```
## Vectors cont.
Numeric shortcut for getting a vector of integers:
```{r}
my_ints <- 1:10
my_ints
```
## Lists
A list is an ordered collection of any objects.
```{r}
my_list <- list(1, "b", TRUE, c(1, 2, 3))
my_list
```
## Lists cont.
Lists can also have names.
```{r}
my_class <- list(c("Jimmy", "Alice", "Susan"),
c(98, 95, 82))
names(my_class) <- c("Students", "Grades")
my_class
```
## Lists cont.
Lists can be accessed using numeric indexes or by element name.
```{r}
my_class <- list(c("Jimmy", "Alice", "Susan"),
c(98, 95, 82))
names(my_class) <- c("Students", "Grades")
my_class[[1]]
my_class[["Grades"]]
```
## Data Frames
Data Frames are similar to excel sheets. They are 2D arrays which can hold numeric,
character, and boolean data. They also have column names.
```{r}
my_df <- data.frame(
"Students" = c("Jimmy", "Alice", "Susan"),
"Grades" = c(98, 95, 82)
)
my_df
```
## Data Frames cont.
Data Frames can be accessed numerically by specifying the row and column of interest.
```{r}
my_df <- data.frame(
"Students" = c("Jimmy", "Alice", "Susan"),
"Grades" = c(98, 95, 82)
)
# What grade did Susan get?
my_df[3, 2] # [row, column]
```
## Data Frames cont.
Data Frames can also be accessed by column name with the "$" sign.
```{r}
my_df <- data.frame(
"Students" = c("Jimmy", "Alice", "Susan"),
"Grades" = c(98, 95, 82)
)
# Access the Grades Column
my_df$Grades
# What grade did Alice get?
my_df$Grades[2]
```
# Complex Data Structure Methods
## Equivalence