# Worksheet 01A: Intro to R, R Markdown, and Reproducibility
**Version 2.0**


## Welcome to STAT545!

We hope that you are excited to become an R pro during the next few weeks! These in-class worksheets have been designed to help you navigate this R journey. We'll start easy, with some examples of R commands, and evolve to the more complex - and arguably cooler - syntax and structures of the R language.

## An important note

**Submission of this worksheet is optional**. Future worksheets **must** be submitted for participation marks. 

Lectures will mostly involve going through these worksheets and giving you the answers (yes, before the deadline). We suggest going through the worksheets before coming to class so that you can find out where you get stuck.

## Attributions

This document was primarily put together by Icíar Fernández Boyano. 

The following resources were used as inspiration in the creation of this worksheet:

+ [Swirl R Programming Tutorial](https://swirlstats.com/scn/rprog.html)
+ [A (very) short introduction to R](https://github.com/ClaudiaBrauer/A-very-short-introduction-to-R/blob/master/documents/A%20(very)%20short%20introduction%20to%20R.pdf)
+ [Happy Git and GitHub for the useR](https://happygitwithr.com/)
+ [2019 STAT545 Guidebook](https://stat545guidebook.netlify.app/index.html)
+ [Jenny Bryan's STAT545 Guidebook](https://stat545.com/)

## Getting Started

Load the required add-on packages for this assignment by running the following code chunk (or _cell_). In Jupyter, you can load the chunk by clicking on the chunk, and clicking the "Run" button (keyboard shortcut: Command + Enter on a Mac, or Control + Enter on Windows). _If this fails, read on..._

In [2]:
library(testthat)
library(digest)

Did that fail for you? It could be that you don't have those packages installed. The following code chunk has been unlocked for you (did you notice that you weren't able to edit the above cells?), so you can use it to install these packages, or to generally just give you the flexibility to start this worksheet with some of your own code. To install the "testthat" package, execute the command `install.packages("testthat")`; what would you need to type to install the "digest" package?

**Please be sure to remove any `install.packages` commands after you've run them**: once you've successfully _installed_ a package with `install.packages`, the package is permanently installed on your computer. 

To _load_ the packages for use in this R session, try executing the above code chunk again.

In [None]:
# An unlocked code chunk.

In Episode 01A of the [STAT 545 video series](https://www.youtube.com/channel/UCrB-uourf2vxGeBnGjQrA0w), RStudio was mentioned as being an IDE for R. You're probably viewing this worksheet in another IDE called **Jupyter**. We're using Jupyter for the STAT 545 worksheets because it works well with an autograder called nbgrader.

## Getting Familiar with R

### 1.1 Calculator

In its simplest form, R can be used as an interactive calculator.

In [6]:
10 + 4 # you can add 
10 - 4 # subtract
4 / 2 # divide
2 * 5 # multiply
3 ^ 4 # and exponentiate

Now, what if you need to compute a longer expression? Let's say that I want to find out the percentage of students in the STAT department that are taking STAT545A (note: these numbers are fictional!). I could compute this in several steps, or use a more complex expression. 

**Using multiple steps...**

+ To calculate the number of students in the STAT department, I add 375 new students that have enrolled this year, to the 2000 that were already enrolled.

In [9]:
2000 + 375

+ There are 82 students taking STAT545A this year. Last year, there was the same number of students, but 3 dropped the course after the first two weeks. Let's hypothesise that only 1 will drop the course this year - although I hope the real number is 0 :)

In [10]:
82 - 1

+ With the number of students taking STAT545A this year (hypothetically), and the number of students currently in the STAT department, I can now calculate what percentage of students in the STAT department are taking this class.

In [11]:
81 / 2375

In [12]:
0.03410526 * 100

**What if we use a single expression?**

It seems that around 3% of students in the STAT department are taking STAT545A... but that took *a lot* of steps to calculate. We could also write it like this to save some time:

In [13]:
(82 - 1) / (2000 + 375) * 100 

As you can see, *taking care of precedence rules* (i.e. using brackets appropriately), we can save some time by writing a single expression.
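
To see why brackets matter, here is a minimal sketch with made-up numbers (they are not part of the worksheet's example) showing how R's precedence rules change a result:

```r
# Multiplication and division are evaluated before addition and subtraction.
2 + 3 * 4      # 14
(2 + 3) * 4    # 20

# Exponentiation is evaluated before the minus sign in front of a number.
-2 ^ 2         # -4
(-2) ^ 2       # 4
```

When in doubt, adding brackets costs nothing and makes the intended order explicit.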

Your turn! Can you calculate the percentage of your life that you have spent in university? 

Compute the difference between 2020 and the year that you started university, and divide this by the difference between 2020 and the year that you were born. Multiply this by 100 to get the percentage of your life that you have spent in university. Your *challenge* here is to use a single expression.

In [14]:
# your code here

### 1.2 Variables
 
Alright, R as a calculator works just fine... but you don't learn a programming language *only* to compute arithmetic expressions. What if you want to use your result from above in a second calculation? Instead of retyping your expression every time that you need it, or copying and pasting the result, you can simply create a new variable that stores it. 

Earlier, I figured out that I had spent 18% of my life at university. I want to assign this value to a variable called `life_university`, which will help me remember what my value means. The way you assign a value to a variable in R is by using the assignment operator, which is just a "less than" symbol, followed by a minus sign. It looks like this:

In [15]:
life_university <- 18

Now the variable `life_university` stores the value 18, which is the percentage of time that I had spent at university. But prior to saving this into a variable, I had to calculate the value separately. What if I directly assigned the arithmetic expression that I used to compute my value to the variable?

In [16]:
life_university <- (2020 - 2016) / (2020 - 1998) * 100

Notice that R did not print the result of my expression this time. When you use the assignment operator, R assumes that you don't want to see the result immediately, but rather that you intend to use it for something else later on. 
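
As a rough sketch of why storing a result is useful, the example below reuses a stored value in later calculations. The variable names `years_at_university`, `age`, and `share` are hypothetical and chosen only for illustration:

```r
# Hypothetical values, not taken from the worksheet's graded answers.
years_at_university <- 4
age <- 22

# The assignment stores the result without printing it.
share <- years_at_university / age * 100

# Reuse the stored value instead of retyping the expression.
share / 100     # the same share, written as a proportion
100 - share     # percentage of life spent outside university
```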

To view the contents of the variable, you simply have to type the name of the variable - in this case, `life_university` and press Enter. Try it below!

In [1]:
# your code here

**QUESTION 1.0**

Now, it's your turn to store the percentage of time that **you** have spent at university into a variable - try typing the arithmetic expression that you used to compute that value, rather than the value itself! Name this variable `my_life_university` in the first cell below, and check whether the answer is acceptable by running the second cell below. If the test cell gives you an error, try a different answer!

```
my_life_university <- FILL_THIS_IN / FILL_THIS_IN * 100
```

In [18]:
### BEGIN SOLUTION
my_life_university <- 12 / 33 * 100  # Any number between 0 and 100.
### END SOLUTION

In [19]:
test_that("Question 1.0", {
  expect_gte(my_life_university, 0)
  expect_lte(my_life_university, 100)
})
print("success!")

[1] "success!"


### 1.3 Data structures

Any object that contains data is called a data structure.

### 1.3.1 Vectors

#### Numeric vectors

So far, you've learned how to use R as a calculator, and how to use variables to store numeric values. But in reality, a "variable" in R is just a way to name your data so that R can recall it later. Think of it as a label that you put on a box, so that you remember the contents that are inside it. 

The variable that you created above, `my_life_university`, stores the most basic data structure in the R programming language: a vector. Even a single number is considered a vector of length one, which is the case with the vector that was assigned to `my_life_university`. Let's have a look again:

In [20]:
my_life_university

In this way, you can think of the vector as the data structure, and the variable as a label. But what if you want a vector that's greater than length one, or in other words, that stores more than a single numeric value? The easiest way to create a vector is using `c()`, which stands for "concatenate", or "combine". 

**QUESTION 1.1**

Let's give it a try. To create a vector containing the numbers 3.14, 2.71, and 6.28, type `c(3.14, 2.71, 6.28)`. Store the result in a variable called `x`. 

```
x <- FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN, FILL_THIS_IN)
```

In [21]:
### BEGIN SOLUTION
x <- c(3.14, 2.71, 6.28)
### END SOLUTION

In [22]:
test_that("Question 1.1", {
  expect_equal(digest(x), "d696b13d28ab63409f1f528a2d37bb0e")
})
print("success!")

[1] "success!"


Now, type `x` and press Enter to view its contents. Notice that there are no commas separating the values in the output!

In [23]:
# your code here

You can combine several vectors to make a new vector. And here is where things get fun! For the sake of seeing the result immediately, we won't store this combined vector in a new variable for now.

In [24]:
c(10, 50)

And what's more: you can combine any numeric vectors together, regardless of whether they have already been assigned to a variable or not!

In [25]:
c(x, 50)

**QUESTION 1.2**

Your turn to give it a try. Create a new vector that contains `life_university`, `my_life_university`, and `25`. Store your result in a variable named `answer1.2`

```
answer1.2 <- FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN, FILL_THIS_IN)
```

In [26]:
### BEGIN SOLUTION
answer1.2 <- c(life_university, my_life_university, 25) 
### END SOLUTION

In [27]:
test_that("Question 1.2", {
  expect_identical(answer1.2[1L], life_university)
  expect_identical(answer1.2[2L], my_life_university)
  expect_equal(answer1.2[3L], 25)
})
print("success!")

[1] "success!"


One more cool thing before we go on: numeric vectors can be used in arithmetic expressions. Remember the vector that we created earlier and assigned to the variable `x`? Let's have a look at it again.

In [28]:
x

**QUESTION 1.3**

Here's a fun fact: those three numbers are actually pi, Euler's number, and tau. But that's a story for another course! Type the following to see what happens: `x * 2 + 100`... Actually, **wait!** What do **you** think will be the result of doing that?

1: a vector of length three
2: a single number (a vector of length 1)
3: a vector of length 0 (i.e. an empty vector)

Assign your answer (`1`, `2`, or `3`) to a variable named `answer1.3`.

In [29]:
#answer1.3 <- youranswer
### BEGIN SOLUTION
answer1.3 <- 1
### END SOLUTION

In [30]:
test_that("Answer check", {
  expect_identical(
    digest(as.integer(answer1.3)), 
    "4b5630ee914e848e8d07221556b0a2fb"
  )
})
print("success!")

[1] "success!"


Let's see what actually happens. Type `x * 2 + 100` and press Enter.

In [31]:
# your code here

First, R multiplied each of the three elements in `x` by 2. Then, it added 100 to each element to get the result that you see.
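
The same element-by-element behaviour can be sketched with a different vector. The names `prices` and `discounts` below are hypothetical and not part of the worksheet's graded answers:

```r
# Hypothetical vector, used only for illustration.
prices <- c(10, 20, 30)

# A single number is applied to every element of the vector.
prices * 2          # 20 40 60
prices * 2 + 100    # 120 140 160

# Two vectors of the same length are combined element by element.
discounts <- c(1, 2, 3)
prices - discounts  # 9 18 27
```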

#### Logical vectors

So far we have only dealt with **numeric** vectors. But there are other types of vectors in the R universe. Let's have a look.

**QUESTION 1.4**

Enough of university, let's talk about vacation! A group of friends are discussing the places that they visited in 2019, and trying to figure out how much total vacation time each of them took. Pablo says he took 54 days off to travel locally, Dana was on vacation for only 14 days, and Marianne went to the Caribbean for 30 days.

Create a vector that contains the values of Pablo, Dana, and Marianne's vacation days, respectively. Assign it to a variable named `vacation_time`.

```
vacation_time <- FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN, FILL_THIS_IN)
```

In [32]:
### BEGIN SOLUTION
vacation_time <- c(54, 14, 30)
### END SOLUTION

In [33]:
test_that("Answer check", {
  expect_identical(
    digest(as.integer(vacation_time)), 
    "8336872ae5cc234b1c1574e27d863ebb"
  )
})
print("success!")

[1] "success!"


**QUESTION 1.5**

Which person was on vacation for more than 21600 minutes? First, create a numeric vector that multiplies the `vacation_time` vector by 1440 (the number of minutes in a day), to find out what each person's vacation time is *in minutes*. Assign this to a variable named `vacation_time_minutes`.

```
vacation_time_minutes <- FILL_THIS_IN * FILL_THIS_IN
```

In [34]:
### BEGIN SOLUTION
vacation_time_minutes <- vacation_time * 1440
### END SOLUTION

In [35]:
test_that("Answer check", {
    expect_identical(
      digest(as.numeric(vacation_time_minutes)),
      "ce79c61a9b5bd2b5bf4b4def95455438"
    )
})
print("success!")

[1] "success!"


**QUESTION 1.6**

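Now, create a variable called `under_21600` that stores the result of `vacation_time_minutes > 21600`, which is read as 'vacation_time_minutes is more than 21600'. (Despite its name, this variable marks the people whose vacation was *longer* than 21600 minutes; keep the name as given so that the test cell can find it.)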

```
under_21600 <- FILL_THIS_IN > FILL_THIS_IN
```

In [36]:
### BEGIN SOLUTION
under_21600 <- vacation_time_minutes > 21600
### END SOLUTION

In [37]:
test_that("Answer check", {
    expect_identical(
      digest(under_21600),
      "4f00878a54c541bdbf07c006a9d412dc"
    )
})
print("success!")

[1] "success!"


Have a look at the output of `under_21600` by typing the name and pressing Enter.

In [38]:
# your code here

Congratulations! You've created your first **logical vector**. Logical vectors can contain the values `TRUE`, `FALSE`, and `NA` (for 'not available' - this happens when you have missing data!). These values are generated as the result of logical 'conditions'. We have seen the logical operator "greater than" in this activity, but there are [many more](https://www.statmethods.net/management/operators.html), such as "less than", "exactly equal to", or "not equal to". Don't worry, there will be plenty of time to use those in the future!
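
As a small preview of those other operators, here is a sketch using a hypothetical vector `ages` that includes one missing value:

```r
# Hypothetical vector of ages, including one missing value.
ages <- c(18, 25, NA, 40)

ages < 25     # TRUE FALSE NA FALSE   ("less than")
ages == 25    # FALSE TRUE NA FALSE   ("exactly equal to")
ages != 25    # TRUE FALSE NA TRUE    ("not equal to")

# Comparing with a missing value gives NA, because the answer is unknown.
# is.na() is the way to ask whether a value is missing.
is.na(ages)   # FALSE FALSE TRUE FALSE
```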

#### ...and more

There are other types of vectors out there in the R universe, such as character vectors. We won't get into the nitty gritty of these - logical and numeric are the most basic R vectors, the ones that you absolutely need to know & that we will use most often. However, we didn't want to leave you in the dark about these other types of vectors! If you really, really want to know more, you can read [more about vectors](https://r4ds.had.co.nz/vectors.html) here.

Anyway, here is a handy tip! If you ever come across a vector and you're not sure what it is, you can inspect its two key properties: type, and length. Here is an example of how you would do it. *"Double" is just a type of numeric vector.*

In [39]:
typeof(x)
length(x)
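
For comparison, here is a sketch of the same two checks on other kinds of vectors. The names `flags` and `cities` are hypothetical:

```r
# A logical vector.
flags <- c(TRUE, FALSE, NA)
typeof(flags)    # "logical"
length(flags)    # 3

# A character vector.
cities <- c("Vancouver", "Madrid")
typeof(cities)   # "character"
length(cities)   # 2

# Numbers typed without an L suffix are stored as doubles;
# the L suffix asks for an integer instead.
typeof(5)        # "double"
typeof(5L)       # "integer"
```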

### 1.3.2 Dataframes

Living in a vector-only world would be nice if all data analyses involved one variable. When we have more than one variable, data frames come to the rescue. Basically, a data frame holds data in tabular format. R has some data frames "built in". For example, motor car data is attached to the variable name `mtcars`.

Print `mtcars` to screen. In case it hasn't been mentioned before, "print" means to type the name of the object and press Enter, which is the same as surrounding the object with the `print()` function. Notice the tabular format.

In [40]:
mtcars
print(mtcars)


                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0   

We will talk more about dataframes in just a bit, but for now, just keep in mind that they are one of the most used data structures in R - albeit more complex than vectors.
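
Printing a whole data frame can fill the screen, so a quick look is often enough. The sketch below uses a few base R functions that this worksheet has not introduced (`head()`, `nrow()`, `ncol()`, and `names()`) to inspect `mtcars`:

```r
# Show only the first three rows instead of the whole table.
head(mtcars, 3)

# Number of rows (cars) and columns (variables).
nrow(mtcars)     # 32
ncol(mtcars)     # 11

# The column names, starting with "mpg", "cyl", "disp".
names(mtcars)
```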

### 1.4 Subsetting

Often, when you're working with a large dataset (such as `mtcars`), you will only be interested in a small portion of it. Even when working with a simpler data structure, such as vectors, you may want to extract a particular value that you are interested in. R has several ways of doing this, in a process that it calls "subsetting". Subsetting dataframes is definitely a more complex task - we will start small, with vectors.

A student from a previous STAT545 cohort tracked his commute times for two weeks (10 days), and saved them in a vector that he stored in the variable `times`. Here is the `times` variable. 

In [41]:
times <- c(18, 22, 43, 26, 75, 31, 32, 17, 16, 51)

We use `[]` to subset a vector. Although we had a look at this in class, here are a couple of examples to refresh your memory, using the vector `x` from earlier. To extract the first entry of a vector:

In [42]:
x[1]

And if I want to extract everything *but* the first entry:

In [43]:
x[-1]

You're doing a great job! Now, it's your turn to use `[]` to subset the vector of times. Keep it up!

**QUESTION 1.7**

Extract the third entry of the `times` vector, and store the result in a variable named `answer1.7`.

```
answer1.7 <- FILL_THIS_IN[FILL_THIS_IN]
```

In [44]:
### BEGIN SOLUTION
answer1.7 <- times[3]
### END SOLUTION

In [45]:
test_that("Answer check", {
    expect_identical(
      digest(answer1.7),
      "e3aac2c171de0322895102f09101ba98"
    )
})
print("success!")

[1] "success!"


**QUESTION 1.8**

Extract everything in `times` except the third entry. Store the result in a variable named `answer1.8`.

```
answer1.8 <- FILL_THIS_IN[-FILL_THIS_IN]
```

In [46]:
### BEGIN SOLUTION
answer1.8 <- times[-3]
### END SOLUTION

In [47]:
test_that("Answer check", {
    expect_identical(
      digest(answer1.8),
      "600c1ff6db302a52139f9ac39dd41d0c"
    )
})
print("success!")

[1] "success!"


**QUESTION 1.9**

Extract the second and fourth entry of `times`, and store it in a variable called `answer1.9a`. Extract the fourth and second entry of `times`, and store it in a variable called `answer1.9b`. *Hint: remember `c()`?*

```
answer1.9a <- FILL_THIS_IN[FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN)]
answer1.9b <- FILL_THIS_IN[FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN)]
```

In [48]:
### BEGIN SOLUTION
answer1.9a <- times[c(2, 4)]
answer1.9b <- times[c(4, 2)]
### END SOLUTION

In [49]:
test_that("Answer check", {
  expect_identical(
    digest(answer1.9a),
    "24887cb43232d541bb551ce34f852e69"
  )
  expect_identical(
    digest(answer1.9b),
    "94001bedd89d74d064e93afdf1b57986"
  )
})
print("success!")

[1] "success!"


**QUESTION 1.10**

Extract the second through fifth entry of `times` – make use of `:` to construct sequential vectors. Store the result in a variable named `answer1.10`.

```
answer1.10 <- FILL_THIS_IN[FILL_THIS_IN:FILL_THIS_IN]
```

In [50]:
### BEGIN SOLUTION
answer1.10 <- times[2:5]
### END SOLUTION

In [51]:
test_that("Answer check", {
    expect_identical(
      digest(answer1.10),
      "3dff0beb6577b621859c9a3579b8d379"
    )
})
print("success!")

[1] "success!"


**QUESTION 1.11**

Extract all entries of `times` that are less than 30 minutes, and place the result in a variable named `answer1.11`. Why does this work? Logical subsetting!

```
answer1.11 <- FILL_THIS_IN[FILL_THIS_IN < FILL_THIS_IN]
```

In [54]:
### BEGIN SOLUTION
answer1.11 <- times[times < 30]
### END SOLUTION

In [55]:
test_that("Answer check", {
    expect_identical(
      digest(answer1.11),
      "547b9ded5983c354d5684dbfa0909ceb"
    )
})
print("success!")

[1] "success!"
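
To unpack why logical subsetting works, here is a sketch in two steps. The vector `scores` and the name `passed` are hypothetical, chosen so as not to repeat the graded answer:

```r
# Hypothetical vector of exam scores.
scores <- c(55, 82, 47, 91)

# Step 1: the comparison produces a logical vector of the same length.
passed <- scores >= 50
passed            # TRUE TRUE FALSE TRUE

# Step 2: subsetting with a logical vector keeps the entries marked TRUE.
scores[passed]    # 55 82 91

# Both steps written as a single expression.
scores[scores >= 50]
```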


**QUESTION 1.12**

After all of that, did the `times` object change at all?

Multiple Choice:

A) yes

B) no

C) not sure

Store your answer (i.e. the letter corresponding to the correct option) in an object called `answer1.12`.

```
answer1.12 <- FILL_THIS_IN
```

In [17]:
#answer1.12 <- ...
### BEGIN SOLUTION
answer1.12 <- "B"
### END SOLUTION

In [16]:
test_that("Answer check", {
    expect_identical(
      digest(toupper(as.character(answer1.12))),
      "3a5505c06543876fe45598b5e5e5195d"
    )
})
print("success!")

[1] "success!"


**QUESTION 1.13**

This is a bit of a challenge, but I bet you can do it. Try using `[]` in conjunction with `<-` to change the `times` object by replacing the 2nd and 3rd entries with 2 new travel times of your choosing.

(Before you do that, allow us to store the original `times` object for autograding!)

In [45]:
times_old <- times

Now, answer away!

```
FILL_THIS_IN[FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN)] <- FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN)
```

In [46]:
### BEGIN SOLUTION
times[c(2, 3)] <- c(10, 12) # as an example, your answer could be different!
### END SOLUTION

In [47]:
# Test that `times` still has length 10, and that the entries are the
# same except for the 2nd and 3rd.
test_that("Answer check", {
  expect_identical(length(times), 10L)
  expect_identical(times_old[-c(2, 3)], times[-c(2, 3)])
  expect_true(times_old[2] != times[2])
  expect_true(times_old[3] != times[3])
})
print("success!")

[1] "success!"


### 1.5 Functions

Functions are one of the fundamental building blocks of the R language. They are small pieces of reusable code that can be treated like any other R object. Functions are easily recognizable because they are usually written as their name followed by parentheses. For example, if there was a function that could make bread, it would look like this: `bread()`.

**QUESTION 1.14**

You have actually already used 3 functions in this worksheet before being formally introduced to what a function is. Can you recall if any of these functions have been used in this worksheet already?

A) `c()`

B) `mean()`

C) `typeof()`

D) `length()`

**Hint:** More than 1 answer may be correct -- make a vector of all of the correct ones!

```
answer1.14 <- c("FILL_THIS_IN", "FILL_THIS_IN", ...)
```

In [34]:
# answer1.14 <- youranswer
### BEGIN SOLUTION
answer1.14 <- c("A", "C", "D")
### END SOLUTION

In [35]:
test_that("Answer check", {
    expect_identical(
      digest(sort(toupper(as.character(answer1.14)))),
      "82baea6f032c2c9fa74e85f8b379f021"
    )
})
print("success!")

[1] "success!"


There are tens of thousands of functions that one can use in R, which seems a bit large for this worksheet. Let's explore a few basic functions just for fun. Type `Sys.Date()` below to see what happens!

In [None]:
# your code here

Remember that there are different types of vectors, besides numeric and logical? The output of `Sys.Date()` looks like text, but it is actually a special kind of object of class "Date", which R stores internally as a number and prints in a readable form. Text itself lives in another vector type, the character vector: each of its elements is a "string", that is, any value written within a pair of single or double quotes.

The value that `Sys.Date()` computes is based on your computer's environment, but functions in R can also manipulate input data in order to compute a return value. At the start of this worksheet, you were introduced to the simplest form of R - as a calculator. Actually, R functions allow us to compute certain things that could be done manually as a calculator, but much faster.

Recall the `times` vector from earlier. What's the average travel time? Instead of computing this manually, we can use a function called `mean`.

In [50]:
mean(times)

Notice the syntax of using a function: the *function name* comes first, on the left, and the *input* goes inside parentheses. We *input* `times`, and we got an *output*. Did this function change the *input*? Check:

In [None]:
# your code here

**QUESTION 1.15**

Apart from a few unusual functions, a function never changes its input. But functions don't always return a single value. Try the `range()` function (assigning the result to `answer1.15a`), and the `sqrt()` function (assigning the result to `answer1.15b`), using the `times` vector as an argument for both.

```
answer1.15a <- FILL_THIS_IN(FILL_THIS_IN)
answer1.15b <- FILL_THIS_IN(FILL_THIS_IN)
```

In [51]:
### BEGIN SOLUTION
answer1.15a <- range(times)
answer1.15b <- sqrt(times)
### END SOLUTION

In [52]:
test_that("Answer check", {
  expect_identical(answer1.15a, range(times))
  expect_identical(answer1.15b, sqrt(times))  
})
print("success!")

[1] "success!"


Functions can also take more than one argument as input, separated by commas. You can find out what these arguments are by accessing the function’s documentation, which you can do by executing `?"function name"`. Try accessing the documentation of the `mean()` function by executing `?mean`.

In [None]:
# your code here

There are four arguments. All the arguments have names, except for the `...` argument (more on `...` -- the ellipsis -- later). This is always the case.

Under "Usage", some of the arguments are of the form `name = value`. These are default values, in case you don't specify these arguments. This is a sure sign that these arguments are optional.

`x` is "on its own". This typically means that it has no default, and often (but not always) means that the argument is required. We can specify an argument in one of two ways:

+ specifying argument `name = value` in the function parentheses; or
+ matching the ordering of the input with the ordering of the arguments.

For readability, matching by position is not recommended beyond the first or sometimes second argument!
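
The two ways of specifying arguments can be sketched with `mean()` and its `trim` argument, which appears in the documentation's "Usage" section. The vector `values` is hypothetical:

```r
# Hypothetical vector with one extreme value.
values <- c(1, 2, 3, 4, 100)

# Matching by name: the order of the arguments does not matter.
by_name <- mean(x = values, trim = 0.2)
by_name_reordered <- mean(trim = 0.2, x = values)

# Matching by position: the first input is `x`, the second is `trim`.
by_position <- mean(values, 0.2)

# Arguments that are left out fall back to their defaults
# (trim = 0, na.rm = FALSE).
mean(values)    # 22
```

All three trimmed calls return the same value; the named versions are simply easier to read, because the reader does not need to remember that `trim` is the second argument.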

**QUESTION 1.16**

Try executing `mean()` again with `times` as an argument, but this time, set the `na.rm` to `TRUE`. Store the result in a variable named `answer1.16`.

```
answer1.16 <- FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN = FILL_THIS_IN)
```

In [53]:
### BEGIN SOLUTION
answer1.16 <- mean(times, na.rm = TRUE)
### END SOLUTION

In [54]:
test_that("Answer check", {
    expect_identical(answer1.16, mean(times, na.rm = TRUE))
})
print("success!")

[1] "success!"


**QUESTION 1.17**

The mean is the same, because there are no `NA` values in the vector `times`. Put your subsetting knowledge into practice by replacing the third entry of the `times` vector with a missing value (`NA`).

**Hint**: Your solution starts with `times`.

```
FILL_THIS_IN[FILL_THIS_IN] <- NA
```

In [36]:
### BEGIN SOLUTION
times[3] <- NA
### END SOLUTION

In [56]:
test_that("Answer check", {
    expect_identical(which(is.na(times)), 3L)
})
print("success!")

[1] "success!"


**QUESTION 1.18**

Now, try executing `mean()` specifying `na.rm` as `TRUE` again (with `times` as an input). Store the output in a variable named `answer1.18`.

```
answer1.18 <- FILL_THIS_IN(FILL_THIS_IN, FILL_THIS_IN = FILL_THIS_IN)
```

In [57]:
### BEGIN SOLUTION
answer1.18 <- mean(times, na.rm = TRUE)
### END SOLUTION

In [58]:
test_that("Answer check", {
    expect_identical(answer1.18, mean(times, na.rm = TRUE))
})
print("success!")

[1] "success!"


Notice how the output changes. What if you try setting `na.rm` to `FALSE` instead?

In [None]:
# your code here