---
title: "Short Lab 3"
author: "INSERT YOUR NAME HERE"
date: "Due Date Here"
output: html_document
---

<!--- Begin styling code. --->
<style type="text/css">
/* Whole document: */
body{
  font-family: "Palatino Linotype", "Book Antiqua", Palatino, serif;
  font-size: 12pt;
}
h1.title {
  font-size: 38px;
  text-align: center;
}
h4.author {
  font-size: 18px;
  text-align: center;
}
h4.date {
  font-size: 18px;
  text-align: center;
}
</style>
<!--- End styling code. --->


As usual, all code below should follow the style guidelines from the lecture slides.

## Part 1. Read in text data

For this short lab, we will be using Project Gutenberg’s The Complete Works of William Shakespeare. 
Use the command `read_lines()` to read the text available at
"https://www.gutenberg.org/files/100/100-0.txt".
Make sure to store the text as a variable.

**1a.** Print the first 5 lines.

**1b.** Print the total number of lines.

**1c.** Remove all empty lines, then print the total number of lines.

(*Hint: to remove empty elements from a string vector x, you could use* `x <- x[x != ""]`)

## Part 2. Regular expressions

**2a.** Use a regular expression with `str_count()` to count how much punctuation is in this text file, in total.

**2b.** Use a regular expression with `str_detect()` to count how many lines contain *either* the string "Romeo" or "Juliet".

## Part 3. String Manipulation

**3a.** Use `str_c()` to collapse the Shakespeare string vector into one large string. (Don't try to print it!)

**3b.** Use `str_split()` to separate your string into words.

(*Hint: you might get a list of length 1 that you have to convert to a vector. You could do this by using something like* `x <- unlist(x)` *or* `x <- x[[1]]`)

**3c.** Use a combination of `table()` and `sort(..., decreasing = TRUE)` argument to get a count of the unique words in Shakespeare's complete works and print out the 10 most common words.

## Part 4. Factors

**4a.** Create an object that is a factor vector with 4 levels, where each of these levels is observed at least once.

**4b.** Collapse two of your factor levels together into a new level "x".

**4c.** Add a new, empty level to your factor and print out the vector.

**4d.** Remove this empty level from your factor and print out the vector.

## Part 5. Dates

**5a.** Create a date-time object in R, with both a date and a time.

**5b.** Extract the date from your object.

**5c.** Extract the month from your object.

**5d.** Change the hour of your object, then extract the hour from your object.