---
title: "Lab 01 - Functions and Vectors"
author: "Your Name"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  html_document:
    df_print: paged
    theme: cerulean
    highlight: haddock
    self_contained: true
---
```{r include = FALSE}
# CODE CHUNKS
# This is a "code chunk" - R evaluates the code inside and prints results
# These "chunks" begin with "```{r}" and end with "```"
# Inside {r}, we can modify the chunk behavior
# Above, we tell R not to "include" this chunk
# GLOBAL OPTIONS
# Typically, the first "chunk" tells R how to create your report
# Below, we tell R to show code in the report and hide messages/warnings
knitr::opts_chunk$set(echo = TRUE,
                      message = FALSE,
                      warning = FALSE)
# KNITTING
# "Knitting" an .Rmd file renders it into a finished report (here, an HTML page)
# Even without your answers, this is ready to knit!
# Click on "Knit" in the upper-left corner to turn this into a report
# Notice that this "chunk" will not appear in your knitted report!
```
# Source Data
The following report analyzes tax parcel data from Syracuse, New York (USA).
View the "Data Dictionary" here:
[**Syracuse City Tax Parcel Data**](http://data.syrgov.net/datasets/1b346804e1364a5eb85ccb53302e3c91_0)
# Importing the Data
The following code imports the Syracuse, NY tax parcel data using a URL.
```{r cache = TRUE}
url <- paste0("https://raw.githubusercontent.com/DS4PS/Data",
              "-Science-Class/master/DATA/syr_parcels.csv")
dat <- read.csv(url,
                stringsAsFactors = FALSE)
```
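To see what `read.csv()` returns without relying on the network, the sketch below reads a small inline CSV through the `text` argument. The data and the names `toy_csv` and `toy` are made up for illustration and are not part of the Syracuse dataset.

```{r}
# Hypothetical inline CSV, used only for illustration
toy_csv <- "owner,acres\nSmith,0.25\nJones,0.50"

toy <- read.csv(text = toy_csv,
                stringsAsFactors = FALSE)

class(toy)          # The result is a data frame
class(toy$owner)    # Text stays "character" rather than becoming a factor
```

Since R 4.0.0, `stringsAsFactors = FALSE` is already the default; spelling it out keeps the behavior the same on older versions.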
# Previewing the Data
There are several exploratory functions to better understand our new dataset.
We can inspect the first 5 rows of these data using function `head()`.
```{r}
head(dat, 5) # Preview a dataset with 'head()'
```
### Listing All Variables
Functions `names()` or `colnames()` will print all variable names in a dataset.
```{r}
names(dat) # List all variables with 'names()'
```
### Previewing Specific Variables
We can also inspect the values of a variable by extracting it with `$`.
The extracted variable is called a "vector".
```{r}
head(dat$owner, 10) # Preview a variable, or "vector"
```
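As a minimal sketch of what `$` returns, the chunk below extracts a column from a small made-up data frame. The object `toy` and its values are hypothetical and only stand in for the real parcel data.

```{r}
# Hypothetical data frame, used only for illustration
toy <- data.frame(owner = c("Smith", "Jones", "Lee"),
                  acres = c(0.25, 0.50, 1.00))

toy_acres <- toy$acres    # Extract one column as a vector

is.vector(toy_acres)      # TRUE
length(toy_acres)         # 3: one value per row
toy_acres * 2             # Arithmetic applies to every element
```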
### Listing Unique Values
Function `unique()` helps us determine what values exist in a variable.
```{r}
unique(dat$land_use) # Print all possible values with 'unique()'
```
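The same idea can be tried on a short made-up vector, where the repeated values are easy to spot. The name `toy_use` and its categories are assumptions for illustration, not values taken from the parcel data.

```{r}
# Hypothetical vector with repeated categories
toy_use <- c("Vacant Land", "Single Family", "Vacant Land",
             "Parks", "Single Family")

unique(toy_use)            # Each distinct value once, in order of first appearance
length(unique(toy_use))    # 3: the number of distinct values
```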
### Examining Data Structure
Function `str()` provides an overview of total rows and columns (dimensions),
variable classes, and a preview of values.
```{r}
str(object = dat,
    vec.len = 2)    # Examine data structure with 'str()'
```
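When only the dimensions are needed, `dim()`, `nrow()`, and `ncol()` report them directly. The sketch below uses a small hypothetical data frame named `toy_dims` rather than the parcel data.

```{r}
# Hypothetical data frame with 4 rows and 2 columns
toy_dims <- data.frame(id    = 1:4,
                       acres = c(0.25, 0.50, 1.00, NA))

dim(toy_dims)     # 4 2: rows, then columns
nrow(toy_dims)    # 4
ncol(toy_dims)    # 2
```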
# Questions & Solutions
**Instructions:** Provide the code for each solution in the following "chunks".
*Remember to modify the text to show your answer in human-readable terms.*
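The hints in this section lean on `sum()`, `mean()`, and `table()`. As a rough sketch of how they behave, the chunk below applies them to a small made-up data frame. The object `toy_q` and its columns `neighborhood` and `vacant` are hypothetical and are not the variable names in the parcel data.

```{r}
# Hypothetical data, used only for illustration
toy_q <- data.frame(neighborhood = c("North", "North", "South", "South", "South"),
                    vacant       = c(TRUE, NA, FALSE, TRUE, TRUE))

sum(toy_q$vacant)                    # NA: a missing value makes the total unknown
sum(toy_q$vacant, na.rm = TRUE)      # 3: each TRUE counts as 1
mean(toy_q$vacant, na.rm = TRUE)     # 0.75: 3 TRUE out of 4 non-missing values

table(toy_q$neighborhood)                  # Rows per neighborhood
table(toy_q$neighborhood, toy_q$vacant)    # Rows per neighborhood, split by FALSE/TRUE
```

Wrapping `table()` in `sort()` is one way to bring the largest counts to one end of the output.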
## Question 1: Total Parcels
**Question:** *How many tax parcels are in Syracuse, NY?*
**Answer:** There are **[X]** tax parcels in Syracuse, NY.
```{r}
# Use an exploratory function like 'dim()', 'nrow()', or 'str()'
```
## Question 2: Total Acres
**Question:** *How many acres of land are in Syracuse, NY?*
**Answer:** There are **[X]** acres of land in Syracuse, NY.
```{r}
# Pass a numeric variable to function 'sum()', with argument 'na.rm = TRUE'
```
## Question 3: Vacant Buildings
**Question:** *How many vacant buildings are there in Syracuse, NY?*
**Answer:** There are **[X]** vacant buildings in Syracuse, NY.
```{r}
# Pass a numeric or logical variable to function 'sum()', with argument 'na.rm = TRUE'
```
## Question 4: Tax-Exempt Parcels
**Question:** *What proportion of parcels are tax-exempt?*
**Answer:** **[X]%** of parcels are tax-exempt.
```{r}
# Pass a logical ('TRUE' or 'FALSE') variable to function 'mean()', with argument 'na.rm = TRUE'
```
## Question 5: Neighborhoods & Parcels
**Question:** *Which neighborhood contains the most tax parcels?*
**Answer:** **[X]** contains the most tax parcels.
```{r}
# Pass the appropriate variable to function 'table()'
# Optional: Use additional functions to narrow your results
```
## Question 6: Neighborhoods & Vacant Lots
**Question:** *Which neighborhood contains the most vacant lots?*
**Answer:** **[X]** contains the most vacant lots.
```{r}
# Pass two variables to function 'table()', separated by a comma
# Optional: Use additional functions to narrow your results
```