--- title: 'CSSS 508: Homework 4' author: "YOUR NAME" date: "DATE" output: html_document: toc: true toc_float: true toc_collapsed: true toc_depth: 3 number_sections: true theme: lumen --- ```{r setup, include=FALSE} # DO NOT EDIT ANY PART OF THIS CHUNK # Do examine the code if you're interested how it works! knitr::opts_chunk$set(echo = TRUE, warning=TRUE, message=TRUE) library(tidyverse) set.seed(7) test_x <- function(x=NULL, class="omit", dims="omit", numeric="omit", nam_pos="omit", nam_val="omit", c_pos="omit", c_val="omit", na_check="omit", all_NA="omit"){ if(is.null(x)){ return(message("No code entered for ", deparse(substitute(x)), " yet.\n")) } else { class_x <- ifelse(class!="omit", class(x)==class, TRUE) num_x <- ifelse(numeric!="omit", is.numeric(x)==numeric, TRUE) dim_x <- ifelse(all(dims!="omit"), ifelse(is.null(dim(x)), length(x)==dims, all(dim(x)==dims) ), TRUE) names_x <- ifelse(nam_val!="omit" & nam_pos!="omit", ifelse(is.null(names(x)), FALSE, names(x)[nam_pos]==nam_val), TRUE) if(any(c_val=="omit") | any(c_pos=="omit")){ check_x <- TRUE } else { if (dim_x==FALSE){ check_x <- FALSE } else { if(length(c_pos)==1 & class_x!="list"){ check_x <- x[c_pos]==c_val } else { if(class=="matrix" | class=="data.frame"){ check_x <- x[ c_pos[1], c_pos[2] ]==c_val } else { if(class=="list" & length(c_pos)==2){ check_x <- x[[ c_pos[1] ]][[ c_pos[2] ]]==c_val } else { if(class=="list" & length(c_pos)==3){ check_x <- x[[ c_pos[1] ]][[c_pos[2] ]][[ c_pos[3] ]]==c_val } } } } } } if (all_NA==TRUE){ all_NA_x <- all(is.na(x)) } else { all_NA_x <- TRUE } if (all(na_check!="omit") & is.null(dim(x))){ na_x <- any(is.na(x[na_check])==TRUE)==FALSE } else { if (all(na_check!= "omit") & is.null(dim(x))==FALSE & length(na_check)==2 ){ na_x <- any(is.na(x[na_check[1], na_check[2]])==TRUE)==FALSE } else { if (all(na_check!= "omit") & is.null(dim(x))==FALSE & length(na_check)==1){ na_x <- any(is.na(x[, na_check])==TRUE)==FALSE } else {na_x <- TRUE} } } if (all(c(class_x, dim_x, names_x, num_x, na_x, check_x, all_NA_x)==TRUE)){ message("Good job on ", deparse(substitute(x)), "!\n") } else { message("Oh no, check your code for \"", deparse(substitute(x)), "\"\n") if (all_NA_x==FALSE)message("- All elements should be NA!\n") if (dim_x==FALSE) message("- Wrong length or dimensions!\n") if (class_x==FALSE) message("- Wrong object class!\n") if (names_x==FALSE) message("- Wrong object names!\n") if (num_x==FALSE) message("- Should this be numeric?\n") if (na_x==FALSE) message("- You seem to have data missing!\n") if (check_x==FALSE) message("- You seem to have an incorrect or misplaced value somewhere!\n") } } } ``` # Instructions For this assignment you will practice creating, examining, and combining different data structures in R. This assignment is different from others in that it takes a worksheet format with built-in error checking. Each time you complete an answer, if you knit the document it should check your answer (but don't remove "NULL" on an answer before you want it checked or you'll get an error!) For this reason, you should re-knit every time you answer a question, so that if something goes wrong, you know the last edit caused the issue! You will need to do the following: 1) Fill in **code** as needed in the designated locations in code chunks. + You will not need to create new code chunks or edit chunk options. + Instructions in chunks will be shown as comments (after `#`); *remove* and *replace* the comments with your code: + Where you are assigning things to objects, replace the "NULL" values with your code. See code example 1 for an example! + You may need to add elements, such as for subsetting, to objects on the left side of operators. See code example 2 for an example! + Some answers may require multiple lines of code, so add as necessary. + Do not modify chunks labeled with "`# DO NOT EDIT`"; these are used to test your code. 2) Fill in **text** as block quotes (use `>`) in designated locations to answer questions. + Be brief and specific. If you reference a single value or short (<10 element) vector, use in-line R code. If the answer can be clearly and accurately answered without referencing any values, you can do so. When I bold a word in a question, take it to mean I want you to reference a value or values. + Do not directly reference matrices, lists, or data frames in text (don't display them). They can be big! + You should not *need* to include any code *chunks* in this section, but you can if you need. **Example for Code Questions:** ```{r example_1} # Ex 1) Create a vector of the numbers 1, 5, 3, 2, 4. example <- NULL # You would write this: example <- c(1,5,3,2,4) # Ex 2) Add the number 10 to the sixth position of example2 example2 <- example example2 <- NULL # You would write: example2[6] <- 10 ``` **Example for Text Questions:** 1) "How do vectors differ from lists?" > A vector is a one dimensional object where every element is the same type of data. > Lists are one dimensional but elements can be different data types. # Problems ## Vectors: ### Code with Vectors: ```{r vectors_1} # 1) Use "seq()" to create a vector of numbers from 0 to 45 in increments of 5 vec_num <- NULL # 2) Use ":" to create an integer vector of the numbers 11 through 20. vec_int <- NULL # 3) "LETTERS" contains the 26 capital letters in order. Use "LETTERS" and "[ ]" to create a vector of the last 10 capital letters. vec_cha <- NULL # 4) "letters" contains the 26 lowercase letters in order. Use "factor", "letters", and "[ ]" to create a factor variable using the first ten lower case letters. vec_fac <- NULL # 5) Use "c()" to combine "vec_cha" and "vec_fac" into "vec_let". Do not convert it to a factor! vec_let <- NULL # 6) Use "c()" and "[ ]" to combine the first 4 elements of "vec_num" with the last # 4 elements of "vec_int" to create "vec_ni". vec_ni <- NULL # 7) Use "rev()" to reverse the order of "vec_fac". fac_vec <- NULL ``` *How'd you do?* ```{r vectors_1_check, echo=FALSE, message=TRUE, collapse=TRUE, comment=''} # DO NOT EDIT ANY PART OF THIS CHUNK test_x(vec_num, class="numeric", dims=10, c_pos=2, c_val=5) test_x(vec_int, class="integer", dims=10, c_pos=5, c_val=15) test_x(vec_cha, class="character", dims=10, c_pos=3, c_val="S") test_x(vec_fac, class="factor", dims=10, c_pos=7, c_val="g") test_x(vec_let, class="character", dims=20, c_pos=19, c_val="9") test_x(vec_ni, class="numeric", dims=8, c_pos=5, c_val=17) test_x(fac_vec, class="factor", dims=10, c_pos=4, c_val="g") ``` ### Questions about Vectors: 1) If you used `c()` to combine `vec_int` with `vec_fac`, what *class* of vector would you get? Why? > Answer: 2) Consider this code: ```{r vectors_2, echo=TRUE} new_vec <- c(TRUE, FALSE, TRUE, TRUE) ``` + What **class** of vector is this? + In words, what happens to its values when you try to convert it to numeric? To character? To numeric and then character? + What happens if you convert it first to character and second to numeric? > Answer: ## Matrices: ### Code with Matrices: ```{r matrices_1} # 1) Use matrix() to create a matrix with 10 rows and four columns filled with NA mat_empty <- NULL # 2) Assign "vec_num" to the first column of "mat_1" below. mat_1 <- mat_empty # DO NOT EDIT THIS LINE; add code below it. mat_1 <- NULL # 3) Assign "vec_int" to the second column of "mat_2" below mat_2 <- mat_1 # DO NOT EDIT THIS LINE; add code below it. mat_2 <- NULL # 4) Assign "vec_cha" and "vec_fac" to the third and fourth columns of "mat_3" using one assignment operator. mat_3 <- mat_2 # DO NOT EDIT THIS LINE; add code below it. mat_3 <- NULL # 5) Select the fourth row from "mat_3" and assign it to the object "row_4" as a vector. row_4 <- NULL # 6) Assign the element in the 6th row and 2nd column of "mat_3" to "val_6_2" as a numeric value (using as.numeric). val_6_2 <- NULL # 7) Use "cbind()" to combine "vec_num", "vec_int", "vec_cha", and "vec_fac" into "mat_4". mat_4 <- NULL # 8) Next, first transpose mat_4, then select only the first four columns and assign to mat_t mat_t <- NULL # 9) Then use rbind() to add the rows from mat_3 to mat_t (mat_t first, mat_3 second) and assign this combination to mat_big. mat_big <- NULL ``` *How'd you do?* ```{r matrices_1_test, echo=FALSE, message=TRUE, collapse=TRUE, comment=''} # DO NOT EDIT ANY PART OF THIS CHUNK test_x(mat_empty, class="matrix", dims=c(10,4), all_NA=TRUE) test_x(mat_1, class="matrix", dims=c(10,4), numeric=TRUE, na_check=1, c_pos=c(3,1), c_val = 10) test_x(mat_2, class="matrix", dims=c(10,4), numeric=TRUE, na_check=2, c_pos=c(3,2), c_val = 13) test_x(mat_3, class="matrix", dims=c(10,4), numeric=FALSE, na_check=3, c_pos=c(2,3), c_val="R") test_x(row_4, class="character", dims=4, numeric=FALSE, na_check=1, c_pos=2, c_val=14) test_x(val_6_2, class="numeric", dims=1, numeric=TRUE, na_check=1, c_pos=1, c_val=16) test_x(mat_4, class="matrix", dims=c(10,4), numeric=FALSE, na_check=3, c_pos=c(5,3), c_val="U") test_x(mat_t, class="matrix", dims=c(4,4), numeric=FALSE, na_check=2, c_pos=c(2,2), c_val="12") test_x(mat_big, class="matrix", dims=c(14,4), numeric=FALSE, na_check=3, c_pos=c(2,3), c_val="13") ``` ### Questions about Matrices: 1) Note the column names on mat_4. What do you get when you try to get `names()` from `mat_4`? What about `colnames()`? What about `rownames()`? Can you guess why you get all these results? > Answer: 2) Consider the code below: ```{r matrices_2, echo=TRUE} mat_letters <- matrix(letters, ncol=2) ``` + This matrix goes from `"a"` to `"m"` in the first column and `"n"` to `"z"` in the second. What would be an easy way to make the matrix go in alphabetical order left to right, top to bottom? > Answer: 3) Consider the code below: ```{r matrices_3, echo=TRUE} math_mat <- matrix(1:5, nrow=5, ncol=5) math_vec <- 1:5 ``` + Without showing your code (use the console), look at `math_mat` and `math_vec`. When you add `math_mat + math_vec`, what happens? + Without showing your code (use the console), what is the difference between the results from `math_mat %*% math_vec` and from `math_mat * math_vec`. Can you tell what is happening? > Answer: ## Lists: ### Code with Lists: ```{r lists_1} # 1) Use "list()" to create a list that contains "vec_num" and "row_4", and assign the names # "vec_num" and "row_4" to these two elements of "list_1". list_1 <- NULL # 2) Using "$", extract "row_4" from "list_1" and assign it to the object "row_4_2". row_4_2 <- NULL # 3) Create another list that contains "val_6_2" and "mat_big". list_2 <- NULL # 4) Combine list_1 and list_2 together using "c()" and assign them to "list_3" list_3 <- NULL # 5) Use "unlist()" to turn "list_3" into a vector and assign it to "vector_3" vector_3 <- NULL # 6) Use "as.list()" to convert "vector_3" into a list and assign it to "list_big" list_big <- NULL # 7) Now copy "list_3" as "list_4" (one line of code). Then use "[[ ]]" to assign "list_3" as the last (fifth) element of "list_4"; # that is, you should have a list object with five elements named "list_4" that contains the same four # elements as "list_3" plus a fifth element that is -all- four elements of "list_3" as one object. list_4 <- NULL # 8) Select the third element (that is, the sub-element) of the the fifth element of "list_4" and assign it # to element_5_3 using "[[ ]]". element_5_3 <- NULL # 9) Lastly, repeat the previous assignment of the third element of the fifth element, but # extract the element as a list rather than scalar using "[ ]" and assign it to "list_5_3". list_5_3 <- NULL ``` *How'd you do?* ```{r lists_1_test, echo=FALSE, message=TRUE, collapse=TRUE, comment=''} # DO NOT EDIT ANY PART OF THIS CHUNK test_x(list_1, class="list", dims=2, na_check=2, c_pos=c(2,3), c_val="T", nam_pos = 2, nam_val="row_4") test_x(row_4_2, dim=4, c_pos=2, c_val="14") test_x(list_2, class="list", dims=2, na_check=2, c_pos=c(2,16), c_val=12) test_x(list_3, class="list", dims=4, na_check=2, c_pos=c(4,3), c_val="Q") test_x(vector_3, class="character", dims=71, c_pos=29, c_val="45") test_x(list_big, class="list", dims=71, c_pos=29, c_val="45") test_x(list_4, class="list", dims=5, c_pos=c(5,4,17), c_val="R") test_x(element_5_3, class="numeric", dims=1, c_pos=1, c_val=16) test_x(list_5_3, class="list", dims=1, c_pos=c(1,1), c_val=16) ``` ### Questions About Lists: Many functions in R produce lists as output because they produce objects with different types of data and of different lengths. For instance, consider the linear regression saved to `lm.output` below. Don't worry if you are not familiar with regression, we're just concerned with what the function produces! ```{r lm, echo=TRUE} lm.output <- lm(mpg ~ wt, data=mtcars) lm.output ``` 1) **How many elements** does this call to `lm()` produce? *Hint: use the function* `length()`. > Answer: 2) What are the **dimensions** of the `model` element? > Answer: 3) What are the **values** of the `coefficients` for the `intercept` and `wt`? Remember to call on them from `lm.output` object for your answer! > Answer: ## Data Frames: ### Code with Data Frames: ```{r data_frames_1} # 1) Use "data.frame()" to combine "vec_num" (first column) and "vec_int" (second column) into "df_1". df_1 <- NULL # 2) Use "$" to extract "vec_num" from "df_1", reverse it with "rev()", and assign it as the vector "vec_num_2". vec_num_2 <- NULL # 3) Use "$" to add "vec_num_2" to "df_2" as a new column with the name "number_vector". df_2 <- df_1 # DO NOT EDIT THIS LINE; modify code below it. df_2 <- NULL # 4) Combine "df_2" with itself using "rbind()" to create "df_3" df_3 <- NULL ``` *How'd you do?* ```{r data_frames_1_test, echo=FALSE, message=TRUE, collapse=TRUE, comment=''} # DO NOT EDIT ANY PART OF THIS CHUNK test_x(df_1, class="data.frame", dims=c(10,2), c_pos=c(3,2), c_val=13) test_x(vec_num_2, class="numeric", dims=10, c_pos=10, c_val=0) test_x(df_2, class="data.frame", dims=c(10,3), c_pos=c(1,3), c_val=45, nam_pos=3, nam_val="number_vector") test_x(df_3, class="data.frame", dims=c(20,3), c_pos=c(11,3), c_val=45) ``` ### Questions About Data Frames: 1) Use `names()`, `colnames()`, and `rownames()` on df_1. How does this compare to the behavior of these functions on lists and matrices? > Answer: 2) Similarly, how do the results of `length()` and `dim()` differ between data frames, lists, matrices, and vectors? > Answer: