--- title: "Lecture 6: Augmented vectors, Factor + Labelled Variables" subtitle: "" author: date: fontsize: 8pt classoption: dvipsnames # for colors urlcolor: blue output: beamer_presentation: keep_tex: true toc: false slide_level: 3 theme: default # AnnArbor # push to header? #colortheme: "dolphin" # push to header? #fonttheme: "structurebold" highlight: default # Supported styles include "default", "tango", "pygments", "kate", "monochrome", "espresso", "zenburn", and "haddock" (specify null to prevent syntax highlighting); push to header df_print: default #default # tibble # push to header? latex_engine: xelatex # Available engines are pdflatex [default], xelatex, and lualatex; The main reasons you may want to use xelatex or lualatex are: (1) They support Unicode better; (2) It is easier to make use of system fonts. includes: in_header: ../beamer_header.tex #after_body: table-of-contents.txt --- ```{r, echo=FALSE, include=FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>", highlight = TRUE) #comment = "#>" makes it so results from a code chunk start with "#>"; default is "##" ``` ```{r, echo=FALSE, include=FALSE} #THIS CODE DOWNLOADS THE MOST RECENT VERSION OF THE FILE beamder_header.tex AND SAVES IT TO THE DIRECTORY ONE LEVEL UP FROM THIS .RMD LECTURE FILE download.file(url = 'https://raw.githubusercontent.com/ozanj/rclass/master/lectures/beamer_header.tex', destfile = '../beamer_header.tex', mode = 'wb') ``` ```{r, echo=FALSE, include=FALSE} #DO NOT WORRY ABOUT THIS if(!file.exists('data-structures-overview.png')){ download.file(url = 'https://raw.githubusercontent.com/ozanj/rclass/master/lectures/lecture5/data-structures-overview.png', destfile = 'data-structures-overview.png', mode = 'wb') } ``` # Introduction ### Logistics __Reading to do before next class:__ - GW 15.1 - 15.2 (factors) [this is like 2-3 pages] - [OPTIONAL] GW 15.3 - 15.5 (remainder of "factors" chapter) - [OPTIONAL] GW 20.6 - 20.7 (attributes and augmented vectors) - [OPTIONAL] GW 10 (tibbles) ### What we will do today \tableofcontents ```{r, eval=FALSE, echo=FALSE} #Use this if you want TOC to show level 2 headings \tableofcontents #Use this if you don't want TOC to show level 2 headings \tableofcontents[subsectionstyle=hide/hide/hide] ``` ### Libraries we will use today "Load" the package we will use today (output omitted) - __you must run this code chunk after installing these packages__ ```{r, message=FALSE} library(tidyverse) library(haven) library(labelled) library(lubridate) ``` __If package not yet installed__, then must install before you load. Install in "console" rather than .Rmd file - Generic syntax: `install.packages("package_name")` - Install "tidyverse": `install.packages("tidyverse")` Note: when we load package, name of package is not in quotes; but when we install package, name of package is in quotes: - `install.packages("tidyverse")` - `library(tidyverse)` # Augmented vectors ### Data we will use to introduce augmented vectors ```{r} rm(list = ls()) # remove all objects #load("../../data/prospect_list/western_washington_college_board_list.RData") load(url("https://github.com/ozanj/rclass/raw/master/data/prospect_list/wwlist_merged.RData")) ``` ## Review data types and structures ### __Vectors__ are the primary data structures in R \medskip Two types of vectors: 1. __atomic vectors__ 1. __lists__ \medskip ![Overview of data structures (Grolemund and Wickham, 2018)](data-structures-overview.png){width=60%} ### Review data structures: atomic vectors \medskip An __atomic vector__ is a collection of values - each value in an atomic vector is an __element__ - all elements within vector must have same __data type__ ```{r} (a <- c(1,2,3)) # parentheses () assign and print object in one step length(a) typeof(a) str(a) ``` Can assign __names__ to vector elements, creating a __named atomic vector__ ```{r} (b <- c(v1=1,v2=2,v3=3)) length(b) typeof(b) str(b) ``` ### Review data structures: lists \medskip - Like atomic vectors, __lists__ are objects that contain __elements__ - However, __data type__ can differ across elements within a list - an element of a list can be another list ```{r} list_a <- list(1,2,"apple") typeof(list_a) length(list_a) str(list_a) list_b <- list(1, c("apple", "orange"), list(1, 2)) length(list_b) str(list_b) ``` ### Review data structures: lists Like atomic vectors, elements within a list can be named, thereby creating a __named list__ ```{r} # not named str(list_b) # named list_c <- list(v1=1, v2=c("apple", "orange"), v3=list(1, 2, 3)) str(list_c) ``` ### Review data structures: a data frame is a list A __data frame__ is a list with the following characteristics: - All the elements must be __vectors__ with the same __length__ - Data frames are __augmented lists__ because they have additional __attributes__ [described later] ```{r} #a regular list list_d <- list(col_a = c(1,2,3), col_b = c(4,5,6), col_c = c(7,8,9)) typeof(list_d) str(list_d) #a data frame df_a <- data.frame(col_a = c(1,2,3), col_b = c(4,5,6), col_c = c(7,8,9)) typeof(df_a) str(df_a) ``` ## Attributes and augmented vectors ### Atomic vectors versus augmented vectors __Atomic vectors__ [our focus so far] - I think of atomic vectors as "just the data" - Atomic vectors are the building blocks for augmented vectors __Augmented vectors__ - __Augmented vectors__ are atomic vectors with additional __attributes__ attached __Attributes__ - __Attributes__ are additional "metadata" that can be attached to any object (e.g., vector or list) - Examples of some important attributes in R: - __Names__: name the elements of a vector (e.g., variable names) - __value labels__: character labels (e.g., "Charter School") attached to numeric values - __Object class__: How object should be treated by object oriented programming language [discussed below] __Main takaway__: - Augmented vectors are atomic vectors (just the data) with additional attributes attached ### Attributes in vectors Identify attributes in any object using the `attributes()` function ```{r} #vector with no attributes vector1 <- c(1,2,3,4) vector1 attributes(vector1) #vector with name attributes vector2 <- c(a = 1, b= 2, c= 3, d = 4) vector2 attributes(vector2) ``` ### Attributes in lists ```{r} #no attributes list1 <- list(c(1,2,3), c(4,5,6)) attributes(list1) #list with attributes list2 <- list(col_a = c(1,2,3), col_b = c(4,5,6)) str(list2) attributes(list2) #data frame with attributes list3 <- data.frame(col_a = c(1,2,3), col_b = c(4,5,6)) str(list3) attributes(list3) ``` ## Object class ### Object class \medskip Every object in R has a __class__ - Object class defines rules for how object can be treated by object oriented programming language (e.g., which functions you can apply to object) - class is an __attribute__ of an object Identify the class of an object using the `class()` function ```{r} (vector2 <- c(a = 1, b= 2, c= 3, d = 4)) class(vector2) ``` When I encounter a new object I often investigate object by applying `typeof()`, `class()`, and `attributes()` functions to that object ```{r} vector2 typeof(vector2) class(vector2) attributes(vector2) ``` ### Object class Why is __class__ important? - Specific functions usually work with only particular __classes__ of objects - e.g., "date"" functions usually only work on objects with a date class - "string" functions usually only work with on objects with a character class - Functions that do mathematical computation usually work on objects with a numeric class - Note: functions care about object __class__, not object __type__ object with `numeric` class (output omitted) ```{r, eval=FALSE} str(wwlist) typeof(wwlist$med_inc_zip) class(wwlist$med_inc_zip) sum(wwlist$med_inc_zip[1:10], na.rm = TRUE) # numeric function # load library with date functions library(lubridate) #Sys.setenv(TZ="America/Los_Angeles") #setting time zone to Los Angeles time year(wwlist$med_inc_zip[1:10]) # date function ``` ### Object class Why is __class__ important? - Specific functions usually work with only particular __classes__ of objects - Note: functions care about object __class__, not object __type__ Object with `character` class ```{r, eval=FALSE} str(wwlist$hs_city) typeof(wwlist$hs_city) class(wwlist$hs_city) tolower(wwlist$hs_city[1:10]) # string function sum(wwlist$hs_city, na.rm = TRUE) # numeric function ``` Object with a date class ```{r, eval=FALSE} typeof(wwlist$receive_date) class(wwlist$receive_date) year(wwlist$receive_date[1:10]) # date function sum(wwlist$receive_date) # numeric function ``` ### Class and object oriented programming Definition of object oriented programming from this [LINK](https://www.webopedia.com/TERM/O/object_oriented_programming_OOP.html) > "Object-oriented programming (OOP) refers to a type of computer programming in which programmers define not only the data type of a data structure, but also the types of operations (functions) that can be applied to the data structure." Object __class__ is fundamental to object oriented programming because: - object class determines which functions can be applied to the object - object class also determines what those functions do to the object Many different object classes exist in R - we can also create our own classes - but in this course we will work with classes that have been created by others ## Class == factor ### Factors __Factors__ are an object _class_ used to display categorical data (e.g., marital status) - A factor is an __augmented vector__ built by attaching a "levels" attribute to an (atomic) integer vectors Usually, we would prefer a categorical variable (e.g., race, school type) to be a factor variable rather than a character variable - So far in the course I have made all categorical variables character variables because we had not introduced factors yet Below, I'll create a factor version of the character variable `ethn_code` - (don't worry about understanding this code; I'll explain it later) ```{r} str(wwlist$ethn_code) class(wwlist$ethn_code) # create factor var; tidyverse approach wwlist <- wwlist %>% mutate(ethn_code_fac = factor(ethn_code)) #wwlist$ethn_code_fac <- factor(wwlist$ethn_code) # base r approach str(wwlist$ethn_code_fac) ``` ### Factors A factor is an __augmented vector__ built by attaching a "levels" attribute to an (atomic) integer vector Compare (character) `ethn_code` to (factor) `ethn_code_fac` (output omitted) ```{r, results="hide"} #character var typeof(wwlist$ethn_code) class(wwlist$ethn_code) str(wwlist$ethn_code) attributes(wwlist$ethn_code) #factor var typeof(wwlist$ethn_code_fac) class(wwlist$ethn_code_fac) str(wwlist$ethn_code_fac) attributes(wwlist$ethn_code_fac) ``` __Main takeaway__ - `ethn_code_fac` has `type=integer` and `class=factor` because the variable has a "levels" attribute - Underlying data are integers but levels attribute is used to display the data. ```{r} wwlist$ethn_code_fac[1:4] # print first few obs of ethn_code_fac ``` ### Working with factor variables ```{r, results='hide'} attributes(wwlist$ethn_code_fac) ``` Refer to categories of a factor by the values of the __level attribute__ rather than the underlying values of the variable __Task__ - count the number of prospects in object `wwlist` who identify as "white" ```{r} # referring to variable value; this doesn't work wwlist %>% filter(ethn_code_fac==10) %>% count #referring to value of level attribute; this works wwlist %>% filter(ethn_code_fac=="white") %>% count ``` ### Working with factor variables __Task__ - count the number of prospects in object `wwlist` who identify as "white" If you want to refer to underlying values, then apply `as.integer()` function to the factor variable ```{r} attributes(wwlist$ethn_code_fac) wwlist %>% filter(as.integer(ethn_code_fac)==10) %>% count ``` ### How to identify the variable values associated with factor levels Let's create a factor version of the character variable `psat_range` ```{r} wwlist <- wwlist %>% mutate(psat_range_fac = factor(psat_range)) # create factor var; ``` Run below code in console rather than code chunk to see values associated with each factor ```{r, results="hide"} wwlist %>% count(psat_range_fac) attributes(wwlist$psat_range_fac) levels(wwlist$psat_range_fac) #starts at 1 nlevels(wwlist$psat_range_fac) #7 levels total levels(wwlist$psat_range_fac)[5] #prints levels 1-3 ``` Once you know values associated with factor, you can filter based on values ```{r} wwlist %>% filter(as.integer(psat_range_fac)==4 | as.integer(psat_range_fac)==5) %>% count() ``` Or you can just filter based on value of __factor levels__ ```{r} wwlist %>% filter(psat_range=="1270-1520") %>% count() ``` ### Creating factor variables from character variables or from integer variables See Appendix ### Factor student exercise 1. After running the code below, use `typeof`, `class`, `str`, and `attributes` functions to check the new variable `receive_year` 2. Create a factor variable from the input variable `receive_year` and name it `receive_year_fac` 3. Run the same functions (`typeof`, `class`, etc.) from the first question using the new variable you created 4. Get a count of `receive_year_fac`. __hint:__ you could also run this in the console to see values associated with each factor Run this code to create a year variable from the input variable "receive_date" ```{r, results="hide", message=FALSE} #wwlist %>% glimpse() library(lubridate) #load library if you haven't already wwlist <- wwlist %>% mutate(receive_year = year(receive_date)) #creating year variable with the lubridate package #Check variable wwlist %>% count(receive_year) wwlist %>% group_by(receive_year) %>% count(receive_date) ``` ### Factor student exercise solutions 1. Use `typeof`, `class`, `str`, and `attributes` functions to check the new variable `receive_year` ```{r} typeof(wwlist$receive_year) class(wwlist$receive_year) str(wwlist$receive_year) attributes(wwlist$receive_year) ``` ### Factor student exercise solutions 2. Now create a factor variable from the input variable `receive_year` and name it `receive_year_fac` ```{r} # create factor var; tidyverse approach wwlist <- wwlist %>% mutate(receive_year_fac = factor(receive_year)) ``` ### Factor student exercise solutions 3. Run the same functions (`typeof`, `class`, etc.) from the first question using the new variable you created ```{r} typeof(wwlist$receive_year_fac) class(wwlist$receive_year_fac) str(wwlist$receive_year_fac) attributes(wwlist$receive_year_fac) ``` ### Factor student exercise solutions 4. Get a count of `receive_year_fac`. __hint:__ you could also run this in the console to see values associated with each factor ```{r} wwlist %>% count(receive_year_fac) ``` ## Class == labelled ### Data we will use to introduce `labelled` class High school longitudinal surveys from National Center for Education Statistics (NCES) - Follow U.S. students from high school through college, labor market We will be working with [High School Longitudinal Study of 2009 (HSLS:09)](https://nces.ed.gov/surveys/hsls09/index.asp) - Follows 9th graders from 2009 - Data collection waves - Base Year (2009) - First Follow-up (2012) - 2013 Update (2013) - High School Transcripts (2013-2014) - Second Follow-up (2016) ### `haven` package [`haven`](https://haven.tidyverse.org/), which is part of __tidyverse__, "enables R to read and write various data formats" from the following statistical packages: - SAS - SPSS - Stata When using `haven` to read data, resulting R objects have these characteristics: - Are __tibbles__, a particular type of data frame we discuss in future weeks - Transform variables with "value labels" into the `labelled()` class [our focus today] - `labelled` is an object __class__ created by folks who created `haven` package - `labelled` is an object class, just like `factor` is an object class - `labelled` and `factor` classes are both viable alternatives for categorical variables - Helpful description of `labelled` class [HERE](https://haven.tidyverse.org/articles/semantics.html) - Dates and times converted to R date/time classes - Character vectors not converted to factors ### `haven` package Use `read_dta()` function from `haven` to import Stata dataset into R ```{r, results="hide"} hsls <- read_dta(file="https://github.com/ozanj/rclass/raw/master/data/hsls/hsls_stu_small.dta") ``` Let's examine the data [you __must__ run this code chunk] ```{r, results="hide"} names(hsls) names(hsls) <- tolower(names(hsls)) # convert names to lowercase names(hsls) str(hsls) # ugh str(hsls$s3classes) attributes(hsls$s3classes) typeof(hsls$s3classes) class(hsls$s3classes) ``` ### `labelled` package Purpose of the `labelled` package is to work with data imported from SPSS/Stata/SAS using the `haven` package. - In particular, `labelled` package creates functions to work with objects that have `labelled` class - From package documentation: "purpose of the `labelled` package is to provide functions to manipulate _metadata_ as variable labels, value labels and defined missing values using the `labelled` class and the `label` attribute introduced in `haven` package. - More info on the `labelled` package: [LINK](https://cran.r-project.org/web/packages/labelled/vignettes/intro_labelled.html) Functions in `labelled` package - [Full list](https://www.rdocumentation.org/packages/labelled/versions/1.1.0) - A couple relevant functions - `val_labels`: get or set variable _value labels_ - `var_label`: get or set a _variable label_ ```{r, results="hide"} attributes(hsls$s3classes) hsls %>% select(s3classes) %>% var_label() hsls %>% select(s3classes) %>% val_labels() ``` ### What is `labelled` class? \medskip - `labelled` is an object class created by the `haven` package for importing variables from SAS/SPSS/Stata that have __value labels__ - __value labels__ [in Stata] are labels attached to specific values of a variable: - e.g., variable value `1` attached to value label "married", `2`="single", `3`="divorced" - Variables in an R data frame with `class==labelled`: - data `type` can be numeric(double) or character - To see `value labels` associated with each value: - `attr(data_frame_name$variable_name,"labels")` - e.g., `attr(hsls$s3classes,"labels")` Let's investigate the attributes of `hsls$s3classes` ```{r, results="hide"} typeof(hsls$s3classes) class(hsls$s3classes) str(hsls$s3classes) attributes(hsls$s3classes) ``` use `attr(object_name,"attribute_name")` to refer to each attribute ```{r, results="hide"} attr(hsls$s3classes,"label") attr(hsls$s3classes,"labels") attr(hsls$s3classes,"class") attr(hsls$s3classes,"format.stata") ``` ### Working with `labelled` class data \medskip Show variable labels (`var_label`); and show value labels (`val_labels`) ```{r, results="hide"} hsls %>% select(s3classes,s3clglvl) %>% var_label #show variable label hsls %>% select(s3classes,s3clglvl) %>% val_labels #show value labels ``` Create frequency tables with `labelled` class variables using `count()` - Default setting is to show variable __values__ not __value labels__ ```{r, results="hide"} hsls %>% count(s3classes) #investigate the object created hsls_freq_temp <- hsls %>% count(s3classes) hsls_freq_temp rm(hsls_freq_temp) ``` To make frequency table show __value labels__ add `%>% as_factor()` to pipe - `as_factor()` is function from `haven` that converts an object to a factor ```{r, results="hide"} hsls %>% count(s3classes) %>% as_factor() #investigate the object created hsls_freq_temp <- hsls %>% count(s3classes) %>% as_factor() hsls_freq_temp rm(hsls_freq_temp) ``` ### Working with `labelled` class data To isolate values of `labelled` class variables in `filter()` function: - refer to variable __value__, not the __value label__ __Task__ - how many observations in var `s3classes` associated with "Unit non-response" - how many observations in var `s3classes` associated with "Yes" General steps to follow: 1. investigate object 1. use filter to isolate desired observations Investigate object ```{r, results="hide"} class(hsls$s3classes) hsls %>% select(s3classes,s3clglvl) %>% var_label #show variable label hsls %>% select(s3classes,s3clglvl) %>% val_labels #show value label hsls %>% count(s3classes) # freq table, values hsls %>% count(s3classes) %>% as_factor() # freq table, value labels ``` filter specific values ```{r, results="hide"} hsls %>% filter(s3classes==-8) %>% count() # -8 = unit non-response hsls %>% filter(s3classes==1) %>% count() # 1 = yes ``` ### Labelled student exercise 1. Get variable and value labels of `s3hs` 2. Get a count of the variable showing the values and the value labels. __hint__ use factor() 3. Filter if value is associated with "Missing" 4. Filter if value is associated with "Missing" or "Unit non-response" ### Labelled student exercise solutions 1. Get variable and value labels of `s3hs` ```{r} hsls %>% select(s3hs) %>% var_label() hsls %>% select(s3hs) %>% val_labels() ``` ### Labelled student exercise solutions 2. Get a count of the variable `s3hs` showing the value labels. __hint__ use factor() ```{r} hsls %>% count(s3hs) hsls %>% count(s3hs) %>% as_factor() ``` ### Labelled student exercise solutions 3. Filter if value is associated with "Missing" ```{r} hsls %>% filter(s3hs== -9) %>% count() ``` ### Labelled student exercise solutions 4. Filter if value is associated with "Missing" or "Unit non-response" ```{r} hsls %>% filter(s3hs== -9 | s3hs== -8) %>% count() ``` ## Comparing labelled class to factor class ### Comparing `class==labelled` to `class==factor` | | `class==labelled` | `class==factor` |---|----------|-------------| | data type | numeric or character | integer | | name of value label attribute | labels | levels | | refer to data using | variable values | levels attribute | ### Converting `class==labelled` to `class==factor` The `as_factor()` function from `haven` package converts variables with `class==labelled` to `class==factor` - Can be used for descriptive statistics ```{r, results="hide"} hsls %>% select(s3classes) %>% count(s3classes) %>% as_factor() ``` - Can create object with some or all `labelled` vars converted to `factor` ```{r} hsls_f <- as_factor(hsls,only_labelled = TRUE) ``` Let's examine this object ```{r, results="hide"} glimpse(hsls_f) hsls_f %>% select(s3classes,s3clglvl) %>% str() typeof(hsls_f$s3classes) class(hsls_f$s3classes) attributes(hsls_f$s3classes) hsls_f %>% select(s3classes) %>% var_label() hsls_f %>% select(s3classes) %>% val_labels() ``` ### Working with `class==factor` data Showing values associated with factor levels ```{r} hsls_f %>% count(s3classes) ``` In code, refer `level` attribute not variable value ```{r} hsls_f %>% filter(s3classes=="Yes") %>% count(s3classes) ``` # Creating factor variables ### Create factors [from string variables] To create a factor variable from string variable 1. create a character vector containing underlying data 1. create a vector containing valid levels 3. Attach levels to the data using the `factor()` function ```{r} #underlying data: months my fam is born x1 <- c("Jan", "Aug", "Apr", "Mar") #create vector with valid levels month_levels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec") #attach levels to data x2 <- factor(x1, levels = month_levels) ``` Note how attributes differ ```{r} str(x1) str(x2) ``` Sorting differs ```{r} sort(x1) sort(x2) ``` ### Create factors [from string variables] Let's create a character version of variable `hs_state` and then turn it into a factor ```{r, eval=FALSE} #wwlist %>% # count(hs_state) #Subset obs to West Coast states wwlist_temp <- wwlist %>% filter(hs_state %in% c("CA", "OR", "WA")) #Create character version of high school state for West Coast states only wwlist_temp$hs_state_char <- as.character(wwlist_temp$hs_state) #investigate character variable str(wwlist_temp$hs_state_char) class(wwlist_temp$hs_state_char) table(wwlist_temp$hs_state_char) #create new variable that assigns levels wwlist_temp$hs_state_fac <- factor(wwlist_temp$hs_state_char, levels = c("CA","OR","WA")) str(wwlist_temp$hs_state_fac) attributes(wwlist_temp$hs_state_fac) #wwlist_temp %>% # count(hs_state_fac) rm(wwlist_temp) wwlist$hs_state_fac <- as_factor(wwlist_temp$hs_state_char) ``` ### Create factors [from string variables] How the `levels` argument works when underlying data is character - Matches value of underlying data to value of the level attribute - Converts underlying data to integer, with level attribute attached \medskip See chapter 15 of Wickham for more on factors (e.g., modifying factor order, modifying factor levels) ### Creating factors [from integer vectors] Factors are just integer vectors with level attributes attached to them. So, to create a factor: 1. create a vector for the underlying data 1. create a vector that has level attributes 3. Attach levels to the data using the `factor()` function ```{r} a1 <- c(1,1,1,0,1,1,0) #a vector of data a2 <- c("zero","one") #a vector of labels #attach labels to values a3 <- factor(a1, labels = a2) a3 str(a3) ``` Note: By default, `factor()` function attached "zero" to the lowest value of vector `a1` because "zero" was the first element of vector `a2` ### Creating factors [from integer vectors] Let's turn an integer variable into a factor variable in the `wwlist` data frame Create integer version of `receive_year` ```{r} #typeof(wwlist_temp$receive_year) wwlist_temp <- wwlist %>% filter(zip5 %in% c("98103", "98030", "98290")) wwlist_temp$zip_int <- as.integer(wwlist_temp$zip5) str(wwlist_temp$zip_int) typeof(wwlist_temp$zip_int) ``` ### Creating factors [from integer vectors] Assign levels to values of integer variable ```{r} # variable zip_int is coded 98103, 98030 or 98290 # we want to attach value labels 98103=nine-eight-one-zero-three, 98030=nine-eight-zero-three-zero, 98290=nine-eight-two-nine-zero wwlist_temp$zip_fac <- factor(wwlist_temp$zip_int, levels = c(98103, 98030, 98290), labels = c("nine-eight-one-zero-three", "nine-eight-zero-three-zero", "nine-eight-two-nine-zero")) str(wwlist_temp$zip_fac) str(wwlist_temp$zip5) #Check variable wwlist_temp %>% count(zip_fac) wwlist_temp %>% count(zip_int) ```