---
title: "Assignment 4"
subtitle: CRD 150
author: YOUR FULL NAME HERE
date: INSERT DATE HERE
output:
  html_document:
    theme: cosmo
---

```{r setup, include=FALSE}
# Do not edit this code block/chunk
knitr::opts_chunk$set(echo = TRUE, warning=FALSE, message = FALSE)
knitr::opts_knit$set(progress = TRUE, verbose = TRUE)
```

```{r}
#Load in all necessary packages using the function library()

```

1. As a comprehensive source for neighborhood data, PolicyMap allows you to examine interesting associations across different dimensions of neighborhood health and well-being outside of those provided in the U.S. Census. Let's examine the potential predictors of resident health in the City of Sacramento. Bring into R the dataset sac_health_policymap.csv, which contains census-tract level data on Sacramento downloaded from PolicyMap. I've also uploaded the file on Canvas (Files - Week 4 - Assignment). Consider the dataset to be cleaned and ready for analysis. A record layout of the data can be found here.

```{r}
#Add comment explaining code

```

a. Create an appropriate plot that shows the shape of the distribution for the variable health, which measures the percent of residents reporting very good to excellent health. Describe the shape of the distribution.

```{r}
#Add comment explaining code

```

b. Examine the association between health and the variable phys. Based on this examination, briefly describe the relationship between the variable health and phys.

```{r}
#Add comment explaining code

```

c. Create the appropriate plots showing the association between health and the following variables: foodaccess, unemp, and medinc.

```{r}
#Add comment explaining code

```

d. Based on these plots, briefly describe the relationship between health and the three variables. Describe any nonlinearities in the relationships. (2 points)

2. The following questions investigate housing structure in Yolo County.
Bring into R the dataset ca_border_tracts.csv, which contains 2019-2023 American Community Survey (ACS) data for census tracts in California and states sharing a boundary with California (Arizona, Nevada, and Oregon). I've also uploaded the file on Canvas (Files - Week 4 - Assignment). Consider the dataset to be cleaned and ready for analysis. A record layout of the data can be found here.

```{r}
#Add comment explaining code

```

a. Calculate 90/10 percentile ratios for median household income for each region (Bay Area, Southern California, Other California, Arizona, Nevada, and Oregon). Present these values in a presentation-ready table using flextable(). Which region exhibits the highest neighborhood income inequality? Which exhibits the lowest?

```{r}
#Add comment explaining code

```

b. What are the mean, median, interquartile range, and standard deviation of the percentage of houses built after 2000 in Yolo County? Show these values in a presentation-ready table using flextable().

```{r}
#Add comment explaining code

```

c. What is the correlation between median household income and percentage of houses built after 2000 in Yolo County?

```{r}
#Add comment explaining code

```

d. Show a plot investigating any potential outliers in the percentage of houses built after 2000 in Yolo County.

```{r}
#Add comment explaining code

```

e. Did you find any outliers? If so, how many? What are the mean, median, and standard deviation of the percentage of houses built after 2000 in Yolo County without these outliers? What is the correlation between median household income and percentage of houses built after 2000 without them? Present these values in a presentation-ready table using flextable().

```{r}
#Add comment explaining code

```

f. Briefly comment on what you learned in question 2e.