---
title: "Lab 6 Assignment | PAF 516"
subtitle: "Spatio-Temporal Analysis — LISA Trajectories & Space-Time Moran's I"
format:
  html:
    theme: readable
    highlight-style: tango
    toc: true
    self-contained: true
    number-sections: false
execute:
  echo: true
  warning: false
  message: false
  fig-width: 10
---

```{r}
#| label: setup
#| include: false
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, fig.width = 10)
options(tigris_use_cache = TRUE)
options(tigris_progress_bar = FALSE)
```

```{r}
#| label: packages
# ── One-time course environment setup ─────────────────────────────────────
# Sentinel lives in user home (~/.paf516_ready) — persists across all lab
# folders. renv restores ONCE for the entire course, not once per lab.
# To force re-install: delete ~/.paf516_ready and re-render.
if (!file.exists("~/.paf516_ready")) {
  if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv")
  tmp <- tempfile(fileext = ".lock")
  download.file(
    url = "https://raw.githubusercontent.com/AntJam-Howell/paf-516-labs/main/renv.lock",
    destfile = tmp,
    mode = "wb"
  )
  renv::restore(lockfile = tmp, prompt = FALSE)
  file.create("~/.paf516_ready")
}

library(tidyverse)
library(tidycensus)
library(sf)
library(spdep)
library(viridis)
library(patchwork)
```

```{r}
#| label: api-key
# census_api_key("YOUR KEY GOES HERE")
```

::: {.callout-important}
## Before you begin

1. **This file contains the same pre-filled code as the tutorial.** Steps 1–7 below produce the same output you saw in the tutorial. Knit immediately (Cmd+Shift+K / Ctrl+Shift+K) to see all results before answering.
2. **The assignment questions are at the top of this document.** Workflow:
   - **Q1** asks you to interpret the trajectory map and transition matrix. Knit first, review Steps 1–7 output, then write your answer.
   - **Q2** asks you to rerun the LISA classification at p < 0.01 and compare it to the p < 0.05 map. Fill in the scaffold chunks labeled `q2-...` at the end of this document, then re-knit.
   - **Q3** asks you to write a policy brief recommending which trajectory types to prioritize for intervention.
3. Submit both `Lab6_Assignment_YourLastName.qmd` and `Lab6_Assignment_YourLastName.html` to Canvas.
:::

# Assignment Questions

## Q1 — Interpret: Trajectory Map and Transition Matrix {.question}

**Interpret the trajectory map (Step 5) and transition matrix (Step 4). Address all of the following:**

- How many tracts are **persistent hot spots** (HH→HH)? What does this suggest about structural hardship in Maricopa County?
- How many 2013 hot spots **dissolved** by 2019? What might explain their recovery?
- Are there **emerging hot spots** — tracts that became HH between 2013 and 2019? Where do they appear spatially?
- What does the **space-time Moran's I result** (Step 6) tell you about whether change is a spatially spreading process?
- Does the **three-point trend** (Step 7) give more or less confidence in the two-point change scores? What share of "improving" tracts show consistent improvement vs. non-monotone patterns?

*Minimum 200 words. Reference specific numbers from the outputs above.*

*YOUR ANSWER HERE*

## Q2 — Modify: Stricter Significance Threshold (p < 0.01) {.question}

**Rerun the LISA classification using p < 0.01 instead of p < 0.05. The `classify_lisa()` function in Step 3 accepts a `p_thresh` argument — use it here. Reclassify trajectories, produce the trajectory map at p < 0.01, and write a paragraph comparing it to the p < 0.05 map.**

Fill in the scaffold chunks labeled `q2-...` at the end of this document, then re-knit.

*Comparison paragraph (min. 100 words): How does the p < 0.01 trajectory map differ from the p < 0.05 map? Which trajectory categories shrink the most? What does this tell you about multiple testing and the reliability of hot spot classifications?*

*YOUR ANSWER HERE*

## Q3 — Policy Brief {.question}

**You are advising the Maricopa County Office of Community Development.
They have a fixed budget and can prioritize only two of the four trajectory types below. Using your trajectory map, space-time Moran's I result, and three-point trend as evidence, write a policy brief paragraph recommending which two to prioritize and why.**

Trajectory types to consider:

- **Persistent Hot Spots (HH→HH):** Deep, long-standing hardship clusters
- **Emerging Hot Spots (→HH):** Deteriorating areas where clustering is forming
- **Dissolving Hot Spots (HH→):** Recovering areas that may need displacement protection
- **HL Outliers:** Isolated pockets of high hardship within low-hardship surroundings

*Minimum 150 words. Ground your recommendation in specific numbers from your analysis. Cite at least one reading from Module 5 or Module 6.*

*YOUR ANSWER HERE*

---

# Overview

**Rename this file** `Lab6_Assignment_YourLastName.qmd` before editing. Submit both `.qmd` and rendered `.html` to Canvas.

**Index:** 3 variables — poverty rate (C17002), unemployment rate (B23025), median income (B19013, reversed). All pulls use 2010 TIGER/Line tract boundaries.

**Time points:** 2013 (2009–2013 ACS), 2016 (2012–2016 ACS), 2019 (2015–2019 ACS) — all three share 2010 boundaries.

**Checklist:**

- [ ] Renamed with your last name
- [ ] All pre-filled steps execute without error
- [ ] Q1 answer — min. 200 words with specific numbers
- [ ] Q2 code complete with output + comparison paragraph
- [ ] Q3 policy brief — min.
  150 words, cites at least one reading
- [ ] Both `.qmd` and `.html` submitted to Canvas

# Step 1: Rebuild Dataset

```{r}
#| label: step.01-vars
hardship_vars <- c(
  pov_below50   = "C17002_002",
  pov_50to99    = "C17002_003",
  pov_den       = "C17002_001",
  unemp_num     = "B23025_005",
  unemp_den     = "B23025_002",
  median_income = "B19013_001"
)
```

```{r}
#| label: step.01-pull
#| results: hide
load_or_pull <- function(rds_path, pull_fn) {
  if (file.exists(rds_path)) {
    readRDS(rds_path)
  } else {
    result <- pull_fn()
    dir.create(dirname(rds_path), showWarnings = FALSE, recursive = TRUE)
    saveRDS(result, rds_path)
    result
  }
}

tract_2013_raw <- load_or_pull(
  "../data/maricopa_tracts_2013.rds",
  function() get_acs(
    geography = "tract", variables = hardship_vars,
    state = "AZ", county = "Maricopa",
    year = 2013, survey = "acs5",
    geometry = TRUE, progress_bar = FALSE
  )
)

tract_2019_raw <- load_or_pull(
  "../data/maricopa_tracts_2019.rds",
  function() get_acs(
    geography = "tract", variables = hardship_vars,
    state = "AZ", county = "Maricopa",
    year = 2019, survey = "acs5",
    geometry = TRUE, progress_bar = FALSE
  )
)
```

```{r}
#| label: step.01-pull-offline
#| eval: false
# ── Offline fallback — run this block instead if the Census API is down ──
# Replace the get_acs() chunk above with these readRDS() calls.
# All datasets are pre-built and stored in data/ for offline use.
#
# tract_2013_raw <- readRDS("../data/maricopa_tracts_2013.rds")
# tract_2019_raw <- readRDS("../data/maricopa_tracts_2019.rds")
```

```{r}
#| label: step.01-wrangle
widen_and_rate <- function(df) {
  df %>%
    select(-moe) %>%
    pivot_wider(names_from = variable, values_from = estimate) %>%
    mutate(poverty_rate = (pov_below50 + pov_50to99) / pov_den,
           unemp_rate   = unemp_num / unemp_den) %>%
    filter(!is.na(poverty_rate), !is.na(unemp_rate), !is.na(median_income))
}

tract_2013_wide <- widen_and_rate(tract_2013_raw)
tract_2019_wide <- widen_and_rate(tract_2019_raw)

vars_to_pool <- c("poverty_rate", "unemp_rate", "median_income")

r2013 <- tract_2013_wide %>% st_drop_geometry() %>% select(GEOID, all_of(vars_to_pool))
r2019 <- tract_2019_wide %>% st_drop_geometry() %>% select(GEOID, all_of(vars_to_pool))

pooled_stats <- map_dfr(vars_to_pool, function(v) {
  vals <- c(r2013[[v]], r2019[[v]])
  tibble(variable = v,
         mean = mean(vals, na.rm = TRUE),
         sd   = sd(vals, na.rm = TRUE))
})

apply_pooled_z <- function(df, stats) {
  for (v in stats$variable) {
    pm <- stats$mean[stats$variable == v]
    ps <- stats$sd[stats$variable == v]
    df[[paste0("z_", v)]] <- (df[[v]] - pm) / ps
  }
  df$z_median_income <- -1 * df$z_median_income
  df$hardship_index <- rowMeans(
    df[, c("z_poverty_rate", "z_unemp_rate", "z_median_income")],
    na.rm = TRUE)
  return(df)
}

r2013_z <- apply_pooled_z(r2013, pooled_stats)
r2019_z <- apply_pooled_z(r2019, pooled_stats)

change_df <- tract_2019_wide %>%
  select(GEOID, NAME) %>%
  left_join(r2019_z %>% select(GEOID, hardship_2019 = hardship_index), by = "GEOID") %>%
  inner_join(r2013_z %>% select(GEOID, hardship_2013 = hardship_index), by = "GEOID") %>%
  mutate(hardship_change = hardship_2019 - hardship_2013)

cat("Tracts:", nrow(change_df),
    "| Mean change:", round(mean(change_df$hardship_change), 3), "\n")
```

# Step 2: Spatial Weights

```{r}
#| label: step.02-weights
nb <- poly2nb(change_df, queen = TRUE)
w  <- nb2listw(nb, style = "W", zero.policy = TRUE)
```

# Step 3: LISA — 2013 and 2019
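Before running the classification chunk, here is a minimal sketch of the quadrant rule that `classify_lisa()` applies: each tract's standardized value `z` is paired with its spatial lag `wz` (the weighted average of its neighbors), and only significant pairs receive a cluster label. The numbers below are made up purely for illustration, and the chunk is set to `eval: false` so it does not change your rendered output.

```{r}
#| label: lisa-quadrant-sketch
#| eval: false
# Toy illustration (made-up values) of the HH/LL/HL/LH quadrant rule.
# z  = standardized hardship for a tract
# wz = spatial lag: weighted average hardship of its neighbors
# p  = pseudo p-value from localmoran()
library(dplyr)
toy <- tibble(
  z  = c( 1.5, -1.2,  2.0, -0.8, 0.3),
  wz = c( 1.1, -0.9, -1.4,  1.0, 0.1),
  p  = c(0.01, 0.02, 0.03, 0.04, 0.60)
)
toy <- toy %>%
  mutate(quadrant = case_when(
    z > 0 & wz > 0 & p < 0.05 ~ "HH",  # high tract, high neighbors
    z < 0 & wz < 0 & p < 0.05 ~ "LL",  # low tract, low neighbors
    z > 0 & wz < 0 & p < 0.05 ~ "HL",  # high outlier in low surroundings
    z < 0 & wz > 0 & p < 0.05 ~ "LH",  # low outlier in high surroundings
    TRUE ~ "NS"                        # not significant
  ))
toy$quadrant  # "HH" "LL" "HL" "LH" "NS"
```

The same four sign/significance combinations drive every LISA map in this lab; only `p_thresh` changes in Q2.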
```{r}
#| label: step.03-lisa
classify_lisa <- function(x, w, p_thresh = 0.05) {
  lisa <- localmoran(x, w, zero.policy = TRUE)
  z  <- scale(x)[, 1]
  wz <- lag.listw(w, z, zero.policy = TRUE)
  p  <- lisa[, "Pr(z != E(Ii))"]
  case_when(
    z > 0 & wz > 0 & p < p_thresh ~ "HH",
    z < 0 & wz < 0 & p < p_thresh ~ "LL",
    z > 0 & wz < 0 & p < p_thresh ~ "HL",
    z < 0 & wz > 0 & p < p_thresh ~ "LH",
    TRUE ~ "NS"
  )
}

change_df <- change_df %>%
  mutate(
    lisa_2013 = classify_lisa(hardship_2013, w),
    lisa_2019 = classify_lisa(hardship_2019, w)
  )

cat("2013 LISA:\n"); print(table(change_df$lisa_2013))
cat("2019 LISA:\n"); print(table(change_df$lisa_2019))
```

# Step 4: Transition Matrix

```{r}
#| label: step.04-transition
cat("── Transition matrix (rows = 2013, columns = 2019) ──\n")
print(table(`2013` = change_df$lisa_2013, `2019` = change_df$lisa_2019))

cat("\n── Row proportions ──\n")
print(round(prop.table(
  table(`2013` = change_df$lisa_2013, `2019` = change_df$lisa_2019),
  margin = 1) * 100, 1))
```

# Step 5: Trajectory Classification and Map

```{r}
#| label: step.05-trajectory
change_df <- change_df %>%
  mutate(trajectory = case_when(
    lisa_2013 == "HH" & lisa_2019 == "HH" ~ "Persistent HH",
    lisa_2013 != "HH" & lisa_2019 == "HH" ~ "Emerging HH",
    lisa_2013 == "HH" & lisa_2019 != "HH" ~ "Dissolving HH",
    lisa_2013 == "LL" & lisa_2019 == "LL" ~ "Persistent LL",
    lisa_2013 != "LL" & lisa_2019 == "LL" ~ "Emerging LL",
    lisa_2013 == "HL" | lisa_2019 == "HL" ~ "HL Outlier",
    TRUE ~ "Stable NS"
  ))

cat("── Trajectory distribution ──\n")
change_df %>%
  st_drop_geometry() %>%
  count(trajectory, sort = TRUE) %>%
  mutate(pct = round(100 * n / sum(n), 1)) %>%
  print()
```

```{r}
#| label: step.05-map
#| fig-height: 9
traj_colors <- c(
  "Persistent HH" = "#D7191C",
  "Emerging HH"   = "#FDAE61",
  "Dissolving HH" = "#FEE08B",
  "Persistent LL" = "#2C7BB6",
  "Emerging LL"   = "#ABD9E9",
  "HL Outlier"    = "#9E5BB4",
  "Stable NS"     = "#CCCCCC"
)

ggplot(change_df) +
  geom_sf(aes(fill = trajectory), color = NA) +
  scale_fill_manual(values =
                      traj_colors,
                    name = "Trajectory\n(2013 → 2019)") +
  theme_void() +
  labs(title = "Neighborhood Hardship Trajectories, 2013–2019",
       subtitle = "Maricopa County Census Tracts",
       caption = "LISA run on 2013 and 2019 pooled-standardized economic hardship index (queen contiguity, p < 0.05)")
```

# Step 6: Space-Time Moran's I

```{r}
#| label: step.06-moran
set.seed(42)
moran_chg <- moran.mc(change_df$hardship_change, w, nsim = 999, zero.policy = TRUE)

cat("Space-Time Moran's I (on 2013→2019 change score):\n")
cat("  I =", round(moran_chg$statistic, 4), "\n")
cat("  p =", round(moran_chg$p.value, 4), "\n")
```

```{r}
#| label: step.06-moran-plot
#| fig-height: 5
z_chg  <- scale(change_df$hardship_change)[, 1]
wz_chg <- lag.listw(w, z_chg, zero.policy = TRUE)

ggplot(tibble(z = z_chg, wz = wz_chg), aes(x = z, y = wz)) +
  geom_hline(yintercept = 0, color = "gray70") +
  geom_vline(xintercept = 0, color = "gray70") +
  geom_point(alpha = 0.35, size = 1.2, color = "#8C1D40") +
  geom_smooth(method = "lm", se = FALSE, color = "#FFC627", linewidth = 1) +
  theme_minimal() +
  labs(title = "Moran Scatterplot: Change Scores (2013 → 2019)",
       subtitle = paste0("I = ", round(moran_chg$statistic, 3),
                         " | p = ", round(moran_chg$p.value, 3)),
       x = "Standardized Change", y = "Spatial Lag of Change")
```

# Step 7: Three-Point Trend

```{r}
#| label: step.07-pull-2016
#| results: hide
# year = 2016: 2012–2016 ACS | also uses 2010 TIGER/Line boundaries
tract_2016_raw <- load_or_pull(
  "../data/maricopa_tracts_2016.rds",
  function() get_acs(
    geography = "tract", variables = hardship_vars,
    state = "AZ", county = "Maricopa",
    year = 2016, survey = "acs5",
    geometry = FALSE, progress_bar = FALSE
  )
)
```

```{r}
#| label: step.07-pull-2016-offline
#| eval: false
# ── Offline fallback — run this block instead if the Census API is down ──
# Replace the get_acs() chunk above with this readRDS() call.
# All datasets are pre-built and stored in data/ for offline use.
#
# tract_2016_raw <- readRDS("../data/maricopa_tracts_2016.rds")
```

```{r}
#| label: step.07-trend
tract_2016_wide <- widen_and_rate(tract_2016_raw %>% mutate(geometry = NULL))
r2016 <- tract_2016_wide %>% select(GEOID, all_of(vars_to_pool))
r2016_z <- apply_pooled_z(r2016, pooled_stats)

trend_df <- change_df %>%
  st_drop_geometry() %>%
  select(GEOID, hardship_2013, hardship_2019) %>%
  inner_join(r2016_z %>% select(GEOID, hardship_2016 = hardship_index), by = "GEOID") %>%
  mutate(
    trend_type = case_when(
      hardship_2013 > hardship_2016 & hardship_2016 > hardship_2019 ~ "Consistently improving",
      hardship_2013 < hardship_2016 & hardship_2016 < hardship_2019 ~ "Consistently worsening",
      hardship_2013 > hardship_2016 & hardship_2016 < hardship_2019 ~ "V-shaped (improved mid)",
      hardship_2013 < hardship_2016 & hardship_2016 > hardship_2019 ~ "Inverted-V (worsened mid)",
      TRUE ~ "Flat"
    )
  )

cat("── Three-point trend distribution (2013 | 2016 | 2019) ──\n")
trend_df %>%
  count(trend_type, sort = TRUE) %>%
  mutate(pct = round(100 * n / sum(n), 1)) %>%
  print()

change_df <- change_df %>%
  left_join(trend_df %>% select(GEOID, trend_type), by = "GEOID")
```

```{r}
#| label: step.07-trend-map
#| fig-height: 9
trend_colors <- c(
  "Consistently improving"    = "#2166AC",
  "Consistently worsening"    = "#D7191C",
  "V-shaped (improved mid)"   = "#74ADD1",
  "Inverted-V (worsened mid)" = "#F46D43",
  "Flat"                      = "#CCCCCC"
)

ggplot(change_df) +
  geom_sf(aes(fill = trend_type), color = NA) +
  scale_fill_manual(values = trend_colors, na.value = "#EEEEEE",
                    name = "Trend (2013|2016|2019)") +
  theme_void() +
  labs(title = "Three-Point Hardship Trend (2013 | 2016 | 2019)",
       subtitle = "Maricopa County Census Tracts")
```

---

# Question 2: Student Modify Code

```{r}
#| label: q2-lisa-strict
# YOUR CODE HERE — rerun classify_lisa() with p_thresh = 0.01
```

```{r}
#| label: q2-trajectory-strict
# YOUR CODE HERE — reclassify trajectories with strict LISA
```

```{r}
#| label: q2-map-strict
#| fig-height: 9
# YOUR CODE HERE —
#   trajectory map at p < 0.01
```
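If you want to cite exact counts in your Q1 answer, the Step 4 transition table can be indexed by name. This optional helper is a sketch, not part of the graded code: it assumes `change_df` with `lisa_2013` and `lisa_2019` from Step 3 is already in memory, and it is left at `eval: false` so it does not alter your rendered output.

```{r}
#| label: q1-count-helper
#| eval: false
# Optional Q1 helper: pull named cells out of the Step 4 transition matrix.
tm <- table(`2013` = change_df$lisa_2013, `2019` = change_df$lisa_2019)
tm["HH", "HH"]                        # persistent hot spots (HH -> HH)
sum(tm["HH", ]) - tm["HH", "HH"]      # 2013 hot spots that dissolved by 2019
sum(tm[rownames(tm) != "HH", "HH"])   # emerging hot spots (non-HH -> HH)
```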