---
title: "Worksheet for Lab 6"
author: "NAME"
date: "`r Sys.Date()`"
output: html_document
---

## Introduction

With this worksheet, you can write and run the chunks of code you need for the different exercises.  Chunks of code appear between lines of three backticks and have a different background color, like this:

```{r chunk_name}
2 + 3
```

## Preliminaries

Run this chunk first!  This chunk of code loads the packages and datasets you need for this activity.

```{r setup}
library(tidyverse)
library(infer)

kobe <- read_csv("https://raw.githubusercontent.com/gregcox7/StatLabs/main/data/kobe.csv")
milgram <- read_csv("https://raw.githubusercontent.com/gregcox7/StatLabs/main/data/milgram.csv")
milgram_replication <- read_csv("https://raw.githubusercontent.com/gregcox7/StatLabs/main/data/milgram_replication.csv")
```

Within the braces at the beginning of each chunk, the letter `r` says that the code is written in the "R" language and the text after the little `r` is a helpful label for the chunk.

To run a chunk of code, press the green arrow button in the upper right of the chunk.  The result of running the chunk will appear immediately below it.  You can run a chunk more than once.  If you get an error or need to change anything, you can always edit your chunk and run it again.

## Exercise 6.4

Use the chunk of code below to generate 1000 simulated datasets based on Milgram's study and visualize the 95% confidence interval.

```{r milgram_original_conf}
milgram_boot_dist <- ___ %>%
    specify(response = ___, success = "___") %>%
    generate(reps = ___, type = "___") %>%
    calculate(stat = "___")

milgram_ci <- milgram_boot_dist %>%
    get_confidence_interval(level = ___)

milgram_boot_dist %>%
    visualize() +
    shade_confidence_interval(endpoints = milgram_ci)
```

Based on the confidence interval you found (you can see the exact numbers if you click on `milgram_ci` in RStudio's **Environment** pane after running your code), what can you say about the proportion of people in general who would be willing to risk someone's life out of obedience?
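
If it helps to see what a bootstrap confidence interval is doing behind the scenes, the chunk below is a rough sketch in plain R using made-up data (12 "Yes" and 8 "No" responses, not Milgram's results).  The names `toy_responses`, `toy_boot_props`, and `toy_ci` are invented for this illustration, and the sketch assumes the interval is taken from the middle 95% of the simulated proportions (a percentile interval).

```{r toy_bootstrap_ci}
# Made-up data for illustration only (NOT Milgram's results):
# 12 "Yes" responses and 8 "No" responses.
toy_responses <- c(rep("Yes", 12), rep("No", 8))

set.seed(1)  # makes the random resampling reproducible

# Each simulated dataset resamples the observed responses WITH replacement,
# keeping the same sample size, and records the proportion of "Yes".
toy_boot_props <- replicate(1000, {
    resampled <- sample(toy_responses, size = length(toy_responses), replace = TRUE)
    mean(resampled == "Yes")
})

# The middle 95% of the simulated proportions gives a percentile interval.
toy_ci <- quantile(toy_boot_props, probs = c(0.025, 0.975))
toy_ci
```

Because the data are made up, the numbers say nothing about obedience.  The point is only that the interval describes the spread of the resampled proportions around the proportion in the original sample.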

## Exercise 6.5

Fill in the blanks in the following chunk to calculate the proportion of participants in Burger's replication who continue to obey the experimenter.

```{r milgram_replication_stat}
milgram_replication %>%
    specify(response = ___, success = "___") %>%
    calculate(stat = "___")
```

a. Does this proportion fall inside or outside the 95% confidence interval you found for Milgram's original experiment (in the previous exercise)?
b. What does your answer to part (a) suggest about whether attitudes toward obedience have changed between 1963 and 2009?  Explain your reasoning.

## Exercise 6.6

Fill in the blanks in the chunk of code to repeat the analysis you performed on Milgram's original data (called `milgram`) with the more recent replication (called `milgram_replication`).  *Hint:* As usual, you will be able to re-use a lot of code!

```{r milgram_replication_conf}
milgram_replication_boot_dist <- ___ %>%
    specify(response = ___, success = "___") %>%
    generate(reps = ___, type = "___") %>%
    calculate(stat = "___")

milgram_replication_ci <- milgram_replication_boot_dist %>%
    get_confidence_interval(level = ___)

milgram_replication_boot_dist %>%
    visualize() +
    shade_confidence_interval(endpoints = milgram_replication_ci)
```

a. What is the 95% confidence interval you found?  (You can see the raw numbers by clicking on `milgram_replication_ci` after running your code.)
b. This is how Milgram described his sample: *"The subjects were 40 males between the ages of 20 and 50, drawn from New Haven [Connecticut] and the surrounding communities...subjects were postal clerks, high school teachers, salesmen, engineers, and laborers. Subjects ranged in educational level from one who had not finished elementary school, to those who had doctorate and other professional degrees."*  Burger's sample also includes a range of educational experiences, but it consists of 18 men and 22 women who range in age from 20 to 81 years old.  Do you think the 95% CI from Milgram's original study or the one from Burger's replication gives a better description of where the true population proportion probably is?  Explain your reasoning.

## Exercise 6.7

Fill in the blanks in the code below to use random permutation to test the *null hypothesis* that men and women *do not differ* in the proportion who obey.

* For the `specify` lines:  The **response variable** is still `obeyed`, but now we have an **explanatory variable** which is labeled `gender`.
* For the blanks in the `hypothesize`, `generate`, and `calculate` lines: look up how we set those in the previous activity.
* For the `direction` lines:  There are three possible settings: `direction = "less"`, `direction = "greater"`, or `direction = "two-sided"`.  Set it to the `direction` in which results would be considered "extreme" if the null hypothesis were true.

```{r milgram_gender_test}
milgram_replication_null_dist <- milgram_replication %>%
    specify(___ ~ ___, success = "Yes") %>%
    hypothesize(null = "___") %>%
    generate(reps = 1000, type = "___") %>%
    calculate(stat = "___", order = c("Male", "Female"))

milgram_replication_obs_diff <- milgram_replication %>%
    specify(___ ~ ___, success = "Yes") %>%
    calculate(stat = "___", order = c("Male", "Female"))

milgram_replication_p_value <- milgram_replication_null_dist %>%
    get_p_value(obs_stat = milgram_replication_obs_diff, direction = "___")

milgram_replication_null_dist %>%
    visualize() +
    shade_p_value(obs_stat = milgram_replication_obs_diff, direction = "___")
```

Would you reject the null hypothesis?  (You can check out the $p$ value by clicking on `milgram_replication_p_value` in RStudio's **Environment** pane after running your code.)  What does the result of this hypothesis test say about whether men and women differ in their tendency to obey commands that potentially endanger someone's life?
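
To see how the three `direction` settings change which simulated results count as "extreme", here is a rough sketch in plain R.  The ten values in `toy_null_diffs` and the value of `toy_obs_diff` are made up for illustration (they are not from the Milgram data), and all names starting with `toy_` or `p_` are invented here.  `get_p_value()` may compute the two-sided value somewhat differently, so treat this as a picture of the idea rather than a reproduction of that function.

```{r toy_direction}
# Made-up values for illustration only (NOT from the Milgram data).
toy_null_diffs <- c(-0.30, -0.20, -0.10, -0.10, 0.00, 0.00, 0.10, 0.10, 0.20, 0.30)
toy_obs_diff <- 0.20

# "greater": how often is a simulated difference at least as large
# as the observed difference?
p_greater <- mean(toy_null_diffs >= toy_obs_diff)

# "less": how often is a simulated difference at least as small
# as the observed difference?
p_less <- mean(toy_null_diffs <= toy_obs_diff)

# "two-sided": how often is a simulated difference at least as far from zero,
# in either direction, as the observed difference?
p_two_sided <- mean(abs(toy_null_diffs) >= abs(toy_obs_diff))

c(greater = p_greater, less = p_less, two_sided = p_two_sided)
```

The same observed difference can look common or rare depending on which tail (or tails) you count, which is why the `direction` should follow from what the null hypothesis says, not from what the data turned out to be.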