--- title: "500 Lab 0 Template (replace these words)" subtitle: "Use your own subtitle, if you like." author: "Your Name Goes Here" date: last-modified format: html: toc: true number-sections: true code-fold: show code-tools: true code-overflow: wrap embed-resources: true date-format: iso theme: default ## change the theme if you prefer --- # R Setup {.unnumbered} ```{r} #| message: false #| warning: false knitr::opts_chunk$set(comment = NA) # do not remove this library(janitor) # load other packages as desired library(tidyverse) # load tidyverse last theme_set(theme_bw()) # set theme for ggplots ``` ## Loading the Data {.unnumbered} ```{r} url_0 <- "https://raw.githubusercontent.com/THOMASELOVE/500-data/master/data/lab0.csv" lab0 <- read_csv(url_0, show_col_types = FALSE) |> mutate(subject = as.character(subject), treatment = factor(treatment)) dim(lab0) ``` Note that we should have 135 rows and 7 columns here. It would be helpful at this point to look the data over a bit and understand what you have. # Task 1 > Build a logistic regression model using the main effects of the five predictors to predict **treatment** status. > Use R to add two columns to the data set, specifically the fitted probability (according to your logistic regression model) of being treated, and the linear component of the logistic regression model (the logit of the probability of being treated.) ## Fitting the model and storing the predictions ```{r} m1 <- glm((treatment=="Treated") ~ cov1 + cov2 + cov3 + cov4 + cov5, family=binomial(), data=lab0) lab0$linpred <- m1$linear.predictors lab0$prob <- m1$fitted.values lab0 # note new columns ``` # Task 2 > Next, summarize the resulting probabilities across the untreated and treated patients in an appropriate and attractive manner. > Of course, we’d expect that the average probability of being treated will be higher in the patients who are actually treated. Verify that this is the case, in a short **numerical** summary of your findings. ## Numerical Summary ```{r} # place your code to develop a numerical summary here ``` # Task 3 > How much overlap is there between the fitted probabilities of the treated patients and the fitted probabilities of the untreated patients? ## Graphical Summary ```{r} # place your code to develop a graphical summary here ``` # Session Information ```{r} xfun::session_info() ```