---
title: "Course Introduction"
author: "Dr. Jin Zhou @ UCLA"
date: "Apr 4, 2023"
output:
# ioslides_presentation: default
html_document:
toc: true
toc_depth: 4
subtitle: Biostat 200C
---
## Course webpages
- Github site:
+ slides, hw, announcements
- Burinlearn:
+ announcements and hw submissions
## What's this course about?
- In 200B, we learn the linear models in the form $$
y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \epsilon,
$$ where
1. $y$ is a continuous *response variable* (or *dependent variable*),\
2. $x_1, \ldots, x_p$ are *covariates* (or *predictors*, or *independent variables*), and\
3. $\epsilon$ is the *error term* and assumed to be normally distributed and independent among observations.
- In 200C, we generalize the linear models in three directions.
1. **Generalized linear models (GLMs)** handles nonnormal responses, $y$. \
* binary response (disease or not)
* proportions
* counts
2. **Mixed effects models** relaxes the independence assumption of the error term and allows correlation structure in $\epsilon$.\
* Some data has a grouped, nested or hierarchical structure.
* Repeated measures, longitudinal and multilevel data
3. **Nonparametric regression models** (GAM, trees, neural networks) allow nonlinearity in the effects of predictors $x$ on responses.
## Course description
- Read [syllabus](https://ucla-biostat-200c.github.io/2023spring/syllabus/syllabus.html) and [schedule](https://ucla-biostat-200c.github.io/2023spring/schedule/schedule.html) for a tentative list of topics and course logistics.
- Teaching philosophy. Usually a GLM course is taught on blackboard/whiteboard with mostly math derivations. In this course, I will start from R code and then explain the theory behind it.