---
title: "Econ 425T Homework 2"
subtitle: "Due Feb 3, 2023 @ 11:59PM"
author: "YOUR NAME and UID"
date: "`r format(Sys.time(), '%d %B, %Y')`"
format:
  html:
    theme: cosmo
    number-sections: true
    toc: true
    toc-depth: 4
    toc-location: left
    code-fold: false
engine: knitr
knitr:
  opts_chunk:
    fig.align: 'center'
    # fig.width: 6
    # fig.height: 4
    message: FALSE
    cache: false
---
## Least squares is MLE (10pts)
Show that, for a linear model with Gaussian errors, maximum likelihood estimation and least squares yield the same coefficient estimate, and that $C_p$ and AIC are equivalent.
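
As a possible starting point (a sketch of the setup only, under assumed notation that the problem does not fix: responses $y_i$, predictors $x_i$, coefficients $\beta$, error variance $\sigma^2$, sample size $n$), the model $y_i = \beta^T x_i + \epsilon_i$ with $\epsilon_i$ independent $N(0, \sigma^2)$ has log-likelihood

$$
\ell(\beta, \sigma^2) = -\frac{n}{2} \log(2 \pi \sigma^2) - \frac{1}{2 \sigma^2} \sum_{i=1}^n (y_i - \beta^T x_i)^2.
$$

Only the second term depends on $\beta$, which is where the comparison with the residual sum of squares begins.
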
## ISL Exercise 6.6.1 (10pts)
## ISL Exercise 6.6.3 (10pts)
## ISL Exercise 6.6.4 (10pts)
## ISL Exercise 6.6.5 (10pts)
## ISL Exercise 6.6.11 (30pts)
You must follow the [typical machine learning paradigm](https://ucla-econ-425t.github.io/2023winter/slides/06-modelselection/workflow_lasso.html) to compare _at least_ 3 methods: least squares, lasso, and ridge. Report the final results in a table of the following form:

| Method | CV RMSE | Test RMSE |
|:------:|:------:|:------:|
| LS | | |
| Ridge | | |
| Lasso | | |
| ... | | |
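
The comparison above could be organized roughly as in the following minimal sketch. It is an illustration under stated assumptions, not the required implementation: it assumes Python with scikit-learn, uses synthetic data from `make_regression` as a stand-in for the exercise's data set, and the penalty grid `alphas`, the 5-fold split, and the 25% test fraction are hypothetical choices. The block is shown with a plain code fence so it is not executed when the document is rendered.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import GridSearchCV, KFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): replace with the exercise's data set.
X, y = make_regression(n_samples=300, n_features=12, n_informative=5,
                       noise=10.0, random_state=425)

# 1. Hold out a test set that is never used during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=425)

# 2. Shared cross-validation folds and a hypothetical penalty grid.
cv = KFold(n_splits=5, shuffle=True, random_state=425)
alphas = np.logspace(-3, 3, 25)

# 3. One pipeline per method. Scaling sits inside the pipeline so that each
#    CV fold is standardized using only its own training portion.
candidates = {
    "LS": (LinearRegression(), {}),
    "Ridge": (Ridge(), {"model__alpha": alphas}),
    "Lasso": (Lasso(max_iter=10_000), {"model__alpha": alphas}),
}

results = {}
for name, (model, grid) in candidates.items():
    pipe = Pipeline([("scale", StandardScaler()), ("model", model)])
    search = GridSearchCV(pipe, grid, cv=cv,
                          scoring="neg_root_mean_squared_error")
    search.fit(X_train, y_train)
    # best_score_ is the mean of the per-fold negative RMSEs.
    cv_rmse = float(-search.best_score_)
    # The refitted best model is evaluated once on the held-out test set.
    resid = y_test - search.predict(X_test)
    test_rmse = float(np.sqrt(np.mean(resid ** 2)))
    results[name] = (cv_rmse, test_rmse)

# 4. Print the results in the table layout requested above.
print("| Method | CV RMSE | Test RMSE |")
print("|:------:|:------:|:------:|")
for name, (cv_rmse, test_rmse) in results.items():
    print(f"| {name} | {cv_rmse:.2f} | {test_rmse:.2f} |")
```

Keeping the test set out of the tuning loop is the design choice that matters here: the CV RMSE is used to pick the penalty, and the test RMSE is computed only once per method at the end.
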
## ISL Exercise 5.4.2 (10pts)
## ISL Exercise 5.4.9 (20pts)
## Bonus question (20pts)
Consider a linear regression fit by least squares to a set of training data $(x_1, y_1), \ldots, (x_N, y_N)$ drawn at random from a population. Let $\hat \beta$ be the least squares estimate. Suppose we have some test data $(\tilde{x}_1, \tilde{y}_1), \ldots, (\tilde{x}_M, \tilde{y}_M)$ drawn at random from the same population as the training data. Let $R_{\text{train}}(\beta) = \frac{1}{N} \sum_{i=1}^N (y_i - \beta^T x_i)^2$ and $R_{\text{test}}(\beta) = \frac{1}{M} \sum_{i=1}^M (\tilde{y}_i - \beta^T \tilde{x}_i)^2$. Show that
$$
\operatorname{E}[R_{\text{train}}(\hat{\beta})] < \operatorname{E}[R_{\text{test}}(\hat{\beta})].
$$
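
Before attempting a proof, the claim can be checked numerically. The following is a minimal simulation sketch, not a proof, and everything specific in it is an assumption made for illustration: Python with NumPy, a Gaussian population, no intercept (matching $\beta^T x_i$ above), and the hypothetical sizes `N`, `M`, `p`, and `reps`.

```python
import numpy as np

rng = np.random.default_rng(425)
N, M, p, reps = 20, 20, 5, 2000   # hypothetical sample sizes and replications
beta_true = rng.normal(size=p)    # hypothetical population coefficients


def draw(n):
    """Draw n (x, y) pairs from the assumed Gaussian linear population."""
    X = rng.normal(size=(n, p))
    y = X @ beta_true + rng.normal(size=n)
    return X, y


train_err = np.empty(reps)
test_err = np.empty(reps)
for r in range(reps):
    X, y = draw(N)            # training data
    X_new, y_new = draw(M)    # independent test data, same population
    # Least squares estimate computed from the training data only.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    train_err[r] = np.mean((y - X @ beta_hat) ** 2)
    test_err[r] = np.mean((y_new - X_new @ beta_hat) ** 2)

# Monte Carlo averages approximate the two expectations in the inequality.
print(f"average training error: {train_err.mean():.3f}")
print(f"average test error:     {test_err.mean():.3f}")
```

The averages over replications estimate the two expectations; the comparison is only meaningful on average, since a single replication can have a test error below its training error.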