---
title: Summary of linear regression
subtitle: Biostat/Biomath M257
author: Dr. Hua Zhou @ UCLA
date: today
format:
  html:
    theme: cosmo
    embed-resources: true
    number-sections: true
    toc: true
    toc-depth: 4
    toc-location: left
    code-fold: false
jupyter:
  jupytext:
    formats: 'ipynb,qmd'
    text_representation:
      extension: .qmd
      format_name: quarto
      format_version: '1.0'
      jupytext_version: 1.14.5
  kernelspec:
    display_name: Julia (8 threads) 1.8.5
    language: julia
    name: julia-_8-threads_-1.8
---

System information (for reproducibility):

```{julia}
versioninfo()
```

Load packages:

```{julia}
using Pkg

Pkg.activate(pwd())
Pkg.instantiate()
Pkg.status()
```

## Comparing methods for linear regression

Methods for solving linear regression $\widehat \beta = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$:

| Method            | Flops                  | Remarks                 | Software | Stability   |
| :---------------: | :--------------------: | :---------------------: | :------: | :---------: |
| Sweep             | $np^2 + p^3$           | $(X^TX)^{-1}$ available | SAS      | less stable |
| Cholesky          | $np^2 + p^3/3$         |                         |          | less stable |
| QR by Householder | $2np^2 - (2/3)p^3$     |                         | R        | stable      |
| QR by MGS         | $2np^2$                | $Q_1$ available         |          | stable      | 
| QR by SVD         | $4n^2p + 8np^2 + 9p^3$ | $X = UDV^T$             |          | most stable |  

Remarks:

1. When $n \gg p$, sweep and Cholesky are twice faster than QR and need less space.  
2. Sweep and Cholesky are based on the **Gram matrix** $\mathbf{X}^T \mathbf{X}$, which can be dynamically updated with incoming data. They can handle huge $n$, moderate $p$ data sets that cannot fit into memory.  
3. QR methods are more stable and produce numerically more accurate solution.  
4. Although sweep is slower than Cholesky, it yields standard errors and so on.  
5. MGS appears slower than Householder, but it yields $\mathbf{Q}_1$.

> **There is simply no such thing as a universal 'gold standard' when it comes to algorithms.**

## Benchmark

```{julia}
using SweepOperator, BenchmarkTools, LinearAlgebra

linreg_cholesky(y::Vector, X::Matrix) = cholesky!(X'X) \ (X'y)

linreg_qr(y::Vector, X::Matrix) = X \ y

function linreg_sweep(y::Vector, X::Matrix)
    p = size(X, 2)
    xy = [X y]
    tableau = xy'xy
    sweep!(tableau, 1:p)
    return tableau[1:p, end]
end

function linreg_svd(y::Vector, X::Matrix)
    xsvd = svd(X)
    return xsvd.V * ((xsvd.U'y) ./ xsvd.S)
end
```

```{julia}
using Random

Random.seed!(123) # seed

n, p = 10, 3
X = randn(n, p)
y = randn(n)

# check these methods give same answer
@show linreg_cholesky(y, X)
@show linreg_qr(y, X)
@show linreg_sweep(y, X)
@show linreg_svd(y, X);
```

```{julia}
n, p = 1000, 300
X = randn(n, p)
y = randn(n)

@benchmark linreg_cholesky(y, X)
```

```{julia}
@benchmark linreg_sweep(y, X)
```

```{julia}
@benchmark linreg_qr(y, X)
```

```{julia}
@benchmark linreg_svd(y, X)
```