---
title: "Week 10: Activity"
format: gfm
editor: source
editor_options: 
  chunk_output_type: console
---


### Last Week's Recap

- GPs in 2D
- Bayesian inference with `stan`


### This week

- Fitting GP models


```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(tidyverse)
library(rstan)
library(mnormt)
library(shinystan)
library(plgp)
library(reshape2)
options(mc.cores = parallel::detectCores())
set.seed(03112025)
```



#### Conditional Normal distribution
Continuing the theme from last week, now there is one more location that we are interested in learning the temperature, maybe Rendezvous Ski Trails in West Yellowstone.

Let's assume that 
$$\begin{bmatrix} y_{bridger}\\ y_{big sky}\\ y_{rendezvous} \end{bmatrix} \sim N(\begin{bmatrix} 15 \\ 15 \\ 15\end{bmatrix}, 100\begin{bmatrix} 1 & .3 & .2 \\ .3 & 1 & .5 \\ .2 & .5 & 1
\end{bmatrix})$$

###### 1. Simulate one data point for Bridger and Big Sky

Simulate a single realization from the distribution for Bridger and Big Sky (ignoring Rendezvous for now).



###### 2. Estimate Rendezvous, conditional on the data point from Bridger and Big Sky



#### GP in 1D

Recall our simulated Gaussian process in 1D

```{r}
phi <- 1
sigmasq <- 1
n <- 50
x <- seq(0, 10, length.out = n)
d <- sqrt(plgp::distance(x))
eps <- sqrt(.Machine$double.eps) 
H <- exp(-d/phi) + diag(eps, n) 
y <- rmnorm(1, rep(0,n),sigmasq * H)
tibble(y = y, x = x) %>% ggplot(aes(y=y, x=x)) +
  theme_bw() + ggtitle('Random realization of a GP with phi = 1 and sigmasq = 1') +
  geom_point(size = .5)
```

We have simulated y ~ N(mu, sigmasq * H(phi)), where H(phi) is a correlation matrix from exp(-d/phi). 

##### STAN CODE 

Let's first explore stan code to estimate phi, sigmasq and mu

```
data {
  int<lower=0> N; // number of data points
  vector[N] y; // responds
  matrix[N,N] dist; // distance matrix
}

parameters {
  real<lower = 0.5, upper = 9.8> phi;
  real<lower = 0> sigmasq;
  real mu;
}

transformed parameters{
  vector[N] mu_vec;
  corr_matrix[N] Sigma;
  
  for(i in 1:N) mu_vec[i] = mu;
  for(i in 1:(N-1)){
   for(j in (i+1):N){
     Sigma[i,j] = exp((-1)*dist[i,j]/ phi);
     Sigma[j,i] = Sigma[i,j];
   }
 }
 
 for(i in 1:N) Sigma[i,i] = 1;

}

model {
  y ~ multi_normal(mu_vec ,sigmasq * Sigma);
  phi ~ inv_gamma(10,10);
  sigmasq ~ inv_gamma(10,10);
  mu ~ normal(0, 10);
}
```

Fit and summarize this model.



#### GP regression in 1D

Now add a covariate

```{r}
phi <- 1
sigmasq <- 1
n <- 50
x <- seq(0, 10, length.out = n)
beta <- 1
d <- sqrt(plgp::distance(x))
eps <- sqrt(.Machine$double.eps) 
H <- exp(-d/phi) + diag(eps, n) 
y <- rmnorm(1, x * beta,sigmasq * H)
reg_fig <- tibble(y = y, x = x) %>% ggplot(aes(y=y, x=x)) +
  theme_bw() + ggtitle('Random realization of a GP with phi = 1 and sigmasq = 1') +
  geom_point(size = .5)
reg_fig
```

We have simulated y ~ N(mu, sigmasq * H(phi)), where H(phi) is a correlation matrix from exp(-d/phi). 

##### STAN CODE 

```
data {
  int<lower=0> N; // number of data points
  vector[N] y; // response
  matrix[N,N] dist; // distance matrix
  vector[N] x; // covariate
}

parameters {
  real<lower = 0.5, upper = 9.8> phi;
  real<lower = 0> sigmasq;
  real beta;
}

transformed parameters{
  vector[N] mu_vec;
  corr_matrix[N] Sigma;
  
  for(i in 1:N) mu_vec[i] = x[i] * beta;
  for(i in 1:(N-1)){
   for(j in (i+1):N){
     Sigma[i,j] = exp((-1)*dist[i,j]/ phi);
     Sigma[j,i] = Sigma[i,j];
   }
 }
 for(i in 1:N) Sigma[i,i] = 1;

}

model {
  y ~ multi_normal(mu_vec ,sigmasq * Sigma);
  phi ~ inv_gamma(10,10);
  sigmasq ~ inv_gamma(10,10);
  beta ~ normal(0, 10);
}

```

Fit and summarize this model.


##### Making Predictions

For today, consider "plug in" estimates of phi, mu, and sigmasq. To make predictions from -1 to 11. Include both a mean and some measure of uncertainty.