---
title: "Other Covariance Functions and Anisotropy"
output: pdf_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
knitr::opts_chunk$set(message = FALSE)
library(dplyr)
library(ggplot2)
library(knitr)
```


## Covariance Functions
Given the assumption that a Gaussian process is reasonable for the spatial process, a valid covariance function needs to be specified.

\vfill

Up to this point, we have largely worked with isotropic covariance functions. In particular, the exponential covariance functions has primarily been used.

\vfill

A valid covariance function $C(\boldsymbol{h})$, defined as $Cov(Y(\boldsymbol{s}),Y(\boldsymbol{s+h}))$, for any finite set of sites $\boldsymbol{s_1}, \dots, \boldsymbol{s_n}$ and $a_1, \dots, a_n$ should satisfy
$$Var\left[\sum_i a_i Y(\boldsymbol{s_i})\right]= \sum_{i,j} a_i a_j Cov(Y(\boldsymbol{s_i}),Y(\boldsymbol{s_j})) = \sum_{i,j} a_i a_j C(\boldsymbol{s_i}-\boldsymbol{s_j})\geq 0$$
with strict inequality if all the $a_i$ are not zero.
\vfill

In other words, $C(\boldsymbol{h})$ needs to be a positive definite function, which includes the following properties

\vfill

\vfill

## Constructing covariance functions

There are three approaches for building correlation functions. For all cases let $C_1, \dots, C_m$ be valid correlation functions:

\vfill

1. *Mixing:* 

\vfill

2. *Products:* 

\vfill

3. *Convolution:* 

\vfill

\newpage

## Smoothness
Many one-parameter isotropic covariance functions will be quite similar. Another consideration for choosing the correlation function is the

\vfill

The Matern class of covariance functions contains a parameter, $\nu,$ to control smoothness. 


\vfill

"Expressed in a different way, use of the Matern covariance function as a model enables the data to inform about $\nu$; we can learn about process smoothness despite observing the process at only a finite number of locations."

\vfill

## Matern Covariance Specification
The Matern covariance function is written as 
$$\frac{\sigma^2}{2^{\nu -1}\Gamma(\nu)} (\phi d)^\nu K_\nu(\phi d),$$
where $\Gamma()$ is a gamma function and $K_\nu$ is the modified Bessel function of order $\nu$.


\newpage

## Anisotropy

Anisotropy means that the covariance function is not just a function of the distance $||h||$,


\vfill

Geometric anisotropy refers to the case where the coordinate space is anisotropic, but can be transformed to an isotropic space.

\vfill

If the differences in spatial structure are directly related to two coordinate sets (lat and long), we can create a stationary, anistropic covariance function

\vfill

Let $$cor(Y(\boldsymbol{s + h}), Y(\boldsymbol{s})) = \rho_1(h_y) \rho_2(h_x),$$
where $\rho_1()$ and $\rho_2()$ are proper correlation functions.

\vfill

In general consider the correlation function,
$$\rho(\boldsymbol{h}; \phi) = \phi_0(||L\boldsymbol{h}||; \phi)$$

\vfill

Let $\boldsymbol{Y}(\boldsymbol{s}) = \mu(\boldsymbol{s}) + w(\boldsymbol{s}) + \epsilon(\boldsymbol{s})$, and $\boldsymbol{Y}(\boldsymbol{s}) \sim N(\mu(\boldsymbol{s}), \Sigma(\tau^2, \sigma^2, \phi, B))$, where $B = L^T L$.

\vfill

The covariance matrix is defined as $\Sigma(\tau^2, \sigma^2, \phi, B)) = \tau^2 I + \sigma^2 H((\boldsymbol{h}^T B \boldsymbol{h})^{\frac{1}{2}}),$ where $H((\boldsymbol{h}^T B \boldsymbol{h})^{\frac{1}{2}})$ has entries of $\rho((\boldsymbol{h_{ij}}^T B \boldsymbol{h_{ij}})^{\frac{1}{2}})$ with $\rho()$ being a valid covariance function, typically including $\phi$ and $\boldsymbol{h_{ij}} = \boldsymbol{s_i} - \boldsymbol{s_j}$.

\vfill

\vfill


\vfill

\newpage


## Geometric Anisotropy Visual
- Consider four points positioned on a unit circle.
```{r, fig.width=4, fig.height = 4, fig.align = 'center'}
x = c(-1, 0, 0, 1)
y = c(0, -1, 1, 0)
gg_circle <- function(r, xc, yc, color="black", fill=NA, ...) {
    x <- xc + r*cos(seq(0, pi, length.out=100))
    ymax <- yc + r*sin(seq(0, pi, length.out=100))
    ymin <- yc + r*sin(seq(0, -pi, length.out=100))
    annotate("ribbon", x=x, ymin=ymin, ymax=ymax, color=color, fill=fill, ...)
}

data.frame(x=x, y=y) %>% ggplot(aes(x=x,y=y))  + gg_circle(r=1, xc=0, yc=0, color = 'gray') + geom_point(shape = c('1','2','3','4'), size=5) + theme_minimal()

```

\vfill

Now consider a set of correlation functions. For each, calculate the correlation matrix and discuss the impact of $B$ on the correlation. Furthermore, how does B change the geometry of the correlation between points 1, 2, 3, and 4?

\vfill

1. $\rho() = \exp(-(\boldsymbol{h_{ij}}^T B \boldsymbol{h_{ij}})^{\frac{1}{2}}),$ where $B = \begin{pmatrix}
1 & 0 \\
0 & 1 \\
\end{pmatrix}$

\vfill

2. $\rho() = \exp(-(\boldsymbol{h_{ij}}^T B \boldsymbol{h_{ij}})^{\frac{1}{2}}),$ where $B = \begin{pmatrix}
2 & 0 \\
0 & 1 \\
\end{pmatrix}$

\vfill

3. $\rho() = \exp(-(\boldsymbol{h_{ij}}^T B \boldsymbol{h_{ij}})^{\frac{1}{2}}),$ where $B = \begin{pmatrix}
3 & 1 \\
1 & 1 \\
\end{pmatrix}$

\vfill

\newpage


```{r}
h.x <- matrix(0, 4, 4)
h.y <- matrix(0, 4, 4)
for (i in 1:4){
  for (j in 1:4){
    h.x[i,j] <- x[i] - x[j]
    h.y[i,j] <- y[i] - y[j]
  }
}
```

1. $\rho() = \exp(-(\boldsymbol{h_{ij}}^T I \boldsymbol{h_{ij}})^{\frac{1}{2}})$
```{r}

cor.mat <- matrix(0, 4, 4)
for (i in 1:4){
  for (j in 1:4){
    cor.mat[i,j] <- exp(- sqrt(t(c(h.x[i,j], h.y[i,j])) %*% diag(2)  %*% (c(h.x[i,j], h.y[i,j]))) )
  }
}
cor.mat %>% kable(digits = 3)


```

\vfill
2. $\rho() = \exp(-(\boldsymbol{h_{ij}}^T B \boldsymbol{h_{ij}})^{\frac{1}{2}}),$ where $B = \begin{pmatrix}
2 & 0 \\
0 & 1 \\
\end{pmatrix}$
```{r}
cor.mat <- matrix(0, 4, 4)
for (i in 1:4){
  for (j in 1:4){
    cor.mat[i,j] <- exp(- sqrt(t(c(h.x[i,j], h.y[i,j])) %*% matrix(c(2,0,0,1),2,2)  %*% (c(h.x[i,j], h.y[i,j]))) )
  }
}
cor.mat %>% kable(digits = 3)
```

\vfill

3. $\rho() = \exp(-(\boldsymbol{h_{ij}}^T B \boldsymbol{h_{ij}})^{\frac{1}{2}}),$ where $B = \begin{pmatrix}
3 & 1 \\
1 & 1 \\
\end{pmatrix}$
```{r}
cor.mat <- matrix(0, 4, 4)
for (i in 1:4){
  for (j in 1:4){
    cor.mat[i,j] <- exp(- sqrt(t(c(h.x[i,j], h.y[i,j])) %*% matrix(c(3,1,1,1),2,2)  %*% (c(h.x[i,j], h.y[i,j]))) )
  }
}
cor.mat %>% kable(digits = 3)
```

\vfill

The (effective) range for any angle $\eta$ is determined by the equation
$$\rho(r_\eta(\tilde{\boldsymbol{h}}_{\eta}^T B \tilde{\boldsymbol{h}}_{\eta}^T)^{\frac{1}{2}}) = .05,$$
where $\tilde{\boldsymbol{h}}_{\eta}$ is a unit vector in the direction $\eta$.

\vfill
Okay, so if we suspect that geometric anisotrophy is present, how do we fit the model? That is, what is necessary in estimating this model?
\vfill

\vfill

\newpage


## Sill, Nugget, and Range Anisotropy

Recall the sill is defined as $\lim_{d \rightarrow \infty} \gamma(d)$
\vfill

Let $\boldsymbol{h}$ be an arbitrary separation vector, that can be normalized as $\frac{\boldsymbol{h}}{||\boldsymbol{h}||}$
\vfill

If $\lim_{a \rightarrow \infty} \gamma(a \times \frac{\boldsymbol{h}}{||\boldsymbol{h}||})$ depends on $\boldsymbol{h}$, this is referred to as sill anisotropy.

\vfill

Similarly the nugget and range can depend on $\boldsymbol{h}$ and give nugget anisotropy and range anisotropy