---
title: Hacking GLMs to fit population growth models
author: Eric R. Scott
date: '2020-02-24'
slug: hacking-glms
categories:
  - Blog
tags:
  - R
  - modeling
draft: no
math: yes
image:
  caption: ""
  focal_point: ""
---

I'm currently teaching Ecological Statistics and Data, a class I inherited from [Lee Brown](http://ase.tufts.edu/biology/labs/crone/cronies) and [Elizabeth Crone](https://ase.tufts.edu/biology/faculty/crone/). In a lecture on population dynamics, they do some really cool things with generalized linear model---things that I don't think are standard practice and as far as I can tell from googling, aren't well documented. And let me tell you, I did a **lot** of googling to make sure I understood this stuff before teaching it. So, I thought I'd put it up on the blog for others.

# Data

I'll be using data on the Northern Rocky Mountain grey wolf population.  You can read more about the history of these wolves [here](https://www.justice.gov/enrd/northern-rocky-mountain-gray-wolves).

```{r message=FALSE}
library(tidyverse)
wolves <- read_csv("NRMwolves.csv") %>%
  mutate(year_post = year - 1982) 
head(wolves)
```


```{r echo=FALSE}
wolfplot <-
  ggplot(wolves, aes(x = year, y = num.wolves)) +
  geom_point() 
wolfplot
```

# Exponential growth

Exponential growth describes unregulated reproduction and is described by the equation:

$$
N_{T} = \lambda^TN_0
$$

where $\lambda$ is the population growth rate, $T$ is a number of time steps (e.g. years) and $N_0$ is the population at some initial time.


## Hacking an exponential growth rate GLM

We can take advantage of a log-link to linearize this equation:

$$
log(N_T) = log(N_0) + log(\lambda)\times T
$$

Compare to a generic GLM equation with a log-link:

$$
log(y)= \beta_0 + \beta_1x_1
$$

Here's the glm for an exponential growth model fit to the wolf data:

```{r}
m_exp <- glm(num.wolves ~ year_post, 
             family = poisson(link = "log"), data = wolves)
exp(coef(m_exp))
```

The backtransformed intercept is the estimate for $N_0$, the estimated number of wolves at `year_post` = 0

The backtransformed coefficient for `year_post` = $\lambda$, the population growth rate

```{r echo=FALSE}
wolfplot +
  geom_line(aes(y = predict(m_exp, type = "response")), color = "darkgreen")
```

# Process error

The exponential growth model fit above is an observation error model.  It assumes variation from predicted values are due to inaccuracies in estimating the number of wolves.

A process error model estimates a population growth that depends on current population size.  This can be modeled as a rate, $N_{t+1}/N_{t}$.

## Hacking a process error GLM

Again, we can use a log-link to linearize this:

$$
log(N_{t+1}/N_{t}) = \beta_0
$$

$$
log(N_{t+1}) +log(N_{t}) = \beta_0
$$
$$
log(N_{t+1}) = \beta_0 + log(N_{t})
$$

The $log(N_t)$ term, which has no coefficient associated with it, is an **offset**.  We can hack a glm to fit this model like so:

```{r}
wolves2 <- wolves %>%
  mutate(num.prev = lag(num.wolves)) %>% #create a column of lagged wolf numbers
  filter(!is.na(num.prev))

m_process <- glm(num.wolves ~ 1, offset = log(num.prev),
                 family = poisson(link = "log"), data = wolves2)
```

The backtransformed intercept is the yearly rate of increase ($N_{t+1}/N_t$)

```{r}
exp(coef(m_process))
```

$$
N_{t+1} = e^{\beta_0} \times N_{t}
$$

So, if there are 13 wolves in 1985, how many would it predict in 1986?

```{r}
1.108238 * 13
```

```{r echo=FALSE}
ggplot(wolves2, aes(x = year, y = num.wolves)) +
  geom_point() +
  geom_line(aes(y = predict(m_process, type = "response")), color = "orange")
```

# Ricker model

Finally, the most complicated, possibly mind blowing example of hacking a GLM.  This one took me quite a while to wrap my head around.

A Ricker model takes carrying capacity into account and allows growth rate to change as the population increases. It approximates logistic growth.

$$
N_{t+1}=N_{t}e^{r\left(1-{\frac {N_{t}}{K}}\right)}
$$
where $r = ln(\lambda)$ and $K$ is the carrying capacity

## Hacking a Ricker model GLM

Linearizing using a log-link (please tell me if I got the math wrong in the comments):


$$
log(N_{t+1})=log(N_{t}) + log\left( (e^r)^{\left(1-{\frac {N_{t}}{K}}\right)}\right)
$$

$$
log(N_{t+1})=log(N_{t}) + log (e^r)\times{\left(1-{\frac {N_{t}}{K}}\right)}
$$

$$
log(N_{t+1})= r-\frac{r}{K}N_{t} + \textrm{offset}[log(N_{t})]
$$

We can model this with the following GLM:

```{r}
m_rick <- glm(num.wolves ~ num.prev, offset = log(num.prev),
              family = poisson, data = wolves2)
```


$r = \beta_0$

```{r}
coef(m_rick)[1]
```

$\beta_1 = -r/K$

```{r}
coef(m_rick)[2]
```

Which means that $K = -\beta_0/\beta_1$

```{r}
-coef(m_rick)[1]/coef(m_rick)[2]
```

```{r}
emo::ji("exploding_head")
```

```{r echo=FALSE}
ggplot(wolves2, aes(x = year, y = num.wolves)) +
  geom_point() +
  geom_line(aes(y = predict(m_rick, type = "response")), color = "brown")
```

- [Download .Rmd file](https://raw.githubusercontent.com/Aariq/website-v2/master/content/post/2020-02-24-hacking-glms/index.Rmd)
- [Download data](https://raw.githubusercontent.com/Aariq/website-v2/master/content/post/2020-02-24-hacking-glms/NRMwolves.csv)