What is a density surface model?
================================
css: custom.css
transition: none

```{r setup, include=FALSE}
# setup
library(knitr)
library(magrittr)
library(viridis)
opts_chunk$set(cache=TRUE, echo=FALSE, out.width='\\textwidth',
               fig.height=10, fig.width=10)
```

Why model abundance spatially?
==============================

- Use non-designed surveys
- Use environmental information
- Maps

Back to Horvitz-Thompson estimation
===================================
type:section

Horvitz-Thompson-like estimators
================================

- Rescale the (flat) density and extrapolate

$$
\hat{N} = \frac{\text{study area}}{\text{covered area}}\sum_{i=1}^n \frac{s_i}{\hat{p}_i}
$$

- $s_i$ are group/cluster sizes
- $\hat{p}_i$ is the detection probability (from the detection function)

Hidden in this formula is a simple assumption
=============================================

- The probability of sampling every point in the study area is equal
- Is this true? Sometimes.
- If (and only if) the design is randomised

Many faces of randomisation
===========================

```{r randomisation, fig.width=14, fig.height=4.5, out.width='\\textwidth'}
set.seed(12133)
par(mfrow=c(1, 3), cex.main=2.5)

# true random sample
plot(c(0, 1), c(0, 1), type="n", xlab="", ylab="", axes=FALSE, asp=1,
     main="random placement")
dat <- data.frame(x=runif(10), y=runif(10))
angle <- runif(10, 0, 2*pi)
len <- 0.2
arrows(dat$x, dat$y, dat$x + len*cos(angle), dat$y + len*sin(angle), length=0)
dat <- data.frame(x=runif(10), y=runif(10))
angle <- runif(10, 0, 2*pi)
len <- 0.2
arrows(dat$x, dat$y, dat$x + len*cos(angle), dat$y + len*sin(angle), length=0,
       col="grey40", lty=2)
box()

# parallel lines with random offset
plot(c(0, 1), c(0, 1), type="n", xlab="", ylab="", axes=FALSE, asp=1,
     main="random offset parallel lines")
abline(v=seq(0, 1, len=10))
abline(v=seq(0, 1, len=10) + 0.07, col="grey40", lty=2)
box()

# random offset zigzag
## make a zigzag
n_segs <- 10
zz <- data.frame(x = c(seq(0, 0.5, len=n_segs), seq(0.5,
                       1, len=n_segs)),
                 y = c(seq(0, 1, len=n_segs), seq(1, 0, len=n_segs)))
# many zigzags
mzz <- rbind(zz, zz, zz)
mzz$x <- mzz$x/3
ind <- 1:nrow(zz)
mzz$x[ind + nrow(zz)] <- mzz$x[ind + nrow(zz)] + 1/3
mzz$x[ind + 2*nrow(zz)] <- mzz$x[ind + 2*nrow(zz)] + 2/3
plot(mzz, type="l", xlab="", ylab="", axes=FALSE, asp=1,
     main="random offset zigzag")
lines(mzz$x + 0.06, mzz$y, col="grey40", lty=2)
box()
```

Randomisation & coverage probability
=====================================

- The H-T equation above assumes even coverage
- (or you can estimate coverage probability)

Extra information
========================================================

```{r loadtracks, results="hide"}
library(rgdal)
tracksEN <- readOGR("../spermwhaledata/rawdata/Analysis.gdb", "EN_Trackline1")
tracksGU <- readOGR("../spermwhaledata/rawdata/Analysis.gdb", "GU_Trackline")
```

```{r plottracks, cache=FALSE}
library(ggplot2)
tracksEN <- fortify(tracksEN)
tracksGU <- fortify(tracksGU)
mapdata <- map_data("world2", "usa")
p_maptr <- ggplot() +
  geom_path(aes(x=long, y=lat, group=group), colour="red", data=tracksEN) +
  geom_path(aes(x=long, y=lat, group=group), colour="blue", data=tracksGU) +
  geom_polygon(aes(x=long, y=lat, group=group), data=mapdata) +
  theme_minimal() +
  coord_map(xlim=range(tracksEN$long, tracksGU$long) + c(-1, 1),
            ylim=range(tracksEN$lat, tracksGU$lat) + c(-1, 1))
print(p_maptr)
```

Extra information - depth
==========================

```{r loadcovars, results="hide"}
library(raster)
predictorStack <- stack(c("../spermwhaledata/rawdata/Covariates_for_Study_Area/Depth.img",
                          "../spermwhaledata/rawdata/Covariates_for_Study_Area/GLOB/CMC/CMC0.2deg/analysed_sst/2004/20040601-CMC-L4SSTfnd-GLOB-v02-fv02.0-CMC0.2deg-analysed_sst.img",
                          "../spermwhaledata/rawdata/Covariates_for_Study_Area/VGPM/Rasters/vgpm.2004153.hdf.gz.img"))
names(predictorStack) <- c("Depth", "SST", "NPP")
```

```{r plotdepth}
load("../spermwhaledata/R_import/spermwhale.RData")
depthdat <- as.data.frame(predictorStack[[1]], xy=TRUE)
depthdat <- depthdat[!is.na(depthdat$Depth), ]
library(plyr)
plotobs <- join(obs, segs, by="Sample.Label")
p <- ggplot() +
  geom_tile(aes(x=x, y=y, fill=Depth), data=depthdat) +
  geom_point(aes(x=x, y=y, size=size), alpha=0.6, data=plotobs) +
  coord_equal() +
  scale_fill_viridis() +
  theme_minimal()
print(p)
```

Extra information - depth
==========================

```{r plotdepth-notspat, fig.width=10}
p <- ggplot(plotobs) +
  geom_histogram(aes(Depth, weight=size)) +
  xlab("Depth") +
  ylab("Aggregated counts") +
  theme_minimal()
print(p)
```

***

- NB this only shows segments where counts > 0

Extra information - SST
============================================

```{r plotsst, fig.width=10}
sstdat <- as.data.frame(predictorStack[[2]], xy=TRUE)
sstdat <- sstdat[!is.na(sstdat$SST), ]
p <- ggplot() +
  geom_tile(aes(x=x, y=y, fill=SST), data=sstdat) +
  geom_point(aes(x=x, y=y, size=size), alpha=0.6, data=plotobs) +
  coord_equal() +
  scale_fill_viridis() +
  theme_minimal()
print(p)
```

Extra information - SST
============================================

```{r plotsst-notspat, fig.width=10}
p <- ggplot(plotobs) +
  geom_histogram(aes(SST, weight=size), binwidth=1) +
  xlab("SST") +
  ylab("Aggregated counts") +
  theme_minimal()
print(p)
```

***

- (only segments where counts > 0)

You should model that
=====================
type:section

Modelling outputs
===================

- Abundance and uncertainty
- Arbitrary areas
- Numeric values
- Maps
- Extrapolation (with caution!)
- Covariate effects
  - count/sample as a function of covariates

Modelling requirements
========================

- Include detectability
- Account for effort
- Flexible/interpretable effects
- Predictions over an arbitrary area

Accounting for effort
======================
type:section

Effort
==========

```{r tracks2, fig.width=10}
print(p_maptr)
```

***

- Have transects
- Variation in counts and covars along them
- Want a sample unit with minimal variation
- "Segments": chunks of effort

Chopping up transects
======================

[Physeter catodon by Noah Schlottman](http://phylopic.org/image/dc76cbdb-dba5-4d8f-8cf3-809515c30dbd/)

Flexible, interpretable effects
================================
type:section

Smooth response
================

```{r plotsmooths, message=FALSE}
library(Distance)
library(dsm)
df <- ds(dist, truncation=6000)
dsm_tw_xy_depth <- dsm(count ~ s(x, y) + s(Depth),
                       ddf.obj=df,
                       observation.data=obs, segment.data=segs,
                       family=tw())
plot(dsm_tw_xy_depth, select=2)
```

Explicit spatial effects
============================

```{r plot-spat-smooths, message=FALSE}
vis.gam(dsm_tw_xy_depth, view=c("x", "y"), plot.type="contour",
        main="", asp=1, too.far=0.06)
```

Predictions
================================
type:section

Predictions over an arbitrary area
===================================

```{r predplot}
predgrid$Nhat <- predict(dsm_tw_xy_depth, predgrid)
p <- ggplot(predgrid) +
  geom_tile(aes(x=x, y=y, fill=Nhat, width=10*1000, height=10*1000)) +
  coord_equal() +
  labs(fill="Density") +
  scale_fill_viridis() +
  theme_minimal()
print(p)
```

***

- Don't want to be restricted to predicting on segments
- Predict within the survey area
- Extrapolate outside it (with caution)
- Work on a grid of cells

Detection information
================================
type:section

Including detection information
=================================

- Two options:
  - adjust areas to account for **effective effort**
  - use **Horvitz-Thompson estimates** as the response

Effective effort
================

- Area of each segment, $A_j$
  - use $A_j\hat{p}_j$
  - think effective strip width ($\hat{\mu} = w\hat{p}$)
- Response is counts per segment
- "Adjusting for effort"
- "Count model"

Estimated abundance
===================

- Estimate H-T abundance per segment
- Effort is the area of each segment
- "Estimated abundance" per segment

$$
\hat{n}_j = \sum_{i \text{ in segment } j} \frac{s_i}{\hat{p}_i}
$$

Detectability and covariates
=============================

- 2 covariate "levels" in the detection function
  - "Observer"/"observation" -- change **within** segment
  - "Segment" -- change **between** segments
- "Count model" only lets us use segment-level covariates
- "Estimated abundance" lets us use either

When to use each approach?
============================

- Generally "nicer" to adjust effort
- Keeps the response (counts) close to what was observed
- **Unless** you want observation-level covariates
- These *can* make a big difference!

Availability, perception bias and more
======================================

- $\hat{p}$ is not always simple!
- Availability & perception bias enter somehow
- We can make explicit models for these
- More later in the course

DSM flow diagram
==================

DSM process flow diagram

Spatial models
===============
type:section

Abundance as a function of covariates
=======================================

- Two approaches to model abundance
  - Explicit spatial models
    - When: good coverage, fixed area
  - "Habitat" models (no explicit spatial terms)
    - When: poorer coverage, extrapolation
- We'll cover both approaches here

Data requirements
=====================================
type:section

What do we need?
===================

- Need to "link" the data
  - Distance data/detection function
  - Segment data
  - Observation data to link segments to detections

Example of spatial data in QGIS
=====================================
type:section

Recap
======

- Model counts or estimated abundance
- The effort is accounted for differently in each case
- Flexible models are good
- Incorporate detectability
- 2 tables + detection function needed
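
Worked example: H-T estimate
=============================

A minimal numerical sketch of the Horvitz-Thompson-like estimator from earlier; all values below (group sizes, detection probabilities, areas) are invented for illustration, not taken from the sperm whale data:

```{r ht-example, echo=TRUE}
# invented example values (not from the sperm whale data)
s <- c(2, 1, 3)            # group sizes s_i
p_hat <- c(0.8, 0.5, 0.6)  # estimated detection probabilities p_hat_i
study_area <- 1000         # total study area (e.g. km^2)
covered_area <- 200        # area actually covered by the transects
# N_hat = (study area / covered area) * sum(s_i / p_hat_i)
N_hat <- (study_area/covered_area) * sum(s/p_hat)
N_hat
# gives 5 * 9.5 = 47.5
```

Scaling by the ratio of areas is exactly where the even-coverage assumption enters.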
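
Worked example: per-segment abundance
=====================================

The "estimated abundance" response can be sketched the same way, applying the H-T sum within each segment; again, all numbers are invented for illustration:

```{r ht-segments, echo=TRUE}
# invented detections: the segment each group fell in,
# group size s_i and detection probability p_hat_i
seg <- c("A", "A", "B", "C")
s <- c(2, 1, 3, 1)
p_hat <- c(0.8, 0.5, 0.6, 1.0)
# n_hat_j = sum over i in segment j of s_i / p_hat_i
n_hat <- tapply(s/p_hat, seg, sum)
n_hat
# A: 2/0.8 + 1/0.5 = 4.5; B: 3/0.6 = 5; C: 1/1 = 1
```

These per-segment $\hat{n}_j$ values would then be the response in the spatial model.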