---
title: 'Practical 5: Estimating variance'
author: David L Miller
css: custom.css
---

Here is a "solution" for practical 5. As with any data analysis, there is no correct answer, but this shows how I would approach this analysis. The analysis here is conditional on the detection function and DSM selected in the previous exercises; I've shown a variety of models selected in the previous solutions to show the differences between models. Much of the text below is as in the exercise itself, so it should be relatively easy to navigate.
Additional text and code are highlighted using boxes like this.
Now we've fitted some models and estimated abundance, we can estimate the variance associated with the abundance estimate (and plot it).

# Load packages and data

```{r load-packages}
library(dsm)
library(raster)
library(ggplot2)
library(viridis)
library(plyr)
library(knitr)
library(rgdal)
```

Load the models and prediction grid:

```{r load-models}
load("dsms.RData")
load("dsms-xy.RData")
load("predgrid.RData")
```

# Estimation of variance

Depending on the model response (count or Horvitz-Thompson estimated abundance) we can use either `dsm.var.prop` or `dsm.var.gam`, respectively. `dsm_nb_xy_ms` doesn't include any covariates at the observer level in the detection function, so we can use the variance propagation method, which incorporates the uncertainty in the detection function parameters into the spatial model in one step.

```{r varest}
# need to remove the NAs as we did when plotting
predgrid_var <- predgrid[!is.na(predgrid$Depth), ]
# now estimate variance
var_nb_xy_ms <- dsm.var.prop(dsm_nb_xy_ms, predgrid_var,
                             off.set=predgrid_var$off.set)
```

To summarise the results of this variance estimate:

```{r varest-summary}
summary(var_nb_xy_ms)
```

Try this out for some of the other models you've saved. Remember to use `dsm.var.gam` when there are covariates in the detection function and `dsm.var.prop` when there aren't.
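As a sketch (not run here): if one of your saved models used a detection function with observer-level covariates, and was therefore fitted with an estimated abundance response, the call has the same shape, just with `dsm.var.gam`. The model name below is hypothetical; substitute one of your own saved models:

```{r varest-gam, eval=FALSE}
# hypothetical model whose detection function includes an observer-level
# covariate (substitute one of your own saved models here)
var_gam_example <- dsm.var.gam(dsm_nb_xy_ms_obscov, predgrid_var,
                               off.set=predgrid_var$off.set)
summary(var_gam_example)
```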
We'll skip this and go straight to summarising all our models so far in the next section...
# Summarise multiple models

We can again summarise all the models, as we did with the DSMs and detection functions, now including the variance:

```{r summarise-var}
summarize_dsm_var <- function(model, predgrid){

  summ <- summary(model)

  vp <- summary(dsm.var.prop(model, predgrid, off.set=predgrid$off.set))
  unconditional.cv.square <- vp$cv^2
  asymp.ci.c.term <- exp(1.96*sqrt(log(1+unconditional.cv.square)))
  asymp.tot <- c(vp$pred.est / asymp.ci.c.term,
                 vp$pred.est,
                 vp$pred.est * asymp.ci.c.term)

  data.frame(response = model$family$family,
             terms    = paste(rownames(summ$s.table), collapse=", "),
             AIC      = AIC(model),
             REML     = model$gcv.ubre,
             "Deviance_explained" = paste0(round(summ$dev.expl*100, 2), "%"),
             "lower_CI" = round(asymp.tot[1], 2),
             "Nhat"     = round(asymp.tot[2], 2),
             "upper_CI" = round(asymp.tot[3], 2))
}
```
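The confidence interval in `summarize_dsm_var` is the usual log-normal approximation used in distance sampling: since $\hat{N}$ must be positive, the interval is constructed on the log scale and back-transformed, giving

$$\left(\hat{N}/C,\ \hat{N}\,C\right), \qquad C = \exp\left(1.96\sqrt{\log\left(1+\mathrm{CV}(\hat{N})^2\right)}\right),$$

which is exactly what the `asymp.ci.c.term` lines compute.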
```{r var-table}
# make a list of models
model_list <- list(dsm_nb_xy, dsm_nb_x_y, dsm_nb_xy_ms, dsm_nb_x_y_ms,
                   dsm_tw_xy, dsm_tw_x_y, dsm_tw_xy_ms, dsm_tw_x_y_ms)
# give the list names for the models, so we can identify them later
names(model_list) <- c("dsm_nb_xy", "dsm_nb_x_y", "dsm_nb_xy_ms", "dsm_nb_x_y_ms",
                       "dsm_tw_xy", "dsm_tw_x_y", "dsm_tw_xy_ms", "dsm_tw_x_y_ms")
per_model_var <- ldply(model_list, summarize_dsm_var, predgrid=predgrid_var)
```
```{r print-table}
kable(per_model_var)
```

# Plotting

We can plot a map of the coefficient of variation, but we first need to estimate the variance per prediction cell, rather than over the whole area. This calculation takes a while!

```{r per-cell-var}
# use the split function to make each row of the prediction data.frame into
# an element of a list
predgrid_var_split <- split(predgrid_var, 1:nrow(predgrid_var))
var_split_nb_xy_ms <- dsm.var.prop(dsm_nb_xy_ms, predgrid_var_split,
                                   off.set=predgrid_var$off.set)
```

Now we have the per-cell variance estimates, we can calculate the coefficient of variation, assign it to a column of the prediction grid data and plot as usual:

```{r varest-map-obs}
predgrid_var_map <- predgrid_var
cv <- sqrt(var_split_nb_xy_ms$pred.var)/unlist(var_split_nb_xy_ms$pred)
predgrid_var_map$CV <- cv
p <- ggplot(predgrid_var_map) +
  geom_tile(aes(x=x, y=y, fill=CV, width=10*1000, height=10*1000)) +
  scale_fill_viridis() +
  coord_equal() +
  geom_point(aes(x=x, y=y, size=count),
             data=dsm_nb_xy_ms$data[dsm_nb_xy_ms$data$count>0, ])
print(p)
```

Note that here we overplot the segments where sperm whales were observed (scaling the size of the point according to the number observed), using `geom_point()`. We can also overplot the effort, which can be a useful way to see what is driving the uncertainty: it may be caused by gaps in covariate coverage as well as by lack of effort, but the comparison is still informative. First we need to load the segment data from the `gdb`:

```{r load-effort}
tracks <- readOGR("Analysis.gdb", "Segments")
tracks <- fortify(tracks)
```

We could just add this to the plot object we have built so far (with `+`), but that looks a bit messy with the observations, so let's start from scratch:

```{r varest-map-obs-effort}
p <- ggplot(predgrid_var_map) +
  geom_tile(aes(x=x, y=y, fill=CV, width=10*1000, height=10*1000)) +
  scale_fill_viridis() +
  coord_equal() +
  geom_path(aes(x=long, y=lat, group=group), data=tracks)
print(p)
```

Try this with the other models you fitted and see what the differences are between the maps of coefficient of variation.
We can use a similar technique to the one in the prediction exercises to get coefficient of variation maps for all of the models... **Note this can take a long time!**

```{r predsp-allgen, fig.width=15, fig.height=7}
# make a function that calculates the CVs, adds them to a column named CV,
# adds a column called "model" that stores the model name, then returns the
# data.frame
make_cv_dat <- function(model_name, predgrid_split, predgrid){
  # we use get() here to grab the object with the name of its argument
  var_obj <- dsm.var.prop(get(model_name), predgrid_split,
                          off.set=predgrid$off.set)
  predgrid[["CV"]] <- sqrt(var_obj$pred.var)/unlist(var_obj$pred)
  predgrid[["model"]] <- model_name
  return(predgrid)
}

# apply make_cv_dat to a list of the names of the models; it returns a
# data.frame each time, which ldply (loaded with plyr above) then binds
# together (hence "ld": list -> data.frame)
big_cv <- ldply(list("dsm_nb_xy", "dsm_nb_x_y", "dsm_nb_xy_ms", "dsm_nb_x_y_ms",
                     "dsm_tw_xy", "dsm_tw_x_y", "dsm_tw_xy_ms", "dsm_tw_x_y_ms"),
                make_cv_dat, predgrid_split=predgrid_var_split,
                predgrid=predgrid_var_map)
```

```{r predsp-allplot}
p <- ggplot(big_cv) +
  geom_tile(aes(x=x, y=y, fill=CV, width=10*1000, height=10*1000)) +
  scale_fill_viridis() +
  coord_equal() +
  facet_wrap(~model, nrow=2) +
  geom_path(aes(x=long, y=lat, group=group), data=tracks)
print(p)
```

One issue with producing this kind of plot is that `ggplot2` will use a common legend, so a few cells with high values can distort the scale for everything else. As suggested in the lectures, we'll now use `cut()` to create a categorised version of the CV and plot that instead:

```{r predsp-allplot-cut}
big_cv$CV_cut <- cut(big_cv$CV, c(seq(0, 2, 0.2), 3:6))
p <- ggplot(big_cv) +
  geom_tile(aes(x=x, y=y, fill=CV_cut, width=10*1000, height=10*1000)) +
  scale_fill_viridis(discrete=TRUE) +
  coord_equal() +
  facet_wrap(~model, nrow=2) +
  geom_path(aes(x=long, y=lat, group=group), data=tracks)
print(p)
```

This is a bit easier to read.
# Save the uncertainty maps to raster files

As with the predictions, we'd like to save our uncertainty estimates to a raster layer so we can plot them in ArcGIS. Again, this involves a bit of messing about with the data format before we can save.

```{r savecv-raster}
# set up the storage for the CVs
cv_raster <- raster(predictorStack)
# we removed the NA values to make the predictions, but the raster needs them,
# so make a vector of NAs and insert the CV values at the non-NA positions
cv_na <- rep(NA, nrow(predgrid))
cv_na[!is.na(predgrid$Depth)] <- cv
# put the values into the raster
cv_raster <- setValues(cv_raster, cv_na)
# name the layer
names(cv_raster) <- "CV_nb_xy"
```

We can then save that object to disk as a raster file:

```{r write-raster}
writeRaster(cv_raster, "cv_raster.img", datatype="FLT4S")
```
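As a quick sanity check (a minimal sketch, assuming `cv_raster.img` was written to the working directory), we can read the file back in and plot it to confirm the values survived the round trip:

```{r check-raster, eval=FALSE}
# read the saved raster back in and plot it
cv_check <- raster("cv_raster.img")
plot(cv_check)
```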
# Extra credit

- `dsm.var.prop` and `dsm.var.gam` can accept arbitrary splits in the data, not just whole areas or cells. Make a `list` with two elements: one a `data.frame` of all the cells with $y>0$ and one with $y\leq 0$. Estimate the variance for these regions. Note that you'll need to sum the offsets for each area to get the correct value to supply to `off.set=...`.

Trying a North-South split about $y=0$...

```{r split-var}
# create a list with one data.frame per region
predgrid_var_ns <- list()
predgrid_var_ns[["North"]] <- predgrid_var[predgrid_var$y>0, ]
predgrid_var_ns[["South"]] <- predgrid_var[predgrid_var$y<=0, ]
# sum the offsets per region (in the same order as the list above)
ns_offsets <- c(sum(predgrid_var$off.set[predgrid_var$y>0]),
                sum(predgrid_var$off.set[predgrid_var$y<=0]))
# calculate the variance per region
var_split_ns <- dsm.var.prop(dsm_nb_xy_ms, predgrid_var_ns,
                             off.set=ns_offsets)
```

We can then pull out the CVs:

```{r split-cv}
sqrt(var_split_ns$pred.var)/unlist(var_split_ns$pred)
```

So the northern region has a lower CV than the southern region, which we'd expect given the distribution of effort.
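We can also turn these per-region estimates into approximate confidence intervals using the same log-normal formula as in `summarize_dsm_var` above (a quick sketch, reusing the objects from the previous chunk):

```{r split-ci}
# per-region point estimates and CVs
Nhat_ns <- unlist(var_split_ns$pred)
cv_ns <- sqrt(var_split_ns$pred.var)/Nhat_ns
# log-normal interval multiplier
ci_mult <- exp(1.96*sqrt(log(1 + cv_ns^2)))
data.frame(region = names(predgrid_var_ns),
           lower  = Nhat_ns/ci_mult,
           Nhat   = Nhat_ns,
           upper  = Nhat_ns*ci_mult)
```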