--- layout: post title: "Fitting variogram models in gstat" date: 2016-02-14 11:00:00 +0100 comments: true author: Edzer Pebesma categories: r --- DOWNLOADHERE Fitting variogram functions with R package [gstat](https://cran.r-project.org/package=gstat) has become more flexible, and hopefully more user friendly. Up to now, after loading data ```{r} library(sp) demo(meuse, ask = FALSE, echo = FALSE) # load meuse data set ``` users were required to use a sequence like ```{r} library(gstat) v = variogram(log(zinc)~1, meuse) v.fit = fit.variogram(v, vgm(1, "Sph", 900, 1)) v.fit ``` where `fit.variogram` fits variogram parameters of a spherical model (`Sph`) to the sample variogram `v`. The values 1, 900 and 1 were needed as initial values in the weighted non-linear fit (where only the range parameter is non-linear). This has changed in gstat version 1.2: now, `vgm` can take only a variogram model, as in ```{r} fit.variogram(v, vgm("Sph")) ``` or even a set of models, in which case the best fitting is returned, as in ```{r} fit.variogram(v, vgm(c("Exp", "Sph"))) fit.variogram(v, vgm(c("Exp", "Mat", "Sph"))) ``` where we still see that the sperical model is chosen. If we choose a different sample variogram, where Matern is chosen, as in: ```{r} v0 = variogram(zinc~1, meuse) fit.variogram(v0, vgm(c("Exp", "Mat", "Sph"))) ``` we see that the kappa value is 0.5, which is a default value that was not fit. We can fit kappa by specifying `fit.kappa = TRUE`, as in ```{r} options(warn = -1) # don't print warnings fit.variogram(v0, vgm(c("Exp", "Mat", "Sph")), fit.kappa = TRUE) ``` where the best fitting kappa from the range 0.3, 0.4, 0.5,...,5 is chosen. I suppressed warnings here, as around 20 warnings were printed in cases with crazy initial values. This is usual for Matern models: larger kappa values have effective ranges (the distance value at which the model reaches, say, 95% of its sill) much larger than the range parameter, as illustrated by ```{r fig=TRUE, fig.path = "images/", label="vgm1"} plot(variogramLine(vgm(1, "Mat", 1, kappa = 4), 10), type = 'l') ``` where at distance 1, 0.05 of the sill is reached (and the model, up till there, is nearly linear or parabolic, leading to a singularity during fit). A different parameterisation of the Matern model, given in [Michael Stein's book](https://www.springer.com/gp/book/9780387986296), is the following ```{r fig=TRUE, fig.path = "images/", label="vgm2"} plot(variogramLine(vgm(1, "Ste", 1, kappa = 4), 10), type = 'l') ``` This one has the same smoothness, but reaches the sill much closer to the range value. As a consequence it fits easier, that is, without warnings: ```{r} options(warn = 0) # set back to normal fit.variogram(v0, vgm(c("Exp", "Ste", "Sph")), fit.kappa = TRUE) ``` For those you need a more precise estimate of the optimal kappa value, you can iterate over steps of e.g. 0.01 by ```{r} fit.variogram(v0, vgm(c("Exp", "Ste", "Sph")), fit.kappa = seq(.3,5,.01)) ``` ## How it works Default initial parameter values are chosen from the sample variogram, where: * the range parameter is taken as 1/3 of the maximum sample variogram distance, * the nugget parameter is taken as the mean of the first three sample variogram values, and * the partial sill is taken as the mean of the last five sample variogram values. ```{r} vgm("Sph") ``` contains `NA` values for the numeric parameters, and under the hood (undocumented) ```{r} gstat:::vgm_fill_na(vgm("Sph"), v) ``` fills the `NA` values with the initial values. Providing more than one model to `vgm` returns a list, ```{r} vgm(c("Sph", "Exp")) ``` which `fit.variogram` iterates over, returning the best fitting model. ## Comparison to automap Function `automap::autofitVariogram` does a similar job, but includes the computation of the sample variogram from data (which can be controlled by passing parameters to `...`). It takes slightly different defaults for fitting, definitely different defaults when computing the sample variogram, and has options for combining distance bins.