spatialMaxent

spatialMaxent is an extension of Maxent version 3.4.4, built by Lisa Bald, Jannis Gottwald and Dirk Zeuss. For help on spatialMaxent, please visit the spatialMaxent tutorial site: https://nature40.github.io/spatialMaxentPaper/

For help on Maxent version 3.4.4, see the text below:

MaxEnt

A program for maximum entropy modelling of species geographic distributions, written by Steven Phillips, Miro Dudik and Rob Schapire, with support from AT&T Labs-Research, Princeton University, and the Center for Biodiversity and Conservation, American Museum of Natural History.  Thank you to the authors of the following free software packages which we have used here: ptolemy/plot, gui/layouts, gnu/getopt and com/mindprod/ledatastream.

This page contains reference information for the MaxEnt program.  Background information on the method can be found in the following two papers:

   Steven J. Phillips, Robert P. Anderson, Robert E. Schapire.
   Maximum entropy modeling of species geographic distributions.
   Ecological Modelling, Vol 190/3-4 pp 231-259, 2006.

   Steven J. Phillips, Miroslav Dudik.
   Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation.
   Ecography, Vol 31 pp 161-175, 2008.

The model for a species is determined from a set of environmental or climate layers (or "coverages") for a set of grid cells in a landscape, together with a set of sample locations where the species has been observed.  The model expresses the suitability of each grid cell as a function of the environmental variables at that grid cell.  A high value of the function at a particular grid cell indicates that the grid cell is predicted to have suitable conditions for that species.  The computed model is a probability distribution over all the grid cells.  The distribution chosen is the one that has maximum entropy subject to some constraints: it must have the same expectation for each feature (derived from the environmental layers) as the average over sample locations.

Inputs, Outputs and Parameters

Input files, output directory and algorithm parameters can be specified through the user interface, or on a command line.  The user interface is best for doing single runs, while the command line is useful for repeated runs or automatically performing a sequence of runs with variations in the set of inputs. 

Inputs:

Algorithm Parameters:

Outputs:

All output files are written in the output directory. A summary of the maxent run is written to the file maxentResults.csv. In addition, maxent produces several files for every species; for a species called mySpecies, the file names begin with mySpecies.

The output format for predicted distributions is either raw, logistic (the default) or cumulative. For raw output, the output values are probabilities (between 0 and 1) such that the sum over all cells used during training is 1. Typical values are therefore extremely small. For logistic output, the values are again probabilities (between 0 and 1), but scaled up in a non-linear way for easier interpretation. If typical presences used during training are from environmental conditions where the probability of presence is around 0.5, then the logistic output can be interpreted as the predicted probability of presence (otherwise it can be interpreted as relative suitability). If p(x) is the raw output for environmental conditions x, the corresponding logistic value is c p(x) / (1 + c p(x)) for a particular value of c (namely, the exponential of the entropy of the raw distribution). For the cumulative output format, the value at a grid cell is the sum of the probabilities of all grid cells with no higher probability than that grid cell, times 100. For example, the grid cell that is predicted as having the best conditions for the species, according to the model, will have cumulative value 100, while cumulative values close to 0 indicate predictions of unsuitable conditions.
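
To make the three output formats concrete, here is a small sketch in Python (with NumPy) of deriving logistic and cumulative values from raw output; the raw values used are made-up illustrations, and c is taken to be the exponential of the entropy of the raw distribution as described above:

  import numpy as np

  # Made-up raw output over the training cells: non-negative values summing to 1.
  raw = np.array([1e-6, 5e-6, 2e-5, 1e-4, 3e-4])
  raw = raw / raw.sum()

  # c is the exponential of the entropy of the raw distribution.
  entropy = -np.sum(raw * np.log(raw))
  c = np.exp(entropy)

  # Logistic output: c p(x) / (1 + c p(x)).
  logistic = c * raw / (1.0 + c * raw)

  # Cumulative output: for each cell, 100 times the summed probability of all
  # cells with no higher probability, so the best cell gets the value 100.
  order = np.argsort(raw)
  cumulative = np.empty_like(raw)
  cumulative[order] = 100.0 * np.cumsum(raw[order])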

ESRI ASCII Grid Format

(Copied from the ArcWorkstation 8.3 Help File)

The ASCII file must consist of header information containing a set of keywords, followed by cell values in row-major order. The file format is

  <NCOLS xxx>
  <NROWS xxx>
  <XLLCENTER xxx | XLLCORNER xxx>
  <YLLCENTER xxx | YLLCORNER xxx>
  <CELLSIZE xxx>
  {NODATA_VALUE xxx}
  row 1
  row 2
  ...
  row n
where xxx is a number, and the keyword nodata_value is optional and defaults to -9999. Row 1 of the data is at the top of the grid, row 2 is just under row 1 and so on. For example:
  ncols         386
  nrows         286
  xllcorner     -128.66338
  yllcorner     13.7502065
  cellsize      0.2
  NODATA_value  -9999
  -9999 -9999 -123 -123 -123 -9999 -9999 -9999 -9999 -9999 ...
  -9999 -9999 -123 -123 -123 -9999 -9999 -9999 -9999 -9999 ...
  -9999 -9999 -117 -117 -117 -119 -119 -119 -119 -119 -9999 ...
  ...
The nodata_value is the value in the ASCII file to be assigned to those cells whose true value is unknown. Cell values should be delimited by spaces. No carriage returns are necessary at the end of each row in the grid. The number of columns in the header is used to determine when a new row begins. The number of cell values must be equal to the number of rows times the number of columns.

The current implementation of maxent requires fields xllcorner, yllcorner and nodata_value.
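
As an illustration, here is a minimal sketch of reading a grid in this format with Python and NumPy, assuming all six header fields shown above are present (the file name example.asc is hypothetical; real workflows would more likely use a GIS library):

  import numpy as np

  def read_ascii_grid(path):
      # Parse the six header lines (ncols, nrows, xllcorner/xllcenter,
      # yllcorner/yllcenter, cellsize, NODATA_value), then read the cell
      # values in row-major order, row 1 being the top of the grid.
      header = {}
      with open(path) as f:
          for _ in range(6):
              key, value = f.readline().split()
              header[key.lower()] = float(value)
          values = np.array(f.read().split(), dtype=float)
      nrows, ncols = int(header["nrows"]), int(header["ncols"])
      grid = values.reshape(nrows, ncols)
      # Mark cells whose true value is unknown.
      grid = np.where(grid == header["nodata_value"], np.nan, grid)
      return header, grid

  header, grid = read_ascii_grid("example.asc")  # hypothetical file name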


How it works

This is a very brief description -- for more details, please see the papers described above. Here we first describe an unregularized version (with the regularization value set to zero); in practice, we always recommend using regularization. Without regularization, the distribution being computed is the one that has maximum entropy among those satisfying the constraints that the expectation of each feature matches its empirical average. This distribution can be proved to be the same as the Gibbs distribution that maximizes the product of the probabilities of the sample locations, where a Gibbs distribution takes the form

   P(x) = exp(c1 * f1(x) + c2 * f2(x) + c3 * f3(x) ...) / Z

Here c1, c2, ... are constants, f1, f2, ... are the features, and Z is a scaling constant that ensures that P sums to 1 over all grid cells.  The algorithm that is implemented by this program is guaranteed to converge to values of c1, c2, ..., that give the (unique) optimum distribution P.

For each species, the program starts with a uniform distribution, and performs a number of iterations, each of which increases the probability of the sample locations for the species.  The probability is displayed in terms of "gain", which is the log of the number of grid cells minus the log loss (average of the negative log probabilities of the sample locations).  The gain starts at zero (the gain of the uniform distribution), and increases as the program increases the probabilities of the sample locations.  The gain increases iteration by iteration, until the change from one iteration to the next falls below the convergence threshold, or until maximum iterations have been performed.

In the regularized case, the gain is lower by an additional term, which is the weighted sum of the absolute values of c1, c2, ... .  This limits overfitting and prevents c1, c2, ...  from becoming arbitrarily large. Minimizing the regularized loss (or equivalently, maximizing the regularized gain) corresponds to maximizing the entropy of the distribution subject to a relaxed constraint that feature expectations be only close to feature averages over sample locations rather than exactly equal to them.
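
To make these definitions concrete, here is a small Python/NumPy sketch that computes a Gibbs distribution over grid cells, the gain, and the regularized gain; the feature values, coefficients and regularization weights are made-up illustrations, not values produced by maxent:

  import numpy as np

  rng = np.random.default_rng(0)
  features = rng.random((1000, 3))          # f1(x), f2(x), f3(x) for 1000 grid cells
  samples = np.array([3, 42, 7, 512, 99])   # indices of the sample locations
  c = np.array([1.2, -0.7, 0.3])            # coefficients c1, c2, c3
  beta = np.array([0.05, 0.05, 0.05])       # regularization weights

  # Gibbs distribution: P(x) = exp(c1*f1(x) + c2*f2(x) + ...) / Z, summing to 1.
  unnormalized = np.exp(features @ c)
  P = unnormalized / unnormalized.sum()

  # Gain = log(number of grid cells) minus the log loss
  # (average negative log probability of the sample locations).
  log_loss = -np.mean(np.log(P[samples]))
  gain = np.log(len(P)) - log_loss

  # Regularized gain: subtract the weighted sum of |c1|, |c2|, ...
  regularized_gain = gain - np.sum(beta * np.abs(c))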


Regularization and feature class selection

The predictive performance of MaxEnt is influenced by the choice of feature types and the regularization constants. Here we describe the default settings, which can be overridden, if desired, using the command line flags described below. By default (i.e., when using "Auto features"), all feature types are used when there are at least 80 training samples; from 15 to 79 samples, linear, quadratic and hinge features are used; from 10 to 14 samples, linear and quadratic features are used; below 10 samples, only linear features are used.
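
A direct translation of these default rules into code (a sketch only; the function and class names are illustrative, not maxent identifiers, and categorical features additionally apply to categorical layers):

  def auto_feature_classes(n_samples):
      # Default ("Auto features") choice of feature classes by number of training samples.
      if n_samples >= 80:
          return ["linear", "quadratic", "product", "hinge", "threshold"]
      if n_samples >= 15:
          return ["linear", "quadratic", "hinge"]
      if n_samples >= 10:
          return ["linear", "quadratic"]
      return ["linear"]

  print(auto_feature_classes(25))  # ['linear', 'quadratic', 'hinge']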

The default value for each of the constants c1, c2, ... described above is an empirically tuned value (called "beta", which depends on the feature type and the number of samples) divided by the square root of the number of samples. The default values of beta for the various feature types are given in the following tables, with interpolation in between:


Linear (2-9 samples)
  Sample size   0     10    30    100+
  Beta          1.0   1.0   0.2   0.05

Linear + Quadratic (10-79 samples)
  Sample size   0     10    17    30    100+
  Beta          1.3   0.8   0.5   0.25  0.05

Linear + Quadratic + Product (80+ samples)
  Sample size   0     10    17    30    100+
  Beta          2.6   1.6   0.9   0.55  0.05

Threshold (80+ samples)
  Sample size   0     100+
  Beta          2.0   1.0

Hinge (15+ samples)
  Sample size   0+
  Beta          0.5

Categorical (15+ samples)
  Sample size   0+    10    17+
  Beta          0.65  0.5   0.25
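
As an illustration of how such a table is applied (a sketch only, assuming straightforward linear interpolation between the tabulated sample sizes, with the result divided by the square root of the number of samples as described above):

  import numpy as np

  # Tabulated default betas for linear + quadratic + product features (80+ samples).
  sample_sizes = [0, 10, 17, 30, 100]
  betas = [2.6, 1.6, 0.9, 0.55, 0.05]

  def default_constant(n_samples):
      # Interpolate beta between the tabulated sample sizes (sizes above 100 use 0.05),
      # then divide by the square root of the number of samples.
      beta = np.interp(n_samples, sample_sizes, betas)
      return beta / np.sqrt(n_samples)

  print(default_constant(50))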

Projections

The values of c1, c2, ... and Z that were computed for features derived from the "Environmental layers" are used to compute weights using the layers in the "Projection directory". Note that these weights are not probabilities and they need not sum to one, since they use the normalization constant computed for "Environmental layers" rather than the one for "Projection directory". Their relative magnitudes represent how much a given locale is favored by the species over another locale. For each species, the weights are written in a file mySpecies_<dir>.asc in the output directory, where <dir> is the name of the projection directory.

By default, two kinds of "clamping" are done during the projection process. First, the environmental layers are clamped: if a layer in the projection directory has values that are greater than the maximum of the corresponding layer used during training, those values are reduced to the maximum, and similarly for values below the corresponding minimum. Second, features are also clamped: if a feature derived from the projection layers has a value greater than its maximum on the training data, it is reduced to the maximum, and similarly for values below the corresponding minimum. Both forms of clamping help to alleviate problems that can arise from making predictions outside the range of data used in training the model.
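
A sketch of the two clamping steps in Python/NumPy (the layers here are randomly generated stand-ins, and the quadratic feature is just one illustrative derived feature):

  import numpy as np

  rng = np.random.default_rng(0)
  training_layer = rng.uniform(5.0, 20.0, size=1000)     # a layer as used during training
  projection_layer = rng.uniform(0.0, 30.0, size=1000)   # the same layer in the projection directory

  # 1. Clamp the projection layer to the range seen during training.
  clamped_layer = np.clip(projection_layer, training_layer.min(), training_layer.max())

  # 2. Clamp a derived feature (here a quadratic feature) to its range on the training data.
  training_feature = training_layer ** 2
  projection_feature = clamped_layer ** 2
  clamped_feature = np.clip(projection_feature, training_feature.min(), training_feature.max())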

Background Points

As described above, the maxent distribution is calculated over the set of pixels that have data for all environmental variables. However, if the number of pixels is very large, processing time increases without a significant improvement in modeling performance. For that reason, when the number of pixels with data is larger than 10,000, a random sample of 10,000 "background" pixels is used to represent the variety of environmental conditions present in the data. The maxent distribution is then computed over the union of the "background" pixels and the samples for the species being modeled. The number 10,000 can be changed from the "Settings" panel, or by using a command-line flag: see the batch-mode section below.
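
Selecting the background this way might look like the following sketch (Python/NumPy; the cell counts and sample indices are made-up, and maxent's actual sampling is internal to the program):

  import numpy as np

  rng = np.random.default_rng(0)
  n_cells_with_data = 250_000                 # pixels that have data for all variables
  sample_cells = np.array([12, 999, 54321])   # cells where the species was observed

  # Draw at most 10,000 background pixels from the cells with data, then model
  # over the union of the background pixels and the species samples.
  background = rng.choice(n_cells_with_data, size=min(10_000, n_cells_with_data), replace=False)
  modeling_cells = np.union1d(background, sample_cells)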

Memory Issues

If the environmental layers are very large files, you may get "out of memory" or "heap space" errors when you try to run the program. There are a number of ways to address this problem; one is to increase the amount of memory available to Java, for example by raising the -mx setting (such as the -mx512m in the batch-mode examples below) on the "java" line in the maxent.bat file.

Format of the lambda file

The coefficients of the Maxent model for a species are output in a file called species.lambdas. The entries in the lambdas file are lines of the form: feature, lambda, min, max. The exponent of the Maxent model is calculated as

exponent = lambda1 * (f1(x)-min1)/(max1 - min1) + ... + lambdan * (fn(x)-minn)/(maxn -minn) - linearPredictorNormalizer

In other words, features are scaled so that their values would lie between 0 and 1 on the training data. By default, all features are clamped prior to projection of the model onto new data - see section "Projections" above. The linearPredictorNormalizer is a constant chosen so that the exponent is always non-positive (for numerical stability). Terms corresponding to hinge features are evaluated slightly differently. For example, the hinge feature prec' derived from the layer prec and described by the line: prec', lambda, min, max evaluates to the term

lambda * clamp_at_0(prec-min)/(max-min)

i.e., if prec < min then the value is 0, otherwise it is (prec-min)/(max-min). For the reverse hinge feature prec` described by the line: prec`, lambda, min, max, the term is

lambda * clamp_at_0(max-prec)/(max-min)

The densityNormalizer is the normalization term Z calculated over the background. The Maxent raw output is therefore:

raw = exp(sum lambdai * (fi(x)-mini)/(maxi - mini ) - linearPredictorNormalizer) / densityNormalizer

Lastly, logistic output is calculated using the entropy given at the end of the lambda file: logistic = raw * exp(entropy) / (1 + raw * exp(entropy)).
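
Putting this section together, here is a sketch of evaluating a model from lambdas-style entries in Python; the entries, normalizers and entropy below are made-up illustrations of the quantities described above, not values from a real lambdas file:

  import math

  # Made-up entries of the form (feature, lambda, min, max).
  entries = [("tmp", 1.37, -3.0, 27.5),        # linear feature
             ("tmp^2", -0.42, 0.0, 756.25),    # quadratic feature
             ("prec'", 0.91, 12.0, 310.0)]     # forward hinge feature
  linearPredictorNormalizer = 2.15
  densityNormalizer = 8314.2
  entropy = 6.03

  def raw_output(values):
      # values maps each feature name to the underlying value at the cell
      # (for hinge features, the value of the layer they are derived from).
      exponent = -linearPredictorNormalizer
      for name, lam, lo, hi in entries:
          v = values[name]
          if name.endswith("'"):        # forward hinge: clamp_at_0(v-min)/(max-min)
              term = max(v - lo, 0.0) / (hi - lo)
          elif name.endswith("`"):      # reverse hinge: clamp_at_0(max-v)/(max-min)
              term = max(hi - v, 0.0) / (hi - lo)
          else:                         # other features scaled to [0, 1] on the training data
              term = (v - lo) / (hi - lo)
          exponent += lam * term
      return math.exp(exponent) / densityNormalizer

  def logistic_output(values):
      raw = raw_output(values)
      return raw * math.exp(entropy) / (1.0 + raw * math.exp(entropy))

  print(logistic_output({"tmp": 14.2, "tmp^2": 14.2 ** 2, "prec'": 180.0}))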



Batch mode

All parts of the interface can be set from the command line, and the Run button can be automatically pressed after startup. This allows the program to be invoked in batch mode, multiple times in sequence, if required. The command line flags can also be added to the maxent.bat file, at the end of the "java ..." line, to change the default settings of the program. Some of the more common flags have abbreviations that can be used instead of the full flag. As an example, the following two invocations are equivalent:

java -mx512m -jar maxent.jar environmentallayers=layers samplesfile=samples\bradypus.csv outputdirectory=outputs togglelayertype=ecoreg redoifexists autorun

java -mx512m -jar maxent.jar -e layers -s samples\bradypus.csv -o outputs -t ecoreg -r -a

Any boolean flag can be given the prefix "no" or "dont" to turn the flag off. Abbreviations for boolean flags toggle the default value. The available flags are, in no particular order: