Tools for modeling niches and distributions of species
enmSdmX is a set of tools in R for implementing
species distribution models (SDMs) and ecological niche models (ENMs),
including: bias correction, spatial cross-validation, model evaluation,
raster interpolation, biotic velocity (speed and direction of movement
of a “mass” represented by a raster), and tools for using spatially
imprecise records. The heart of the package is a set of “training”
functions that automatically optimize model complexity based on the number of
available occurrences. These algorithms include MaxEnt, MaxNet, boosted
regression trees/gradient boosting machines (BRT), generalized additive
models (GAM), generalized linear models (GLM), natural splines (NS), and
random forests (RF). To enhance interoperability with other packages,
the package does not create any new classes. The package works with
PROJ6 geodetic objects and coordinate reference systems.
You can install this package from CRAN using:
install.packages('enmSdmX', dependencies = TRUE)
Alternatively, you can install the development version of this package using:
remotes::install_github('adamlilith/enmSdmX', dependencies = TRUE)
You may need to install the remotes package first.
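As a small convenience, the two steps can be combined so that remotes is installed only when it is missing. This is a sketch that assumes a working internet connection and a configured CRAN mirror; it uses only base R's requireNamespace() and install.packages() plus the install_github() call shown above.

```r
# Install 'remotes' only if it is not already available
if (!requireNamespace('remotes', quietly = TRUE)) {
  install.packages('remotes')
}

# Install the development version of enmSdmX from GitHub
remotes::install_github('adamlilith/enmSdmX', dependencies = TRUE)
```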
coordImprecision: Coordinate imprecision
nearestGeogPoints: Minimum convex polygon from a set of spatial polygons and/or points (“nearest geographic point” method)
nearestEnvPoints: Extract “most conservative” environments from points and/or polygons (“nearest environmental point” method)
elimCellDuplicates: Eliminate duplicate points in each cell of a raster
geoFold: Assign geographically-distinct k-folds
geoFoldContrast: Assign geographically-distinct k-folds to background or contrast sites
geoThin: Thin geographic points deterministically or randomly
weightByDist: Proximity-based weighting for occurrences for correcting spatial bias
trainByCrossValid and summaryByCrossValid: Calibrate a distribution/niche model using cross-validation
trainBRT: Boosted regression trees (BRTs)
trainESM: Ensembles of small models (ESMs)
trainGAM: Generalized additive models (GAMs)
trainGLM: Generalized linear models (GLMs)
trainMaxEnt: MaxEnt models
trainMaxNet: MaxNet models
trainNS: Natural splines (NSs)
trainRF: Random forests (RFs)
predictEnmSdm: Predict most model types using default settings; parallelized
predictMaxEnt: Predict MaxEnt model
predictMaxNet: Predict MaxNet model
evalAUC: AUC (with/out site weights)
evalMultiAUC: Multivariate version of AUC (with/out site weights)
evalContBoyce: Continuous Boyce Index (with/out site weights)
evalThreshold: Thresholds to convert continuous predictions to binary predictions (with/out site weights)
evalThresholdStats: Model accuracy based on thresholded predictions (with/out site weights)
evalTjursR2: Tjur’s R2 (with/out site weights)
evalTSS: True Skill Statistic (TSS) (with/out site weights)
modelSize: Number of response values in a model object
compareResponse: Compare different niche model responses along an environmental variable
nicheOverlapMetrics: Niche overlap metrics
bioticVelocity: Velocity of a “mass” across a time series of rasters
getValueByCell and setValueByCell: Retrieve or set raster value(s) by cell number
globalx: “Friendly” wrapper for terra::global() for calculating raster statistics
interpolateRasts: Interpolate a stack of rasters
longLatRasts: Generate rasters with values of longitude/latitude for cell values
sampleRast: Sample raster with/out replacement
squareCellRast: Create a raster with square cells from an object with an extent
crss: Coordinate reference systems and their nicknames
customAlbers: Create a custom Albers conic equal-area projection
customLambert: Create a custom Lambert azimuthal equal-area projection
customVNS: Create a custom vertical near-side projection
getCRS: Return a WKT2 (well-known text) string using a nickname
countPoints: Number of points in a “spatial points” object
decimalToDms: Convert decimal coordinate to degrees-minutes-seconds
dmsToDecimal: Convert degrees-minutes-seconds coordinate to decimal
extentToVect: Convert extent to a spatial polygon
plotExtent: Create a spatial polygon the same size as a plot region
spatVectorToSpatial: Convert SpatVector object to a Spatial* object
lemurs: Lemur occurrences
mad0: Madagascar spatial object
mad1: Madagascar spatial object
madClim: Madagascar climate rasters for the present
madClim2030: Madagascar climate rasters for the 2030s
madClim2050: Madagascar climate rasters for the 2050s
madClim2070: Madagascar climate rasters for the 2070s
madClim2090: Madagascar climate rasters for the 2090s

Smith, A.B., Murphy, S.J., Henderson, D., and Erickson, K.D. 2023. Including imprecisely georeferenced specimens improves accuracy of species distribution models and estimates of niche breadth. Global Ecology and Biogeography. In press.
Abstract
Aim Museum and herbarium specimen records are frequently used to assess the conservation status of species and their responses to climate change. Typically, occurrences with imprecise geolocality information are discarded because they cannot be matched confidently to environmental conditions and are thus expected to increase uncertainty in downstream analyses. However, using only precisely georeferenced records risks undersampling of the environmental and geographical distributions of species. We present two related methods to allow the use of imprecisely georeferenced occurrences in biogeographical analysis.
Innovation Our two procedures assign imprecise records to the (1) locations or (2) climates that are closest to the geographical or environmental centroid of the precise records of a species. For virtual species, including imprecise records alongside precise records improved the accuracy of ecological niche models projected to the present and the future, especially for species with c. 20 or fewer precise occurrences. Using only precise records underestimated loss of suitable habitat and overestimated the amount of suitable habitat in both the present and the future. Including imprecise records also improves estimates of niche breadth and extent of occurrence. An analysis of 44 species of North American Asclepias (Apocynaceae) yielded similar results.
Main conclusions Existing studies examining the effects of
spatial imprecision typically compare outcomes based on precise records
against the same records with spatial error added to them. However, in
real-world cases, analysts possess a mix of precise and imprecise
records and must decide whether to retain or discard the latter.
Discarding imprecise records can undersample the geographical and
environmental distributions of species and lead to mis-estimation of
responses to past and future climate change. Our method, for which we
provide a software implementation in the enmSdmX package
for R, is simple to use and can help leverage the large number of
specimen records that are typically deemed “unusable” because of spatial
imprecision in their geolocation.
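To make the “nearest geographic point” idea from the abstract concrete, here is a minimal base-R sketch. It is not the implementation used in enmSdmX: the helper nearestToCentroid() is hypothetical, the coordinates are made up, distances are plain Euclidean distances on projected coordinates, and the imprecise record is represented only by the vertices of the polygon it was georeferenced to rather than by the polygon's full border.

```r
# Hypothetical helper (not part of enmSdmX): given the precise records of a
# species and a set of candidate locations for one imprecise record, return
# the candidate closest to the geographic centroid of the precise records.
nearestToCentroid <- function(precise, candidates) {
  # centroid of the precise records (mean x, mean y)
  centroid <- colMeans(precise)
  # squared Euclidean distance from each candidate to the centroid
  d2 <- (candidates[ , 1] - centroid[1])^2 + (candidates[ , 2] - centroid[2])^2
  # keep the closest candidate as a one-row matrix
  candidates[which.min(d2), , drop = FALSE]
}

# Precise records in projected coordinates (made-up values)
precise <- cbind(x = c(0, 10, 20), y = c(0, 10, 20))

# Vertices of the polygon (for example, a county) that an imprecise record
# was georeferenced to (made-up values)
county <- cbind(x = c(40, 60, 60, 40), y = c(40, 40, 60, 60))

# Location assigned to the imprecise record
assigned <- nearestToCentroid(precise, county)
assigned
```

The “nearest environmental point” variant follows the same logic with environmental variables (for example, climate values extracted across the polygon) in place of x and y. In practice, the package functions nearestGeogPoints and nearestEnvPoints listed above are the place to start.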