3. Find the area of origin and the history of past contacts: RRphylogeography

Alessandro Mondanaro, Silvia Castiglione, Pasquale Raia

Index

  1. Introduction
  2. Formatting the data
  3. RRphylogeography
  4. RRphylogeography outputs

1. Introduction

The function RRphylogeography is designed to find the area of origin (AOO) and/or identify suitable area of contact between a pair of target species, taking into account their evolutionary history and environmental preferences. To run RRphylogeography, the user must provide the presence datapoints and the habitat suitability (HS) maps for both target species. The outputs of RRphylogeography consists of relative probability of occurrence (RPO) maps, which are calculated based on three elements: the habitat suitability values in each grid cell, the kernel density estimations derived from the occurrences, and the spatial distance (inclusive of the path cost for moving) between the most suitable cells of the two target species. The user can customize several parameters in RRphylogeography to account for different phylogenetic relationships between the species pair (i.e. whether budding cladogenesis, branching cladogenesis or anagenesis are suspected to have occurred) and to adjust the conductance matrix which affects the difficulty of moving across the landscape.

In the following chunks, we provide a detailed walkthrough on how to format data and run RRphylogeography to identify the suitable area of contact between Ursus arctos and Ursus maritimus (Mondanaro et al. 2025) at 35 kya.

Important: RRphylogeography is able to work with the habitat suitability (HS) maps generated by any Species Distribution Models (SDMs) algorithm, as long as they are in the 0-1 range. This flexibility allows users to calibrate their SDMs using statistical/regression models, similarity/envelope models, and machine learning methods, depending on their preferred approach. By accommodating a wide range of modeling techniques, RRphylogeography ensures broad applicability across different ecological and evolutionary studies.

2. Formatting data

Other than the name of the target species (i.e., spec1 and spec2 arguments), RRphylogeography requires two mandatory arguments:

Both lists must be named and ordered according to spec1 and spec2.

Note: RRphylogeography can also process a list of stacked SpatRaster objects. This is particularly useful when users have SDMs projected across different time frames as it allows to input the habitat suitability maps all together.

library(RRgeo)
library(terra)
library(sf)

rast(system.file("exdata/U.arctos_suitability.tif", package="RRgeo"))->map1
rast(system.file("exdata/U.maritimus_suitability.tif", package="RRgeo"))->map2
load(system.file("exdata/Ursus_occurrences.RDa", package="RRgeo"))
list(Ursus_arctos=map1,Ursus_maritimus=map2)->pred
list(Ursus_arctos=occs_arctos,Ursus_maritimus=occs_marit)->occs
head(occs$Ursus_arctos)
#> Simple feature collection with 6 features and 7 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -6268427 ymin: 4829599 xmax: 4561482 ymax: 7759100
#> Projected CRS: PROJCRS["unknown",
#>     BASEGEOGCRS["unknown",
#>         DATUM["D_Unknown_based_on_WGS84_ellipsoid",
#>             ELLIPSOID["WGS 84",6378137,298.257223563,
#>                 LENGTHUNIT["metre",1],
#>                 ID["EPSG",7030]]],
#>         PRIMEM["Greenwich",0,
#>             ANGLEUNIT["Degree",0.0174532925199433]]],
#>     CONVERSION["unnamed",
#>         METHOD["Mollweide"],
#>         PARAMETER["Longitude of natural origin",0,
#>             ANGLEUNIT["Degree",0.0174532925199433],
#>             ID["EPSG",8802]],
#>         PARAMETER["False easting",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8806]],
#>         PARAMETER["False northing",0,
#>             LENGTHUNIT["metre",1],
#>             ID["EPSG",8807]]],
#>     CS[Cartesian,2],
#>         AXIS["(E)",east,
#>             ORDER[1],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]],
#>         AXIS["(N)",north,
#>             ORDER[2],
#>             LENGTHUNIT["metre",1,
#>                 ID["EPSG",9001]]]]
#>     OBS TIME_factor     bio4   bio5  bio15    bio18    bio19
#> 1     1       0.000 2231.449 15.892 44.463 1298.594  545.262
#> 2     1       0.000  569.361 25.074 47.080 1285.298 3269.817
#> 72    1       0.001 1834.911 10.961 42.652 1014.167  528.589
#> 108   1       0.002 1615.390 19.783 27.550 1226.768 1046.242
#> 109   1       0.002 1614.916 20.613 23.908 1143.378 1053.529
#> 110   1       0.002 1705.699 19.869 36.797 1459.522 1003.189
#>                      geometry
#> 1     POINT (4561482 7500426)
#> 2   POINT (-444307.6 4829599)
#> 72   POINT (-6268427 7759100)
#> 108   POINT (4152689 6174048)
#> 109   POINT (4151493 6136813)
#> 110   POINT (4365559 6321109)

The standard format for individual elements of occs must include the column containing species occurrence data in binary format (“OBS”) and the “geometry” column. Additionally, if available, a column representing the time intervals associated with each species presence may be added (“TIME_factor”). All other columns in the data.frame are ignored.

3. Run RRphylogeography

RRphylogeography also accepts:

yourdir<-"YOUR_DIRECTORY"
setwd(yourdir)

RRphylogeography(spec1="Ursus_arctos", 
                 spec2="Ursus_maritimus",
                 pred=pred, 
                 occs=occs,
                 aggr=2,
                 time_col="TIME_factor",
                 kde_inversion=FALSE,
                 resistance_map=NULL,
                 clust=0.5,
                 plot=FALSE,
                 mask_for_pred=NULL,
                 standardize=TRUE,
                 output.dir=yourdir)

Important: The computational time of RRphylogeography is influenced by several factors, including the number of suitable cells for each target species, the th value used to define the most suitable cells, and the number of habitat suitability layers. To speed up the procedure, we strongly recommend aggregating the habitat suitability (HS) maps provided as pred by setting the aggr argument. Since the primary goal of RRphylogeography is to identify the area of origin (rather than a single cell of origin) between two target species, using very high spatial resolution maps may not be necessary. The spatial aggregation can drastically reduce computational time without compromising the analysis. Alternatively, users can increase the th values or restrict the study area by using the mask_for_pred argument.

4. RRphylogeography outputs

RRphylogeography creates within output.dir an output folder named by combining the names of the two target species. Therein, RRphylogeography stores three SpatRaster maps.

  1. A SpatRaster object with five layers for spec1
  2. A SpatRaster object with five layers for spec2

Both these files include:

  1. Habitat suitability (HS) map based on threshold values.
  2. Kernel density estimated from the species occurrence record.
  3. Distance calculations (“proximity”) between the first and the second target species.
  4. Relative probability of occurrence (RPO) map.
  5. Relative probability of occurrence (RPO) map excluding the kernel density.
  1. A raster object with two layers:
  1. RPO map which integrates the HS values, kernel density and the distance calculations (“proximity”) for both target species (“RPO_combined”).
  2. RPO map without considering the kernel density (“RPO_combined_noK”).

The names of the three output files are a combination of species names and the names of the HS maps.

Here, we plot the RPO_combined map of U. arctos and U. maritimus at 35 kya. The RPO_combined map returned by RRphylogeography was enhanced to replicate the one as in Mondanaro et al. (2025).

References