Can machine-learning algorithms improve upon classical palaeoenvironmental reconstruction models?
Classical palaeoenvironmental reconstruction models often incorporate biological ideas and commonly assume that the taxa comprising a fossil assemblage exhibit unimodal response functions of the environmental variable of interest. In contrast, machine-learning approaches do not rely upon any biological assumptions but instead need training with large data sets to extract some understanding of the relationships between biological assemblages and their environment. To explore the relative merits of these two approaches, we have developed a two-layered machine-learning reconstruction model MEMLM (Multi Ensemble Machine Learning Model). The first layer applies three different ensemble machine-learning models (random forests, extra random trees, and LightGBM), trained on the modern taxon assemblage and associated environmental data to make reconstructions based on the three different models, while the second layer uses multiple linear regression to integrate these three reconstructions into a consensus reconstruction. We considered three versions of the model: (1) a standard version of MEMLM, which uses only taxon abundance data; (2) MEMLMe, which uses only dimensionally reduced assemblage information, using a natural language-processing model (GloVe), to detect associations between taxa across the training data set; and (3) MEMLMc which incorporates both raw taxon abundance and dimensionally reduced summary (GloVe) data. We trained these MEMLM model variants with three high-quality diatom and pollen training sets and compared their reconstruction performance with three weighted-averaging (WA) approaches (WA-Cla for classical deshrinking, WA-Inv for inverse deshrinking, and WA-PLS for partial least squares). In general, the MEMLM approaches, even when trained on only dimensionally reduced assemblage data, performed substantially better than the WA approaches in the larger training sets, as judged by cross-validatory prediction error. 
When applied to fossil data, MEMLM variants sometimes generated qualitatively different palaeoenvironmental reconstructions from each other and from reconstructions based on WA approaches. We applied a statistical significance test to all the reconstructions. This successfully identified each incidence for which the reconstruction is not robust with respect to the model choice. We found that machine-learning approaches could outperform classical approaches but could sometimes fail badly in the reconstruction, despite showing high performance under cross-validation, likely indicating problems when extrapolation occurs. We found that the classical approaches are generally more robust, although they could also generate reconstructions which have modest statistical significance and therefore may be unreliable. Given these conclusions, we consider that cross-validation is not a sufficient measure of transfer function performance, and we recommend that the results of statistical significance tests are provided alongside the downcore reconstructions based on fossil assemblages
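For readers unfamiliar with the weighted-averaging (WA) baselines named in the abstract above, the following is a minimal Python sketch of WA with inverse and classical deshrinking, as these methods are commonly described. It is not the authors' code: the function names and the toy abundance and pH values are hypothetical, and WA-PLS and the MEMLM ensemble layers are not covered.

```python
# Minimal weighted-averaging (WA) transfer function with two deshrinking options.
# All data below are hypothetical toy values, not taken from any training set.

def wa_optima(abund, env):
    """Taxon optima: abundance-weighted mean of the environmental variable."""
    n_taxa = len(abund[0])
    optima = []
    for k in range(n_taxa):
        total = sum(row[k] for row in abund)  # assumes every taxon occurs at least once
        optima.append(sum(row[k] * x for row, x in zip(abund, env)) / total)
    return optima

def wa_initial(abund, optima):
    """Initial (shrunken) estimate: abundance-weighted mean of taxon optima."""
    return [sum(y * u for y, u in zip(row, optima)) / sum(row) for row in abund]

def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_wa(abund, env, deshrink="inverse"):
    """Return a function that predicts the environmental variable for new samples."""
    optima = wa_optima(abund, env)
    initial = wa_initial(abund, optima)
    if deshrink == "inverse":
        # Inverse deshrinking: regress observed values on initial estimates.
        b0, b1 = linear_fit(initial, env)
        return lambda rows: [b0 + b1 * v for v in wa_initial(rows, optima)]
    # Classical deshrinking: regress initial estimates on observed values, then invert.
    a0, a1 = linear_fit(env, initial)
    return lambda rows: [(v - a0) / a1 for v in wa_initial(rows, optima)]

# Hypothetical training set: 4 sites x 3 taxa (counts) and a lake-water pH per site.
abund = [[60, 30, 10], [30, 50, 20], [10, 40, 50], [5, 20, 75]]
env = [5.0, 6.0, 7.5, 8.0]

predict_inverse = fit_wa(abund, env, "inverse")
predict_classical = fit_wa(abund, env, "classical")
```

Calling `predict_inverse(fossil_rows)` on fossil counts with the same taxon order gives a reconstruction. Classical deshrinking stretches the estimates more than inverse deshrinking, so the two can differ most towards the ends of the gradient.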
BUMPER v1.0: a Bayesian user-friendly model for palaeo-environmental reconstruction
We describe the Bayesian user-friendly model for palaeo-environmental reconstruction (BUMPER), a Bayesian transfer function for inferring past climate and other environmental variables from microfossil assemblages. BUMPER is fully self-calibrating, straightforward to apply, and computationally fast, requiring ~2 s to build a 100-taxon model from a 100-site training set on a standard personal computer. We apply the model’s probabilistic framework to generate thousands of artificial training sets under ideal assumptions. We then use these to demonstrate the sensitivity of reconstructions to the characteristics of the training set, considering assemblage richness, taxon tolerances, and the number of training sites. We find that a useful guideline for the size of a training set is to provide, on average, at least 10 samples of each taxon. We demonstrate general applicability to real data, considering three different organism types (chironomids, diatoms, pollen) and different reconstructed variables. An identically configured model is used in each application, the only change being the input files that provide the training-set environment and taxon-count data. The performance of BUMPER is shown to be comparable with weighted average partial least squares (WAPLS) in each case. Additional artificial datasets are constructed with similar characteristics to the real data, and these are used to explore the reasons for the differing performances of the different training sets
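The training-set guideline in the abstract above (on average, at least 10 samples of each taxon) can be checked with a few lines of Python. This sketch assumes one reading of the guideline, namely that the mean number of sites at which a taxon occurs should be at least 10. The function names are hypothetical and this is not part of BUMPER.

```python
# Check a training set against the "at least 10 samples per taxon on average" guideline.
# Assumption: a "sample of a taxon" means a site where that taxon has a non-zero count.

def mean_occurrences_per_taxon(counts):
    """counts: list of sites, each a list of taxon counts in a fixed taxon order."""
    n_taxa = len(counts[0])
    occurrences = [sum(1 for site in counts if site[k] > 0) for k in range(n_taxa)]
    return sum(occurrences) / n_taxa

def meets_guideline(counts, minimum=10):
    """True if taxa occur, on average, at `minimum` sites or more."""
    return mean_occurrences_per_taxon(counts) >= minimum
```

A sparse training set with many rare taxa can fail this check even when the number of sites looks large, which is the situation the guideline appears intended to flag.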
Glacial legacies on interglacial vegetation at the Pliocene-Pleistocene transition in NE Asia
Broad-scale climate control of vegetation is widely assumed. Vegetation-climate lags are generally thought to have lasted no more than a few centuries. Here our palaeoecological study challenges this concept over glacial–interglacial timescales. Through multivariate analyses of pollen assemblages from Lake El’gygytgyn, Russian Far East and other data we show that interglacial vegetation during the Plio-Pleistocene transition mainly reflects conditions of the preceding glacial instead of contemporary interglacial climate. Vegetation–climate disequilibrium may persist for several millennia, related to the combined effects of permafrost persistence, distant glacial refugia and fire. In contrast, no effects from the preceding interglacial on glacial vegetation are detected. We propose that disequilibrium was stronger during the Plio-Pleistocene transition than during the Mid-Pliocene Warm Period when, in addition to climate, herbivory was important. By analogy to the past, we suggest today’s widespread larch ecosystem on permafrost is not in climate equilibrium. Vegetation-based reconstructions of interglacial climates used to assess atmospheric CO2–temperature relationships may thus yield misleading simulations of past global climate sensitivity
“Think horizontally, act vertically”: the centenary (1916–2016) of pollen analysis and the legacy of Lennart von Post
Peer reviewed
Tree migration-rates: narrowing the gap between inferred post-glacial rates and projected rates
Faster-than-expected post-glacial migration rates of trees have puzzled ecologists for a long time. In Europe, post-glacial migration is assumed to have started from the three southern European peninsulas (southern refugia), where large areas remained free of permafrost and ice at the peak of the last glaciation. However, increasing palaeobotanical evidence for the presence of isolated tree populations in more northerly microrefugia has started to change this perception. Here we use the Northern Eurasian Plant Macrofossil Database and palaeoecological literature to show that post-glacial migration rates for trees may have been substantially lower (60–260 m yr–1) than those estimated by assuming migration from southern refugia only (115–550 m yr–1), and that early-successional trees migrated faster than mid- and late-successional trees. Post-glacial migration rates are in good agreement with those recently projected for the future with a population dynamical forest succession and dispersal model, mainly for early-successional trees and under optimal conditions. Although migration estimates presented here may be conservative because of our assumption of uniform dispersal, tree migration-rates clearly need reconsideration. We suggest that small outlier populations may be a key factor in understanding past migration rates and in predicting potential future range-shifts. The importance of outlier populations in the past may have an analogy in the future, as many tree species have been planted beyond their natural ranges, with a more beneficial microclimate than their regional surroundings. Therefore, climate-change-induced range-shifts in the future might well be influenced by such microrefugia
Temperature reconstructions for the last 1.74-Ma on the eastern Tibetan Plateau based on a novel pollen-based quantitative method
Terrestrial palaeo-temperature data are of great value in improving our understanding of past climate and they provide a basis for evaluating climate simulations. Such data are, however, poorly constrained for long time-scales. In addition to the scarcity of high-quality continuous time-series, finding proxies with a clear response to past temperature changes and developing appropriate reconstruction methods are major challenges. We present a new and robust method, locally-weighted weighted-average partial least squares (LW-WAPLS), to reconstruct quantitative temperature changes based on a high-resolution 1.74-Ma pollen record from the Zoige Basin on the eastern Tibetan Plateau, where the vegetation today is mainly controlled by temperature. The reconstructed mean annual (MAT) and warmest month (MTWM) temperatures reveal a general cooling trend with two major shifts at ~1.54 and 0.62 Ma BP, and regular glacial-interglacial variability ranging from ~−4 to 2 °C and from 8 to 16 °C, respectively. They indicate glacial-interglacial temperature amplitudes of ~4–5 °C (MAT) and ~5–6 °C (MTWM). Both statistical and ecological evaluations validate the reliability of the reconstructions. The reconstructions provide important insights into the spatial aspects of long-term terrestrial temperature change. LW-WAPLS shows advantages over both the traditional modern analogue technique and non-linear transfer-function methodologies such as WAPLS for reconstructing the broad-scale climate changes for the Zoige Basin, by combining the strength of both methods. The LW-WAPLS approach potentially provides a robust tool to develop pollen-based climate reconstructions over long time-scales
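The idea of combining an analogue search with a transfer function, as the abstract above describes, can be sketched as follows. This is an illustration under stated assumptions, not the published LW-WAPLS method: squared chord distance is assumed as the dissimilarity measure, plain weighted averaging without deshrinking is swapped in for WA-PLS to keep the example short, and all names and values are hypothetical.

```python
import math

# Locally weighted reconstruction sketch: pick the k modern samples most similar
# to a fossil sample, then fit a simple weighted-averaging model on those only.

def to_proportions(row):
    total = sum(row)
    return [v / total for v in row]

def squared_chord(p, q):
    """Squared chord distance between two assemblages given as proportions."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def local_training_set(fossil, modern, k):
    """Indices of the k modern samples closest to the fossil sample."""
    f = to_proportions(fossil)
    dists = [(squared_chord(f, to_proportions(m)), i) for i, m in enumerate(modern)]
    return [i for _, i in sorted(dists)[:k]]

def local_wa_estimate(fossil, modern, env, k):
    """Weighted-averaging estimate using optima computed from the local subset."""
    idx = local_training_set(fossil, modern, k)
    num = den = 0.0
    for t in range(len(fossil)):
        weight = sum(modern[i][t] for i in idx)
        if weight == 0 or fossil[t] == 0:
            continue  # taxon absent locally or in the fossil sample
        optimum = sum(modern[i][t] * env[i] for i in idx) / weight
        num += fossil[t] * optimum
        den += fossil[t]
    return num / den  # assumes at least one fossil taxon occurs in the local subset

# Hypothetical modern pollen counts (5 sites x 3 taxa) and site temperatures (deg C).
modern = [[80, 15, 5], [60, 30, 10], [30, 50, 20], [10, 40, 50], [5, 15, 80]]
env = [2.0, 4.0, 6.0, 8.0, 10.0]
fossil = [12, 38, 50]
```

Because the estimate is built only from the selected neighbours, it cannot fall outside the range of their environmental values. That limits wild extrapolation but also makes the result sensitive to the choice of `k`.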
European pollen-based REVEALS land-cover reconstructions for the Holocene: methodology, mapping and potentials
Quantitative reconstructions of past land cover are necessary to determine the processes involved in climate–human–land-cover interactions. We present the first temporally continuous and most spatially extensive pollen-based land-cover reconstruction for Europe over the Holocene (last 11 700 cal yr BP). We describe how vegetation cover has been quantified from pollen records at a 1° × 1° spatial scale using the “Regional Estimates of VEgetation Abundance from Large Sites” (REVEALS) model. REVEALS calculates estimates of past regional vegetation cover in proportions or percentages. REVEALS has been applied to 1128 pollen records across Europe and part of the eastern Mediterranean–Black Sea–Caspian corridor (30–75° N, 25° W–50° E) to reconstruct the percentage cover of 31 plant taxa assigned to 12 plant functional types (PFTs) and 3 land-cover types (LCTs). A new synthesis of relative pollen productivities (RPPs) for European plant taxa was performed for this reconstruction. It includes multiple RPP values (≥2 values) for 39 taxa and single values for 15 taxa (total of 54 taxa). To illustrate this, we present distribution maps for five taxa (Calluna vulgaris, Cerealia type (t.), Picea abies, deciduous Quercus t. and evergreen Quercus t.) and three land-cover types (open land, OL; evergreen trees, ETs; and summer-green trees, STs) for eight selected time windows. The reliability of the REVEALS reconstructions and issues related to the interpretation of the results in terms of landscape openness and human-induced vegetation change are discussed. This is followed by a review of the current use of this reconstruction and its future potential utility and development. REVEALS data quality is primarily determined by pollen count data (pollen count and sample, pollen identification, and chronology) and site type and number (lake or bog, large or small, one site vs. multiple sites) used for REVEALS analysis (for each grid cell). 
A large number of sites with high-quality pollen count data will produce more reliable land-cover estimates with lower standard errors compared to a low number of sites with lower-quality pollen count data. The REVEALS data presented here can be downloaded from https://doi.org/10.1594/PANGAEA.937075 (Fyfe et al., 2022)
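To show why relative pollen productivities matter for the cover estimates described above, here is a simplified Python sketch of the core correction as it is commonly presented for REVEALS: each pollen count is divided by the taxon's RPP and a dispersal-deposition factor, then the results are normalised to percentages. This is an assumption-laden illustration, not the published implementation. The dispersal factors are taken as given numbers rather than computed from a dispersal model, no standard errors are estimated, and the taxa, counts and RPP values are hypothetical and are not taken from the synthesis.

```python
# Simplified REVEALS-style correction of pollen counts to percentage cover.
# All inputs are hypothetical; the dispersal factors stand in for a full
# dispersal-deposition calculation that is not reproduced here.

def reveals_like_cover(pollen_counts, rpp, dispersal):
    """Return percentage cover per taxon from counts, RPPs and dispersal factors."""
    weighted = {
        taxon: pollen_counts[taxon] / (rpp[taxon] * dispersal[taxon])
        for taxon in pollen_counts
    }
    total = sum(weighted.values())
    return {taxon: 100.0 * value / total for taxon, value in weighted.items()}

counts = {"Pinus": 400, "Poaceae": 100}     # hypothetical pollen counts
rpp = {"Pinus": 4.0, "Poaceae": 1.0}        # hypothetical RPP values
dispersal = {"Pinus": 1.0, "Poaceae": 1.0}  # hypothetical dispersal factors

cover = reveals_like_cover(counts, rpp, dispersal)
```

In this toy case a taxon that contributes 80% of the pollen ends up with half of the estimated cover, because its assumed pollen productivity is four times that of the other taxon.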
Topography-driven isolation, speciation and a global increase of endemism with elevation
Aim: Higher-elevation areas on islands and continental mountains tend to be separated by longer distances, predicting higher endemism at higher elevations; our study is the first to test the generality of the predicted pattern. We also compare it empirically with contrasting expectations from hypotheses invoking higher speciation with area, temperature and species richness.
Location: Thirty-two insular and 18 continental elevational gradients from around the world.
Methods: We compiled entire floras with elevation-specific occurrence information, and calculated the proportion of native species that are endemic (‘percent endemism’) in 100-m bands, for each of the 50 elevational gradients. Using generalized linear models, we tested the relationships between percent endemism and elevation, isolation, temperature, area and species richness.
Results: Percent endemism consistently increased monotonically with elevation, globally. This was independent of richness–elevation relationships, which had varying shapes but decreased with elevation at high elevations. The endemism–elevation relationships were consistent with isolation-related predictions, but inconsistent with hypotheses related to area, richness and temperature.
Main conclusions: Higher per-species speciation rates caused by increasing isolation with elevation are the most plausible and parsimonious explanation for the globally consistent pattern of higher endemism at higher elevations that we identify. We suggest that topography-driven isolation increases speciation rates in mountainous areas, across all elevations and increasingly towards the equator. If so, it represents a mechanism that may contribute to generating latitudinal diversity gradients in a way that is consistent with both present-day and palaeontological evidence
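The percent-endemism measure defined in the Methods above (the proportion of native species that are endemic, per 100-m band) can be sketched in Python as follows. The sketch assumes each species is represented by a lower and upper elevation limit and counted in every band its range overlaps. That representation, the function name and the toy flora are hypothetical and may differ from how the study compiled its occurrence data.

```python
# Percent endemism per elevational band.
# Each species is (lower_limit_m, upper_limit_m, is_endemic); values are hypothetical.

def percent_endemism_by_band(species, band_width=100):
    """Map each band's lower bound (m) to the percentage of its species that are endemic."""
    top = max(upper for _, upper, _ in species)
    result = {}
    lower = 0
    while lower < top:
        upper = lower + band_width
        # A species is present in a band if its elevational range overlaps the band.
        present = [endemic for lo, hi, endemic in species if lo < upper and hi > lower]
        if present:
            result[lower] = 100.0 * sum(present) / len(present)
        lower = upper
    return result

flora = [
    (0, 250, False),
    (100, 400, False),
    (300, 600, True),
    (500, 600, True),
    (0, 150, False),
]
bands = percent_endemism_by_band(flora)
```

The per-band percentages can then be used as the response in a regression against elevation, isolation, temperature, area and richness, which is the role generalized linear models play in the study.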
Creating spatially continuous maps of past land cover from point estimates: A new statistical approach applied to pollen data
Reliable estimates of past land cover are critical for assessing potential effects of anthropogenic land-cover changes on past earth surface-climate feedbacks and landscape complexity. Fossil pollen records from lakes and bogs have provided important information on past natural and human-induced vegetation cover. However, those records provide only point estimates of past land cover, and not the spatially continuous maps at regional and sub-continental scales needed for climate modelling. We propose a set of statistical models that create spatially continuous maps of past land cover by combining two data sets: 1) pollen-based point estimates of past land cover (from the REVEALS model) and 2) spatially continuous estimates of past land cover, obtained by combining simulated potential vegetation (from LPJ-GUESS) with an anthropogenic land-cover change scenario (KK10). The proposed models rely on statistical methodology for compositional data and use Gaussian Markov Random Fields to model spatial dependencies in the data. Land-cover reconstructions are presented for three time windows in Europe: 0.05, 0.2, and 6 ka before present (BP). The models are evaluated through cross-validation, deviance information criteria and by comparing the reconstruction of the 0.05 ka time window to the present-day land-cover data compiled by the European Forest Institute (EFI). For 0.05 ka, the proposed models provide reconstructions that are closer to the EFI data than either the REVEALS- or LPJ-GUESS/KK10-based estimates; thus the statistical combination of the two estimates improves the reconstruction. The reconstruction by the proposed models for 0.2 ka is also good. For 6 ka, however, the large differences between the REVEALS- and LPJ-GUESS/KK10-based estimates reduce the reliability of the proposed models. 
Possible reasons for the increased differences between REVEALS and LPJ-GUESS/KK10 for older time periods and further improvement of the proposed models are discussed
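Land-cover fractions sum to one, which is why the abstract above refers to methodology for compositional data. One common device in that field is a log-ratio transform, which maps proportions to unconstrained values that a spatial model can work with and then maps predictions back. The Python sketch below uses the additive log-ratio purely as an illustration. The abstract does not say which transform the authors used, and the three-part composition is hypothetical.

```python
import math

# Additive log-ratio (alr) transform and its inverse for a composition.
# Assumption: all parts are strictly positive; the last part is the reference.

def alr(proportions):
    """Map a composition of D parts to D-1 unconstrained log-ratios."""
    reference = proportions[-1]
    return [math.log(p / reference) for p in proportions[:-1]]

def alr_inverse(log_ratios):
    """Map D-1 log-ratios back to a composition of D parts that sums to one."""
    expanded = [math.exp(z) for z in log_ratios] + [1.0]
    total = sum(expanded)
    return [v / total for v in expanded]

# Hypothetical land-cover fractions for one grid cell: three cover types.
cell = [0.2, 0.5, 0.3]
transformed = alr(cell)
recovered = alr_inverse(transformed)
```

Any value predicted in the transformed space maps back to valid fractions, so interpolated maps never produce negative cover or totals other than 100%.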