    A systematic florula of a disturbed urban habitat: pavements of Sheffield, England

    Human settlements are of increasing interest to ecologists, a fact demonstrated by the recent cluster of book-length treatments of the topic (Forman 2008, McDonnell et al. 2009, Gaston 2010, Niemelä et al. 2011, Wilson 2011, Forman 2014). Interest in the natural world as a fascinating feature of towns and cities has a much longer history (e.g. Fitter 1945), and has also played a strong part in local biological conservation in some countries during the late 20th century (Goode 2014). Despite much existing information on urban plant and animal communities resulting from these trends, very little easily accessible, systematic data on urban biodiversity is currently available. Few systematic, randomised surveys at fine spatial grain exist for urban habitats, and even fewer of these surveys are in the public domain. This study was designed as a systematic florula (i.e. a small flora) of a relatively discrete urban habitat in order to provide a baseline that would enable robust insights into future environmental change. In addition, the dataset is likely to be useful for comparative studies of plant traits, particularly those of highly disturbed habitats (Williams et al. 2009). The survey is an occupancy study of the vascular plants of pavements (i.e. sidewalks) within sixteen 500 × 500 m (0.25 km²) urban grid cells, stratified by quadrant at the scale of the focal city (Sheffield, England) in order to provide more even coverage. The final dataset comprises 862 records of 183 taxa.
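The quadrant-stratified selection of grid cells described in this abstract can be illustrated with a minimal sketch. It assumes a square study extent in metres and an equal allocation of four cells per quadrant; the abstract states only that sixteen cells were stratified by quadrant, so the allocation, the extent, and the function name `select_cells_by_quadrant` are hypothetical.

```python
import random


def select_cells_by_quadrant(x_min, y_min, x_max, y_max,
                             cell_size=500, cells_per_quadrant=4, seed=1):
    """Randomly pick grid cells, the same number from each quadrant of the extent."""
    rng = random.Random(seed)
    x_mid = (x_min + x_max) / 2
    y_mid = (y_min + y_max) / 2
    quadrants = {"SW": [], "SE": [], "NW": [], "NE": []}
    # Enumerate every cell by its lower-left corner and assign it to a quadrant
    # according to where its centre falls.
    for x in range(x_min, x_max, cell_size):
        for y in range(y_min, y_max, cell_size):
            centre_x = x + cell_size / 2
            centre_y = y + cell_size / 2
            key = ("N" if centre_y >= y_mid else "S") + ("E" if centre_x >= x_mid else "W")
            quadrants[key].append((x, y))
    # Simple random sample without replacement within each quadrant (assumed allocation).
    return {name: rng.sample(cells, cells_per_quadrant)
            for name, cells in quadrants.items()}


# Hypothetical 10 km x 10 km extent; coordinates are illustrative, not Sheffield's.
selected = select_cells_by_quadrant(0, 0, 10_000, 10_000)
```

Stratifying in this way guarantees that every quadrant of the city contributes cells, which a single simple random draw of sixteen cells would not.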

    Reassessing the observational evidence for nitrogen deposition impacts in acid grassland: spatial Bayesian linear models indicate small and ambiguous effects on species richness

    Nitrogen deposition (Ndep) is considered a significant threat to plant diversity in grassland ecosystems around the world. The evidence supporting this conclusion comes from both observational and experimental research, with “space-for-time” substitution surveys of pollutant gradients a significant portion of the former. However, estimates of regression coefficients for Ndep impacts on species richness, derived with a focus on causal inference, are hard to locate in the observational literature. Some influential observational studies have presented estimates from univariate models, overlooking the effects of omitted variable bias, and/or have used P-value-based stepwise variable selection (PSVS) to infer impacts, a strategy known to be poorly suited to the accurate estimation of regression coefficients. Broad-scale spatial autocorrelation has also generally been unaccounted for. We re-examine two UK observational datasets that have previously been used to investigate the relationship between Ndep and plant species richness in acid grasslands, a much-researched habitat in this context. One of these studies (Stevens et al., 2004, Science, 303: 1876–1879) estimated a large negative impact of Ndep on richness through the use of PSVS; the other reported smaller impacts (Maskell et al., 2010, Global Change Biology, 16: 671–679), but did not explicitly report regression coefficients or partial effects, making the actual size of the estimated Ndep impact difficult to assess. We reanalyse both datasets using a spatial Bayesian linear model estimated using integrated nested Laplace approximation (INLA). Contrary to previous results, we found similar-sized estimates of the Ndep impact on plant richness between studies, both with and without bryophytes, albeit with some disagreement over the most likely direction of this effect. 
Our analyses suggest that some previous estimates of Ndep impacts on richness from space-for-time substitution studies are likely to have been over-estimated, and that the evidence from observational studies could be fragile when confronted with alternative model specifications, although further work is required to investigate potentially nonlinear responses. Given the growing literature on the use of observational data to estimate the impacts of pollutants on biodiversity, we suggest that a greater focus on clearly reporting important outcomes with associated uncertainty, the use of techniques to account for spatial autocorrelation, and a clearer focus on the aims of a study, whether explanatory or predictive, are all required.

    Assessing the exposure of UK habitats to 20th- and 21st-century climate change, and its representation in ecological monitoring schemes

    1. Climate change is a significant driver of contemporary biodiversity change. Ecological monitoring schemes can be crucial in highlighting its consequences, but connecting and interpreting observed climatic and ecological changes demands an understanding of monitored locations' exposure to climate change. Generalising from trends in monitored sites to habitats also requires an assessment of how closely sampled locations' climate change trajectories mirror those of wider ecosystems. Such assessments are rare but vital for drawing robust ecological conclusions.  2. Focusing on the UK, we generated a metric of climate change exposure by quantifying the change in observed historical (1901–2019) and predicted future (2021–2080, pessimistic emissions scenario) conditions. We then assessed habitat-specific climate change exposure by overlaying the resulting data with maps of contemporary (2019) land cover. Finally, we compared patterns of climate change exposure in locations sampled by ecological monitoring schemes to random samples from wider habitats.  3. The UK's climate changed significantly between the early 20th century and the last decade, and is predicted to undergo even greater changes (including the development of Iberian/Mediterranean climate types in places) into the 21st century. Climate change exposure is unevenly distributed: regionally, it falls more in southern, central and eastern England; locally, it is greater at higher-elevation locations than nearby areas at lower elevations.  4. Areas with contemporary arable and horticulture, urban, calcareous grassland and suburban land cover are predicted to experience the greatest overall climatic change, though other habitats experienced relatively greater change than these in the first half of the 20th century.  5. The extent to which locations sampled by ecological monitoring schemes represent broader habitat-level gradients of climate change exposure varies. 
Monitored sites' coverage of wider trends is heterogeneous across habitats, time periods and schemes.  6. Policy implications. UK ecological monitoring schemes can effectively, though variably, capture the effects of climate change on habitats. To improve their performance, climate change could be explicitly included in the design of such programmes. Additionally, our findings on how effectively different datasets represent wider patterns of climate change are crucial for informing syntheses of ecological change connected to shifting atmospheric conditions

    Improving species distribution models for invasive non‐native species with biologically informed pseudo‐absence selection

    Aim: We present a novel strategy for species distribution models (SDMs) aimed at predicting the potential distributions of range‐expanding invasive non‐native species (INNS). The strategy combines two established perspectives on defining the background region for sampling “pseudo‐absences” that have hitherto only been applied separately. These are the accessible area, which accounts for dispersal constraints, and the area outside the environmental range of the species and therefore assumed to be unsuitable for the species. We tested an approach to combine these by fitting SDMs using background samples (pseudo‐absences) from both types of background. Location: Global. Taxon: Invasive non‐native plants: Humulus scandens, Lygodium japonicum, Lespedeza cuneata, Triadica sebifera, Cinnamomum camphora. Methods: Presence‐background (or presence‐only) SDMs were developed for the potential global distributions of five plant species native to Asia, invasive elsewhere and prioritised for risk assessment as emerging INNS in Europe. We compared models where the pseudo‐absences were selected from the accessible background, the unsuitable background (defined using biological knowledge of the species’ key limiting factors) or from both types of background. Results: Combining the unsuitable and accessible backgrounds expanded the range of environments available for model fitting and caused biological knowledge about ecological unsuitability to influence the fitted species‐environment relationships. This improved the realism and accuracy of distribution projections globally and, generally, within the species’ ranges. Main conclusions: Correlative SDMs remain valuable for INNS risk mapping and management, but are often criticised for a lack of biological underpinning. 
Our approach partly addresses this concern by using prior knowledge of species’ requirements or tolerances to define the unsuitable background for modelling, while also accommodating dispersal constraints through considerations of accessibility. It can be implemented with current SDM software and results in more accurate and realistic distribution projections. As such, wider adoption has potential to improve SDMs that support INNS risk assessment
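The idea of drawing pseudo-absences from both the accessible and the unsuitable background can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the cell attributes (`dist_km`, `coldest_month_c`), the thresholds, and the function name `sample_pseudo_absences` are all hypothetical, and a real analysis would define unsuitability from species-specific knowledge of limiting factors.

```python
import random


def sample_pseudo_absences(cells, n_accessible, n_unsuitable,
                           max_dist_km, min_temp_c, seed=0):
    """Draw pseudo-absences from an accessible and an unsuitable background."""
    rng = random.Random(seed)
    candidates = [c for c in cells if not c["presence"]]
    # Accessible background: cells within an assumed dispersal distance of occurrences.
    accessible = [c for c in candidates if c["dist_km"] <= max_dist_km]
    # Unsuitable background: cells beyond an assumed physiological limit
    # (here, a coldest-month temperature threshold).
    unsuitable = [c for c in candidates if c["coldest_month_c"] < min_temp_c]
    picked = (rng.sample(accessible, min(n_accessible, len(accessible)))
              + rng.sample(unsuitable, min(n_unsuitable, len(unsuitable))))
    # A cell can belong to both backgrounds, so keep each cell only once.
    return list({c["id"]: c for c in picked}.values())


# Synthetic cells for illustration only.
cells = [{"id": i, "presence": i < 5, "dist_km": i * 10.0,
          "coldest_month_c": 10.0 - i * 0.5} for i in range(100)]
pseudo_absences = sample_pseudo_absences(cells, n_accessible=10, n_unsuitable=10,
                                         max_dist_km=300, min_temp_c=-20)
```

The combined sample would then be passed, together with the presences, to whichever presence-background SDM software is in use.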

    AI naturalists might hold the key to unlocking biodiversity data in social media imagery

    The increasing availability of digital images, coupled with sophisticated artificial intelligence (AI) techniques for image classification, presents an exciting opportunity for biodiversity researchers to create new datasets of species observations. We investigated whether an AI plant species classifier could extract previously unexploited biodiversity data from social media photos (Flickr). We found over 60,000 geolocated images tagged with the keyword “flower” across an urban and rural location in the UK and classified these using AI, reviewing these identifications and assessing the representativeness of images. Images were predominantly biodiversity focused, showing single species. Non-native garden plants dominated, particularly in the urban setting. The AI classifier performed best when photos were focused on single native species in wild situations but also performed well at higher taxonomic levels (genus and family), even when images substantially deviated from this. We present a checklist of questions that should be considered when undertaking a similar analysis

    Towards a unified approach to formal risk of bias assessments for causal and descriptive inference

    Statistics is sometimes described as the science of reasoning under uncertainty. Statistical models provide one view of this uncertainty, but what is frequently neglected is the invisible portion of uncertainty: that assumed not to exist once a model has been fitted to some data. Systematic errors, i.e. bias, in data relative to some model and inferential goal can seriously undermine research conclusions, and qualitative and quantitative techniques have been created across several disciplines to quantify and generally appraise such potential biases. Perhaps best known are so-called risk of bias assessment instruments used to investigate the likely quality of randomised controlled trials in medical research. However, the logic of assessing the risks caused by various types of systematic error to statistical arguments applies far more widely. This logic applies even when statistical adjustment strategies for potential biases are used, as these frequently make assumptions (e.g. data missing at random) that can never be guaranteed in finite samples. Mounting concern about such situations can be seen in the increasing calls for greater consideration of biases caused by nonprobability sampling in descriptive inference (i.e. survey sampling), and the statistical generalisability of in-sample causal effect estimates in causal inference; both of which relate to the consideration of model-based and wider uncertainty when presenting research conclusions from models. Given that model-based adjustments are never perfect, we argue that qualitative risk of bias reporting frameworks for both descriptive and causal inferential arguments should be further developed and made mandatory by journals and funders. It is only through clear statements of the limits to statistical arguments that consumers of research can fully judge their value for any specific application.

    Descriptive inference using large, unrepresentative nonprobability samples: an introduction for ecologists

    Biodiversity monitoring usually involves drawing inferences about some variable of interest across a defined landscape from observations made at a sample of locations within that landscape. If the variable of interest differs between sampled and non-sampled locations, and no mitigating action is taken, then the sample is unrepresentative and inferences drawn from it will be biased. It is possible to adjust unrepresentative samples so that they more closely resemble the wider landscape in terms of “auxiliary variables”. A good auxiliary variable is a common cause of sample inclusion and the variable of interest, and if it explains an appreciable portion of the variance in both, then inferences drawn from the adjusted sample will be closer to the truth. We applied six types of survey sample adjustment—subsampling, quasi-randomisation, poststratification, superpopulation modelling, a “doubly robust” procedure, and multilevel regression and poststratification—to a simple two-part biodiversity monitoring problem. The first part was to estimate mean occupancy of the plant Calluna vulgaris in Great Britain in two time-periods (1987-1999 and 2010-2019); the second was to estimate the difference between the two (i.e. the trend). We estimated the means and trend using large, but (originally) unrepresentative, samples from a citizen science dataset. Compared to the unadjusted estimates, the means and trends estimated using most adjustment methods were more accurate, although standard uncertainty intervals generally did not cover the true values. Completely unbiased inference is not possible from an unrepresentative sample without knowing and having data on all relevant auxiliary variables. Adjustments can reduce the bias if auxiliary variables are available and selected carefully, but the potential for residual bias should be acknowledged and reported
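Of the adjustment methods listed in this abstract, poststratification is the simplest to illustrate. The sketch below reweights stratum-level sample means by each stratum's known share of the landscape. The strata ("upland", "lowland"), the shares, and the occupancy values are invented for illustration and are not taken from the study.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Weight each stratum's sample mean by its share of the target population."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in sample:
        totals[stratum] += value
        counts[stratum] += 1
    missing = set(population_shares) - set(counts)
    if missing:
        # Poststratification cannot estimate strata that have no sampled sites.
        raise ValueError(f"no sampled sites in strata: {sorted(missing)}")
    return sum(share * totals[s] / counts[s]
               for s, share in population_shares.items())


# Hypothetical sample: recorders over-visit upland sites, where occupancy is higher.
sample = ([("upland", 1)] * 60 + [("upland", 0)] * 20
          + [("lowland", 1)] * 2 + [("lowland", 0)] * 18)
naive = sum(value for _, value in sample) / len(sample)
adjusted = poststratified_mean(sample, {"upland": 0.3, "lowland": 0.7})
```

In this invented example the unadjusted mean is pulled towards the over-sampled upland stratum, while the poststratified estimate reflects the assumed landscape composition. The adjustment helps only to the extent that the stratifying variable drives both sample inclusion and occupancy.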

    We need to talk about nonprobability samples

    In most circumstances, probability sampling is the only way to ensure unbiased inference about population quantities where a complete census is not possible. As we enter the era of ‘big data’, however, nonprobability samples, whose sampling mechanisms are unknown, are undergoing a renaissance. We explain why the use of nonprobability samples can lead to spurious conclusions, and why seemingly large nonprobability samples can be (effectively) very small. We also review some recent controversies surrounding the use of nonprobability samples in biodiversity monitoring. These points notwithstanding, we argue that nonprobability samples can be useful, provided that their limitations are assessed, mitigated where possible and clearly communicated. Ecologists can learn much from other disciplines on each of these fronts

    occAssess: an R package for assessing potential biases in species occurrence data

    Species occurrence records from a variety of sources are increasingly aggregated into heterogeneous databases and made available to ecologists for immediate analytical use. However, these data are typically biased, i.e. they are not a probability sample of the target population of interest, meaning that the information they provide may not be an accurate reflection of reality. It is therefore crucial that species occurrence data are properly scrutinised before they are used for research. In this article, we introduce occAssess, an R package that enables straightforward screening of species occurrence data for potential biases. The package contains a number of discrete functions, each of which returns a measure of the potential for bias in one or more of the taxonomic, temporal, spatial, and environmental dimensions. Users can opt to provide a set of time periods into which the data will be split; in this case separate outputs will be provided for each period, making the package particularly useful for assessing the suitability of a dataset for estimating temporal trends in species' distributions. The outputs are provided visually (as ggplot2 objects) and do not include a formal recommendation as to whether data are of sufficient quality for any given inferential use. Instead, they should be used as ancillary information and viewed in the context of the question that is being asked, and the methods that are being used to answer it. We demonstrate the utility of occAssess by applying it to data on two key pollinator taxa in South America: leaf-nosed bats (Phyllostomidae) and hoverflies (Syrphidae). In this worked example, we briefly assess the degree to which various aspects of data coverage appear to have changed over time. We then discuss additional applications of the package, highlight its limitations, and point to future development opportunities

    Integrating expert knowledge at regional and national scales improves impact assessments of non-native species

    Knowledge of the impacts of invasive species is important for their management, prioritisation of control efforts and policy decisions. We investigated how British and Irish botanical experts assessed impacts at smaller scales in areas where they were familiar with the flora. Experts were asked to select the 10 plants that they considered were having the largest impacts in their areas. They also scored the local impacts of 10 plant species that had been previously scored to have the highest impacts at the scale of Great Britain. Impacts were scored using the modified classification scheme of the EICAT framework (Environmental Impact Classification for Alien Taxa). A total of 782 species/score combinations were received, of which 123 were non-native plants in 86 recording areas. Impatiens glandulifera, Reynoutria japonica and Rhododendron ponticum were the three species considered to have the highest impacts across all regions. Four of the species included in the list of the 10 highest impact species in Great Britain were also in the top 10 of species reported in our study. Species in the higher impact categories had, on average, a wider distribution than species with impacts categorised at lower levels. The main habitat types affected were woodlands, followed by linear/boundary features and freshwater habitats. Thirty-nine native plant species were reported to be negatively affected. In comparison to the overall non-native flora of Britain and Ireland, the lifeform spectrum of the species reported was significantly different, with higher percentages of aquatic plants and trees, but a lower proportion of annuals. The study demonstrates the value of local knowledge and expertise in identifying invasive species with negative impacts on the environment. Local knowledge is useful to both confirm national assessments and to identify species and impacts on native species and habitats that may not have gained national attention