11 research outputs found

    A simulated ‘sandbox’ for exploring the modifiable areal unit problem in aggregation and disaggregation

    We present a spatial testbed of simulated boundary data based on a set of very high-resolution census-based areal units surrounding Guadalajara, Mexico. From these input areal units, we simulated 10 levels of spatial resolution, ranging from 5,515 to 52,388 units, with 100 simulated zonal configurations for each level, totalling 1,000 simulated sets of areal units. These data facilitate interrogating various realizations of the data and the effects of spatial coarseness and zonal configuration, i.e. the Modifiable Areal Unit Problem (MAUP), on applications such as model training, model prediction, disaggregation, and aggregation processes. Further, these data can facilitate the production of spatially explicit, non-parametric estimates of confidence intervals via bootstrapping. We provide a pre-processed version of these 1,000 simulated sets of areal units, meta- and summary data to assist in their use, and a code notebook with the means to alter and/or reproduce these data.

    Assessing the influence of landscape conservation and protected areas on social wellbeing using random forest machine learning

    The urgency of interconnected social-ecological dilemmas such as rapid biodiversity loss, habitat loss and fragmentation, and the escalating climate crisis has led to increased calls for the protection of ecologically important areas of the planet. Protected areas (PAs) are considered critical to addressing these dilemmas, although growing divides in wellbeing can exacerbate conflict around PAs and undermine their effectiveness. We investigate the influence of proximity to PAs on wellbeing outcomes. We develop a novel multi-dimensional index of wellbeing for households across Africa and use Random Forest machine learning techniques to assess the importance score of households’ proximity to protected areas for their wellbeing outcomes, compared with the importance scores of an array of other social, environmental, and local and national governance factors. This study makes important contributions to the conservation literature, first by expanding the ways in which wellbeing is measured and operationalized, and second by providing additional empirical support for recent evidence that proximity to PAs is an influential factor affecting observed wellbeing outcomes, albeit likely through different pathways than the current literature suggests.

    Determining Global Population Distribution: Methods, Applications and Data

    Evaluating the total number of people at risk from infectious disease in the world requires not just tabular population data, but data that are spatially explicit and global in extent at a moderate resolution. This review describes the basic methods for constructing estimates of global population distribution, with attention to recent advances in improving both spatial and temporal resolution. To evaluate the optimal resolution for the study of disease, the native resolution of the data inputs as well as that of the resulting outputs are discussed. Assumptions used to produce different population data sets are also described, with their implications for the study of infectious disease. Lastly, the application of these population data sets in studies to assess disease distribution and health impacts is reviewed. The data described in this review are distributed in the accompanying DVD.

    Gridded population maps informed by different built settlement products

    The spatial distribution of humans on the earth is critical knowledge that informs many disciplines and is available in a spatially explicit manner through gridded population techniques. While many approaches exist to produce specialized gridded population maps, little has been done to explore how remotely sensed, built-area datasets might be used to dasymetrically constrain these estimates. This study assesses the effectiveness of three different high-resolution built-area datasets for producing gridded population estimates through the dasymetric disaggregation of census counts in Haiti, Malawi, Madagascar, Nepal, Rwanda, and Thailand. Modeling techniques include a binary dasymetric redistribution, a random forest with a dasymetric component, and a hybrid of the previous two. The relative merits of these approaches and the data are discussed with regard to studying human populations and related spatially explicit phenomena. Results showed that the accuracy of random forest and hybrid models was comparable in five of six countries.

    Evaluating nighttime lights and population distribution as proxies for mapping anthropogenic CO2 emission in Vietnam, Cambodia and Laos

    Tracking spatiotemporal changes in GHG emissions is key to successful implementation of the United Nations Framework Convention on Climate Change (UNFCCC). While emission inventories often provide a robust tool to track emission trends at the country level, subnational emission estimates are often not reported, or reports vary in robustness because the estimates depend on the spatial modeling approach and ancillary data used to disaggregate the emission inventories. Assessing the errors and uncertainties of the subnational emission estimates is fundamentally challenging due to the lack of physical measurements at the subnational level. To begin addressing the current performance of modeled gridded CO2 emissions, this study compares two common proxies used to disaggregate CO2 emission estimates. We use a known gridded CO2 model based on satellite-observed nighttime light (NTL) data (Open Source Data Inventory for Anthropogenic CO2, ODIAC) and a gridded population dataset driven by a set of ancillary geospatial data. We examine the association of these two datasets at multiple spatial scales for three countries in Southeast Asia (Vietnam, Cambodia and Laos) and characterize the spatiotemporal similarities and differences for 2000, 2005, and 2010. We specifically highlight areas of potential uncertainty in the ODIAC model, which relies solely on NTL data for disaggregation of the non-point emissions estimates. Results show how, over time, an NTL-based emissions disaggregation tends to concentrate CO2 estimates in different ways than population-based estimates at the subnational level. We discuss important considerations in the disconnect between the two modeled datasets and argue that the spatial differences between data products can be useful to identify areas affected by the errors and uncertainties associated with the NTL-based downscaling in a region with uneven urbanization rates.

    Towards an improved large-scale gridded population dataset: a Pan-European study on the integration of 3D settlement data into population modelling

    Large-scale gridded population datasets available at the global or continental scale have become an important source of information in applications related to sustainable development. In recent years, the emergence of new population models has leveraged the inclusion of more accurate and spatially detailed proxy layers describing the built-up environment (e.g., built-area and building footprint datasets), enhancing the quality, accuracy and spatial resolution of existing products. However, due to the consistent lack of vertical and functional information on the built-up environment, large-scale gridded population datasets that rely on existing built-up land proxies still report large errors of under- and overestimation, especially in areas with predominantly high-rise buildings or industrial/commercial areas, respectively. This research investigates, for the first time, the potential contributions of the new World Settlement Footprint 3D (WSF3D) dataset in the field of large-scale population modelling. First, we combined a Random Forest classifier with spatial metrics derived from the WSF3D to predict the industrial versus non-industrial use of settlement pixels at the Pan-European scale. We then examined the effects of including volume and settlement use information in frameworks of dasymetric population modelling. We found that the proposed classification method can predict industrial and non-industrial areas with an overall accuracy of ~84% and a kappa coefficient of 0.68. Additionally, we found that integrating both volume and settlement use information increased the accuracy of population estimates by between 10% and 30% over commonly employed models (e.g., based on a binary settlement mask as input), mainly by eliminating systematic large overestimations in industrial/commercial areas. While the proposed method shows strong promise for overcoming some of the main limitations in large-scale population modelling, future research should focus on improving the quality of the WSF3D dataset and the classification method alike, to avoid the false detection of built-up settlements and to reduce misclassification errors of industrial and high-rise buildings.

    Global spatio-temporally harmonised datasets for producing high-resolution gridded population distribution datasets

    Multi-temporal, globally consistent, high-resolution human population datasets provide consistent and comparable population distributions in support of mapping sub-national heterogeneities in health, wealth, and resource access, and monitoring change in these over time. The production of more reliable and spatially detailed population datasets is increasingly necessary due to the importance of improving metrics at sub-national and multi-temporal scales. This is in support of measurement and monitoring of UN Sustainable Development Goals and related agendas. In response to these agendas, a method has been developed to assemble and harmonise a unique, open-access archive of geospatial datasets. Datasets are provided as global, annual time series, where pertinent at the timescale of population analyses and where data are available, for use in the construction of population distribution layers. The archive includes sub-national census-based population estimates, matched to a geospatial layer denoting administrative unit boundaries, and a number of co-registered gridded geospatial factors that correlate strongly with population presence and density. Here, we describe these harmonised datasets and their limitations, along with the production workflow. Further, we demonstrate applications of the archive by producing multi-temporal gridded population outputs for Africa and using these to derive health and development metrics.