22 research outputs found

    Estimation of the Measurement Uncertainty of Ambient Air Pollution Datasets Using Geostatistical Analysis

    We developed a methodology that automatically estimates the measurement uncertainty of the air pollution data sets in AIRBase. The figures produced with this method were consistent with expectations from laboratory and field estimation of uncertainty and with the Data Quality Objectives of the European Directives. The proposed method, based on geostatistical analysis, cannot estimate the measurement uncertainty directly. It estimates the nugget effect together with a micro-scale variability that must be minimized by careful selection of the type of station. Based on the results obtained so far, it is likely that measurement uncertainty is best estimated using all background stations, whatever the area type. So far the methodology has been used to estimate uncertainty in 4 countries independently. This work should be continued for the whole of Europe, or for background stations irrespective of national borders. The method has also proved useful for comparing the spatial continuity of air pollution in different countries, which seems to be influenced by the topography of each country. Moreover, it may be used to quantify the trend of measurement uncertainty over long periods, such as a decade, and so to reveal improvements in the data quality of AIRBase datasets. The implemented outlier detection module offers an easy way to investigate wrongly classified stations in AIRBase, and would also be of interest as a warning system when Member States report their measurements to the European Environment Agency. JRC.DDG.H.4-Transport and air qualit
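A minimal sketch of how a nugget effect can be read off an empirical semivariogram is given here for orientation. It is not the authors' implementation: the function names, the binning, and the linear extrapolation of the first few lag bins to zero distance are all assumptions made for illustration. As the abstract notes, the resulting nugget also contains micro-scale variability, so its square root is at best an upper bound on the measurement uncertainty.

```python
import numpy as np

def empirical_semivariogram(xy, z, bin_edges):
    """Mean of 0.5 * (z_i - z_j)^2 over station pairs, grouped by distance bin."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(z.size, k=1)           # every unordered pair once
    dist = np.hypot(xy[i, 0] - xy[j, 0], xy[i, 1] - xy[j, 1])
    half_sq = 0.5 * (z[i] - z[j]) ** 2
    which = np.digitize(dist, bin_edges) - 1      # bin index per pair
    lags, gammas = [], []
    for b in range(len(bin_edges) - 1):
        sel = which == b
        if sel.any():                             # skip empty bins
            lags.append(dist[sel].mean())
            gammas.append(half_sq[sel].mean())
    return np.array(lags), np.array(gammas)

def nugget_estimate(lags, gammas, n_first=3):
    """Extrapolate a straight line through the first bins to lag 0 (assumed model)."""
    slope, intercept = np.polyfit(lags[:n_first], gammas[:n_first], 1)
    return max(float(intercept), 0.0)

# Synthetic example (hypothetical data): spatially uncorrelated noise with
# standard deviation 2, so the semivariogram is flat near the variance.
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 10.0, size=(1000, 2))
z = rng.normal(0.0, 2.0, size=1000)
lags, gammas = empirical_semivariogram(xy, z, np.arange(0.0, 6.0, 1.0))
nugget = nugget_estimate(lags, gammas)
```

In practice a variogram model (spherical, exponential, and so on) would usually be fitted instead of a straight line; the linear fit is used here only to keep the sketch short.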

    Screening tools for data quality and outlier detection applied to the Airbase ambient air pollution database

    To provide scientifically sound information for regulatory purposes and environmental impact assessment, long-term meso- to large-scale datasets of ambient air quality are an indispensable means of model calibration, evaluation and validation. However, the collection of high quality datasets with suitable spatial coverage for air pollution management and decision support poses many challenges. It is thus critical to establish expedient tools for the efficient assessment and data quality control of air pollution measurements in large scale national and international monitoring networks. The European Environment Agency collects, in the Air Quality Database named AirBase, measurements of ambient air pollution at more than 6000 monitoring stations from over 30 countries. The quality of these data depends on the measurement method chosen and the QA/QC procedures applied by each country. We present a methodology to automatically screen the AirBase records for internal consistency and to detect spatio-temporal outliers nested in the data. We implemented a spatial-set outlier detection method, which considers both attribute values and spatial relationships. Specifically, we adapted the “Smooth Spatial Attribute method” that was developed for the identification of outliers in traffic sensors. The method relies on the definition of a neighbourhood for each air pollutant measurement, corresponding to a spatio-temporal domain limited in time (+/- 1 day) and distance (+/- 1 degree) around location x. It is assumed that, within a given spatio-temporal domain in which the attribute values of neighbours are related through the emission, transport and reaction of air pollutants, outliers can be detected as values that are extreme compared to those of their neighbours. The implemented method can be of interest as a data quality screening system when countries report their measurements to the European Environment Agency. Beyond this, it could also provide a simple solution to investigate the accuracy of station classification in AirBase. JRC.H.2-Air and Climat
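The neighbourhood idea described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the published tool: the function name, the box-shaped neighbourhood in longitude and latitude, the unweighted neighbour mean, and the z-score threshold of 3 are all hypothetical choices.

```python
import numpy as np

def neighbourhood_outliers(lon, lat, day, value,
                           max_deg=1.0, max_days=1.0, z_threshold=3.0):
    """Flag values that are extreme relative to their spatio-temporal neighbours."""
    lon, lat, day, value = (np.asarray(a, dtype=float)
                            for a in (lon, lat, day, value))
    n = value.size
    diff = np.full(n, np.nan)
    for i in range(n):
        # Neighbours: within +/- max_deg in both coordinates and +/- max_days.
        near = ((np.abs(lon - lon[i]) <= max_deg)
                & (np.abs(lat - lat[i]) <= max_deg)
                & (np.abs(day - day[i]) <= max_days))
        near[i] = False                       # a point is not its own neighbour
        if near.any():
            diff[i] = value[i] - value[near].mean()
    # Standardise the differences over all points that have neighbours.
    valid = ~np.isnan(diff)
    z = np.full(n, np.nan)
    spread = diff[valid].std() if valid.any() else 0.0
    if spread > 0:
        z[valid] = (diff[valid] - diff[valid].mean()) / spread
    return np.abs(z) > z_threshold            # NaN compares as False

# Hypothetical example: 20 nearby stations on one day, one with a spike.
lon = np.linspace(10.0, 10.5, 20)
lat = np.linspace(45.0, 45.5, 20)
day = np.zeros(20)
value = np.full(20, 20.0)
value[7] = 200.0
flags = neighbourhood_outliers(lon, lat, day, value)
```

Points without any neighbour are never flagged in this sketch, since there is nothing to compare them with.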

    EU gazetteer evaluation

    This JRC technical report summarises the ELISE (European Location Interoperability Solutions for e-Government) activities in support of the development of an EU gazetteer. Most Member States have their own national gazetteer service so, if an EU gazetteer service is to be justified, there needs to be sufficient demand for pan-European applications or sufficient added value beyond existing national gazetteers. The ELISE Action of the ISA2 Programme carried out a survey in conjunction with EuroGeographics in 2018, aimed at understanding the demand-side and supply-side perspectives related to pan-European gazetteer data and services. The results clearly showed that there is demand for an EU gazetteer to support multi-national applications or complement existing national gazetteers, for purposes such as emergency response, searching for datasets, news items, or tourism / cultural heritage sites, and validating foreign addresses. This report further investigates two datasets at the pan-European level, Geographical names and Addresses, as the most relevant datasets for the EU gazetteer. The report also compares authoritative and volunteered spatial datasets. The analysis showed that the two data sources, official and volunteered, are complementary and that combining them yields mutually enhanced results. In addition, a "Cultural Heritage Testbed" application has been developed to identify data gaps, functionality gaps and improvements needed in different gazetteer solutions. The findings and possible applications were discussed with several existing use cases with cross-border and pan-European coverage. The overall findings of this report can be used to justify the relevance and importance of the Geographical names and Addresses datasets in the context of defining future high value datasets at an EU level. JRC.B.6-Digital Econom

    Digital Elevation Models: Terminology and Definitions

    Digital elevation models (DEMs) provide fundamental depictions of the three-dimensional shape of the Earth’s surface and are useful to a wide range of disciplines. Ideally, DEMs record the interface between the atmosphere and the lithosphere using a discrete two-dimensional grid, with complexities introduced by the intervening hydrosphere, cryosphere, biosphere, and anthroposphere. The treatment of DEM surfaces affected by these intervening spheres depends on their intended use and on the characteristics of the sensors that were used to create them. DEM is a general term, and more specific terms such as digital surface model (DSM) or digital terrain model (DTM) record the treatment of the intermediate surfaces. Several global DEMs generated with optical (visible and near-infrared) sensors and synthetic aperture radar (SAR), as well as single/multi-beam sonars and products of satellite altimetry, share the common characteristic of a georectified, gridded storage structure. Nevertheless, not all DEMs share the same vertical datum, not all use the same convention for the area on the ground represented by each pixel in the DEM, and some of them have variable data spacings depending on the latitude. This paper highlights the importance of knowing, understanding and reflecting on the sensor and DEM characteristics and consolidates terminology and definitions of key concepts to facilitate a common understanding among the growing community of DEM users, who do not necessarily share the same background.

    Allele Intersection Analysis: A Novel Tool for Multi Locus Sequence Assignment in Multiply Infected Hosts

    Wolbachia are widespread, endogenous α-Proteobacteria of arthropods and filarial nematodes. 15–75% of all insect species are infected with these endosymbionts, which alter their host's reproduction to facilitate their spread. In recent years, many insect species infected with multiple Wolbachia strains have been identified. As the endosymbionts are not cultivable outside living cells, strain typing relies on molecular methods. A Multi Locus Sequence Typing (MLST) system was established for standardizing Wolbachia strain identification. However, MLST requires hosts to harbour individual and not multiple strains of supergroups without recombination. This study revisits the applicability of the current MLST protocols and introduces Allele Intersection Analysis (AIA) as a novel approach. AIA utilizes natural variations in infection patterns and allows correct strain assignment of MLST alleles in multiply infected host species without the need for artificial strain segregation. AIA identifies pairs of multiply infected individuals that share Wolbachia and differ in only one strain. In such pairs, the shared MLST sequences can be used to assign alleles to distinct strains. Furthermore, AIA is a powerful tool to detect recombination events. The underlying principle of AIA may easily be adopted for MLST approaches in other uncultivable bacterial genera that occur as multiple strain infections, and the concept may find application in metagenomic high-throughput parallel sequencing projects.
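The pairwise comparison behind AIA can be illustrated with plain set operations. The sketch below is a simplified reading of the abstract, not the published procedure: the function names are hypothetical, the allele numbers are invented, only three loci are shown, and it assumes no recombination and no alleles shared between co-infecting strains.

```python
def split_alleles(host_a, host_b):
    """Per locus, separate alleles shared by both hosts from those only in host_a."""
    shared, only_a = {}, {}
    for locus, alleles in host_a.items():
        shared[locus] = alleles & host_b[locus]   # intersection: common strain(s)
        only_a[locus] = alleles - host_b[locus]   # difference: the extra strain
    return shared, only_a

def single_strain_profile(alleles_by_locus):
    """Return one allele per locus if the set resolves to a single strain, else None."""
    if all(len(alleles) == 1 for alleles in alleles_by_locus.values()):
        return {locus: next(iter(alleles))
                for locus, alleles in alleles_by_locus.items()}
    return None

# Hypothetical hosts: one carries two strains, the other only one of them.
double_host = {"gatB": {1, 9}, "coxA": {1, 22}, "hcpA": {3, 5}}
single_host = {"gatB": {1}, "coxA": {1}, "hcpA": {3}}

shared, extra = split_alleles(double_host, single_host)
shared_profile = single_strain_profile(shared)   # alleles of the common strain
extra_profile = single_strain_profile(extra)     # alleles of the differing strain
```

When the two hosts differ in more than one strain, the difference set holds several alleles per locus and `single_strain_profile` returns `None`, which is why pairs differing in exactly one strain are the informative ones.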

    Global geospatial data from Earth observation: status and issues

    Data covering the whole of the surface of the Earth in a homogeneous and reliable manner have been accumulating over many years. This type of data became available from meteorological satellites from the 1960s and from Earth-observing satellites at a small scale from the early 1970s, and has gradually accumulated at larger scales, so that today data covering many environmental themes are available at large scales. These data have been used to generate information which is presented in the form of global data sets. This paper gives a brief introduction to the development of Earth observation and to the organisations and sensors which collect data and produce global geospatial data sets. The means of accessing global data sets and the types of data available are then covered. Digital elevation models are discussed in a separate section because of their importance in georeferencing image data as well as their application to the analysis of thematic data. The paper also examines issues of availability, accuracy, validation and reliability, and looks at future challenges.

    A spatio-temporal screening tool for outlier detection in long term / large scale air quality observation time series and monitoring networks

    We present a consolidated screening tool for the detection of outliers in air quality monitoring data, which considers both attribute values and spatio-temporal relationships. Furthermore, an application example of warnings on abnormal values in time series of PM10 datasets in AirBase is presented. Spatial or temporal outliers in air quality datasets represent stations or individual measurements which differ significantly from other recordings within their spatio-temporal neighbourhood. Such abnormal values can be identified as being extreme compared to their neighbours, even though they need not differ significantly from the statistical distribution of the entire population. The identification of such outliers can be of interest as the basis of data quality control systems when several contributors report their measurements to a larger collected dataset. Beyond this, it can also provide a simple solution to investigate the accuracy of station classifications. Seen from another viewpoint, it can be used as a tool to detect irregular air pollution emission events (e.g. the influence of fires, wind erosion events, or other accidental situations). The presented procedure for outlier detection was designed on the basis of existing literature. Specifically, we adapted the “Smooth Spatial Attribute Method” that was first developed for the identification of outlier values in networks of traffic sensors [1]. Since a free and extensible simulation platform was considered important, all code was prototyped in the R environment, which is available under the GNU General Public License [2]. Our algorithms are based on the definition of a neighbourhood for each air quality measurement, corresponding to a spatio-temporal domain limited by time (e.g., +/- 2 days) and distance (e.g., +/- 1 spherical degree) around the location of ambient air monitoring stations. The premise of the method is that, within such a spatio-temporal domain, in which the attribute values of neighbours are related through the emission, transport and reaction of air pollutants, abnormal values can be detected as values that are extreme compared to those of their neighbours. This comparison requires a spatio-temporal smoothing, i.e. a specific rule by which data points are averaged within a neighbourhood. Calculating such a reference has the effect of a low-pass filter: high frequencies of the signal are removed from the data while low frequencies are preserved. In this context, the choice of an appropriate kernel smoother function (e.g., nearest neighbour smoother, weighted kernel average smoother, etc.) is of particular importance. Our presentation will emphasize the effects of the choice of weighting function, such as inverse squared normalized Euclidean distance or inverse squared Mahalanobis distance, and discuss the appropriateness and shortcomings of the different approaches. The corresponding parameter selections, related to the extent of the spatio-temporal domain and the final test statistics for outlier thresholding, are evaluated by sensitivity analysis. JRC.H.2-Air and Climat
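One of the weighting schemes named in this abstract, the inverse squared normalized Euclidean distance, can be sketched as a kernel average. The code is an illustration only: it is written in Python rather than the R environment mentioned by the authors, and the function name, the normalization by the domain extents, the planar treatment of degrees, and the small `eps` guard against zero distances are assumptions.

```python
import numpy as np

def smoothed_reference(lon, lat, day, value,
                       max_deg=1.0, max_days=2.0, eps=1e-12):
    """Weighted neighbour average with weights 1 / (normalized distance)^2."""
    lon, lat, day, value = (np.asarray(a, dtype=float)
                            for a in (lon, lat, day, value))
    ref = np.full(value.size, np.nan)
    for i in range(value.size):
        # Normalize space and time by the extent of the domain, so both
        # contribute on a common 0..1 scale (an assumed normalization).
        ds = np.hypot(lon - lon[i], lat - lat[i]) / max_deg
        dt = np.abs(day - day[i]) / max_days
        inside = (ds <= 1.0) & (dt <= 1.0)
        inside[i] = False                      # exclude the point itself
        if inside.any():
            w = 1.0 / (ds[inside] ** 2 + dt[inside] ** 2 + eps)
            ref[i] = np.sum(w * value[inside]) / np.sum(w)
    return ref

# Hypothetical example: the first station has neighbours at 0.5 and 1.0 degree
# (weights 4 and 1); the last station is isolated and gets no reference value.
ref = smoothed_reference(lon=[0.0, 0.5, 1.0, 50.0],
                         lat=[0.0, 0.0, 0.0, 0.0],
                         day=[0.0, 0.0, 0.0, 0.0],
                         value=[100.0, 10.0, 40.0, 5.0])
```

The residual `value - ref` is what an outlier test would then threshold; because nearer neighbours dominate the average, this weighting smooths less aggressively than an unweighted neighbourhood mean.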

    A Tool for the spatio-temporal screening of AirBase Datasets for abnormal values

    In the Air Quality Database named AirBase, measurements of ambient air pollution are collected at more than 6000 monitoring stations from over 30 countries. The quality of these data depends on the measurement method chosen and the QA/QC procedures applied by each country. We present a novel methodology to automatically screen the AirBase records for internal consistency and to detect spatio-temporal outliers nested in the data. We implemented a spatio-temporal toolset for screening abnormal values which considers both attribute values and spatial relationships. The method relies on the definition of a neighbourhood for each air measurement, corresponding to a spatio-temporal domain limited in time and distance. It is assumed that, within a given spatio-temporal domain in which the attribute values of neighbours are related through the emission, transport and reaction of air pollutants, abnormal values can be detected as values that are extreme compared to those of their neighbours. The implemented method can be of interest as the basis of a data quality screening system when countries report their measurements to the European Environment Agency. JRC.H.2-Air and Climat