79 research outputs found

    Developing an open data portal for the ESA climate change initiative

    We introduce the rationale for, and architecture of, the European Space Agency Climate Change Initiative (CCI) Open Data Portal (http://cci.esa.int/data/). The Open Data Portal hosts a set of richly diverse datasets – 13 “Essential Climate Variables” – from the CCI programme in a consistent and harmonised form, and provides a single point of access to these data (>100 TB) for broad dissemination to an international user community. These data have been produced by a range of different institutions and vary in both their scientific and spatio-temporal characteristics. This heterogeneity of the data, together with the range of services to be supported, presented significant technical challenges. An iterative development methodology was key to tackling these challenges: the system developed exploits a workflow which takes data conforming to the CCI data specification, ingests it into a managed archive and uses both manual and automatically generated metadata to support data discovery, browse and delivery services. It utilises both Earth System Grid Federation (ESGF) data nodes and the Open Geospatial Consortium Catalogue Service for the Web (OGC-CSW) interface, serving data into both the ESGF and the Global Earth Observation System of Systems (GEOSS). A key part of the system is a new vocabulary server, populated with CCI-specific terms and relationships, which integrates the OGC-CSW and ESGF search services and was developed as part of a dialogue between domain scientists and linked data specialists. These services have enabled the development of a unified user interface for graphical search and visualisation – the CCI Open Data Portal Web Presence.
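    The discovery services mentioned above are exposed through the OGC Catalogue Service for the Web (OGC-CSW) interface. As a rough sketch of how a client could query such a catalogue, the snippet below uses the OWSLib Python library; the endpoint URL and the free-text search term are placeholders, not the portal's documented values.

```python
# Minimal sketch: free-text discovery against an OGC CSW catalogue with OWSLib.
# The endpoint URL below is a placeholder, not the portal's documented address.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

CSW_ENDPOINT = "https://csw.example.org/catalogue/csw"  # hypothetical endpoint

csw = CatalogueServiceWeb(CSW_ENDPOINT)

# Search the catalogue's full-text index for records mentioning an Essential Climate Variable.
query = PropertyIsLike("csw:AnyText", "%sea surface temperature%")
csw.getrecords2(constraints=[query], esn="summary", maxrecords=10)

for identifier, record in csw.records.items():
    print(identifier, "-", record.title)
```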

    Mapping heterogeneous research infrastructure metadata into a unified catalogue for use in a generic virtual research environment

    Virtual Research Environments (VREs), also known as science gateways or virtual laboratories, assist researchers in data science by integrating tools for data discovery, data retrieval, workflow management and researcher collaboration, often coupled with a specific computing infrastructure. Recently, the push for better open data science has led to the creation of a variety of dedicated research infrastructures (RIs) that gather data and provide services to different research communities, all of which can be used independently of any specific VRE. There is therefore a need for generic VREs that can be coupled with the resources of many different RIs simultaneously and easily customised to the needs of specific communities. However, the resource metadata produced by these RIs rarely adhere to any one standard or vocabulary, making it difficult to search and discover resources independently of their providers without some translation into a common framework. Cross-RI search can be expedited by using mapping services that harvest RI-published metadata to build unified resource catalogues, but the development and operation of such services pose a number of challenges. In this paper, we discuss some of these challenges and look specifically at the VRE4EIC Metadata Portal, which uses X3ML mappings to build a single catalogue describing data products and other resources provided by multiple RIs. The Metadata Portal was built in accordance with the e-VRE Reference Architecture, a microservice-based architecture for generic modular VREs, and uses the CERIF standard to structure its catalogued metadata. We consider the extent to which it addresses the challenges of cross-RI search, particularly in the environmental and earth science domain, and how it can be further augmented, for example by taking advantage of linked vocabularies to provide more intelligent semantic search across multiple domains of discourse.
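    To illustrate the general idea of mapping heterogeneous metadata into a unified catalogue, the sketch below shows field-level harmonisation into a common record structure in Python. It is purely illustrative: the field names, target schema and mapping tables are invented and do not reproduce the actual X3ML mappings or the CERIF structures used by the Metadata Portal.

```python
# Illustrative sketch only: mapping provider-specific metadata fields onto a
# shared record structure, in the spirit of a unified cross-RI catalogue.
# Field names and the target schema are invented for illustration.
from dataclasses import dataclass

@dataclass
class UnifiedRecord:
    title: str
    abstract: str
    provider: str

# Per-provider mapping tables: source field name -> unified field name.
FIELD_MAPPINGS = {
    "ri_a": {"dataset_title": "title", "summary": "abstract"},
    "ri_b": {"name": "title", "description": "abstract"},
}

def harmonise(provider: str, source: dict) -> UnifiedRecord:
    """Translate one provider-specific metadata record into the unified schema."""
    mapping = FIELD_MAPPINGS[provider]
    unified = {target: source[field] for field, target in mapping.items() if field in source}
    return UnifiedRecord(provider=provider, **unified)

record = harmonise("ri_b", {"name": "Ocean colour L3 product", "description": "Monthly chlorophyll-a"})
print(record)
```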

    Developing Feature Types and Related Catalogues for the Marine Community - Lessons from the MOTIIVE project.

    MOTIIVE (Marine Overlays on Topography for annex II Valuation and Exploitation) is a project funded as a Specific Support Action (SSA) under the European Commission Framework Programme 6 (FP6) Aeronautics and Space Programme. The project started in September 2005 and finished in October 2007. The objective of MOTIIVE was to examine the methodology and cost-benefit of using non-proprietary data standards. Specifically, it considered the harmonisation requirements between the INSPIRE data component ‘elevation’ (terrestrial, bathymetric and coastal) and the INSPIRE marine thematic data for ‘sea regions’, ‘oceanic spatial features’ and ‘coastal zone management areas’. This was examined in the context of the requirements for interoperable information systems needed to realise the objectives of GMES for ‘global services’. The work draws particular conclusions on the realisation of Feature Types (ISO 19109) and Feature Type Catalogues (ISO 19110) in this respect. More information on MOTIIVE can be found at www.motiive.net.

    DEIMS-SDR – A web portal to document research sites and their associated data

    Climate change and other drivers are affecting ecosystems around the globe. In order to enable a better understanding of ecosystem functioning and to develop mitigation and adaptation strategies in response to environmental change, a broad range of information, including in-situ observations of both biotic and abiotic parameters, needs to be considered. Access to sufficient and well-documented in-situ data from long-term observations is therefore one of the key requirements for modelling and assessing the effects of global change on ecosystems. Usually, such data are generated by multiple providers, are often not openly available, and are poorly documented. In this regard, metadata plays an important role in aiding the findability, accessibility and reusability of data, as well as enabling reproducibility of the results that lead to management decisions. This metadata needs to include information on the observation location and the research context. For this purpose we developed the Dynamic Ecological Information Management System – Site and Dataset Registry (DEIMS-SDR), a research and monitoring site registry (https://www.deims.org/) that not only makes it possible to describe in-situ observation or experimentation sites and to generate persistent, unique and resolvable identifiers for each site, but also to document the data associated with each site. This article describes the system architecture and illustrates the linkage of contextual information to observational data. The aim of DEIMS-SDR is to be a globally comprehensive site catalogue describing a wide range of sites, providing a wealth of information, including each site's location, ecosystems, facilities, measured parameters and research themes, and making that standardised information openly available.
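    Site records in DEIMS-SDR are identified by persistent, resolvable identifiers. The sketch below shows how a client might retrieve one site description in machine-readable form; the API path and the UUID are assumptions made for illustration, so the documented interface at https://www.deims.org/ should be consulted for the actual endpoints.

```python
# Minimal sketch: fetching one site description from DEIMS-SDR by its identifier.
# The API path and the example UUID are assumptions for illustration only.
import requests

SITE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder site UUID
url = f"https://deims.org/api/sites/{SITE_ID}"     # assumed REST endpoint

response = requests.get(url, headers={"Accept": "application/json"}, timeout=30)
response.raise_for_status()
site = response.json()

# Inspect the record: print its title and the metadata blocks it contains.
print(site.get("title"))
print(sorted(site.keys()))
```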

    STFC Centre for Environmental Data Archival (CEDA) Annual Report 2013 (April 2012-March 2013)

    The mission of the Centre for Environmental Data Archival (CEDA) is to deliver long-term curation of scientifically important environmental data while facilitating the use of those data by the environmental science community. CEDA was established by the amalgamation of the activities of two of the Natural Environment Research Council (NERC) designated data centres: the British Atmospheric Data Centre and the NERC Earth Observation Data Centre. We are pleased to present here our fourth annual report, covering activities for the 2013 reporting year (April 2012 to March 2013). The report consists of two sections and appendices: the first broadly provides a summary of activities and statistics, with short descriptions of significant activities, and the second introduces exemplar projects and activities. The report concludes with additional details of activities such as publications and software maintained.

    Search improvement within the geospatial web in the context of spatial data infrastructures

    The work developed in this doctoral thesis demonstrates that it is possible to improve search in the context of Spatial Data Infrastructures by applying techniques and good practices from other scientific communities, especially the Web and Semantic Web communities (for example, Linked Data). The use of semantic descriptions and of approaches based on the content published by the geospatial community can aid the search for information about geographic phenomena, and for geospatial resources in general. The work begins with an analysis of an approach to improving the search for geospatial entities from the perspective of traditional geocoding. The composite geocoding architecture proposed in this work ensures improved geocoding results thanks to the use of different geographic information providers. The use of structural design patterns and ontologies in this approach enables an architecture that is advanced in terms of extensibility, flexibility and adaptability. In addition, an architecture based on geocoding service selection allows the development of a methodology for georeferencing different types of geographic information (for example, addresses or points of interest). Two representative applications that require additional semantic characterisation of geospatial resources are then presented. The approach proposed in this work uses content-based heuristics to sample a set of geospatial resources. The first part is devoted to the idea of abstracting a geographic phenomenon from its spatial definition. The research shows that Semantic Web good practices can be reused within a Spatial Data Infrastructure to describe the geospatial services standardised by the Open Geospatial Consortium by means of geo-identifiers (that is, by means of the entities of a geographic ontology). The second part of this chapter details the architecture and components of a geoprocessing service for the automatic identification of orthoimagery offered through a standard map publication service (that is, services that follow the OGC Web Map Service specification). As a result of this work, a method has been proposed for identifying which maps offered by a Web Map Service are orthoimages. The work then turns to issues related to the creation of metadata for Web resources in the context of the geographic domain. This work proposes an architecture for the automatic generation of geographic knowledge from Web resources. It was necessary to develop a method for estimating the geographic coverage of Web pages. The proposed heuristics are based on the content published by geographic information providers. The prototype developed is capable of generating metadata. The generated model contains the minimum recommended set of elements required by a catalogue that follows the OGC Catalogue Service for the Web specification, the standard recommended by different Spatial Data Infrastructures (for example, the Infrastructure for Spatial Information in the European Community (INSPIRE)). In addition, this study determines some characteristics of the current Geospatial Web.
First, it characterises the market of providers of geographic information Web resources. This study reveals some practices of the geospatial community in the production of metadata for Web pages, in particular the lack of geographic metadata. All of the above forms the basis for studying how to support non-expert users in searching the Geospatial Web. The Geospatial Web search engine proposed in this work is able to build on an existing search engine, and it supports exploratory search over the geospatial resources discovered on the Web. A precision and recall experiment showed that the prototype developed in this work is at least as good as the remote search engine. A usability study indicates that even non-experts can perform a search task with satisfactory results.
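    The composite geocoding architecture summarised above places several geographic information providers behind a common interface with a selection strategy. The following Python sketch renders that design idea only; the provider classes, data and fallback policy are invented for illustration and do not reproduce the thesis prototype.

```python
# Minimal sketch of a composite geocoder: several providers behind one
# interface, with a simple fallback strategy. Providers and data are invented.
from abc import ABC, abstractmethod
from typing import Optional, Tuple

class Geocoder(ABC):
    @abstractmethod
    def geocode(self, query: str) -> Optional[Tuple[float, float]]:
        """Return (lat, lon) for a place name or address, or None if not found."""

class GazetteerGeocoder(Geocoder):
    """Toy provider backed by an in-memory gazetteer."""
    def __init__(self, gazetteer: dict):
        self.gazetteer = gazetteer

    def geocode(self, query: str) -> Optional[Tuple[float, float]]:
        return self.gazetteer.get(query.lower())

class CompositeGeocoder(Geocoder):
    """Try each configured provider in turn and return the first hit."""
    def __init__(self, providers):
        self.providers = providers

    def geocode(self, query: str) -> Optional[Tuple[float, float]]:
        for provider in self.providers:
            result = provider.geocode(query)
            if result is not None:
                return result
        return None

local = GazetteerGeocoder({"zaragoza": (41.65, -0.88)})
fallback = GazetteerGeocoder({"delft": (52.01, 4.36)})
composite = CompositeGeocoder([local, fallback])
print(composite.geocode("Zaragoza"))
```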

    Couplers for linking environmental models: scoping study and potential next steps

    This report scopes out the couplers available in the hydrology and atmospheric modelling fields. The work reported here examines both dynamic runtime coupling and one-way, file-based coupling. A review of the peer-reviewed literature and other open sources shows that there is a plethora of coupling technologies and standards relating to file formats. The available approaches have been evaluated against criteria developed as part of the DREAM project. Based on these investigations, the following recommendations are made:
    • The most promising dynamic coupling technologies for use within BGS are OpenMI 2.0 and CSDMS (either 1.0 or 2.0).
    • Investigate the use of workflow engines: Trident and Pyxis, the latter as part of the TSB/AHRC project “Confluence”.
    • Include the CSW and GDAL standards and use data formats from the climate community (NetCDF and the CF conventions).
    • Develop a “standard” composition consisting of two process models and a 3D geological model, all linked to data stored in the BGS corporate database and in flat-file format; Web Feature Services should be included in these compositions.
    There is also a need to investigate approaches from other disciplines; the Loss Modelling Framework (OASIS-LMF) is the best candidate.
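    As a minimal illustration of the one-way, file-based coupling route discussed above, the sketch below has one hypothetical model write a NetCDF file with CF-style metadata and a second model read it back as a boundary input, using the xarray library. Variable names, attributes and values are illustrative only.

```python
# Minimal sketch of one-way, file-based coupling: model A writes a NetCDF
# file with CF-style metadata, model B reads it as its boundary input.
# Variable names, units and values are illustrative only.
import numpy as np
import xarray as xr

# --- Model A: write recharge as a CF-annotated NetCDF file -----------------
recharge = xr.Dataset(
    {"recharge": (("y", "x"), np.random.rand(4, 5))},
    coords={"x": np.arange(5), "y": np.arange(4)},
)
recharge["recharge"].attrs.update({"long_name": "groundwater recharge", "units": "mm day-1"})
recharge.attrs["Conventions"] = "CF-1.8"
recharge.to_netcdf("recharge.nc")

# --- Model B: read the file and use it as a boundary condition -------------
boundary = xr.open_dataset("recharge.nc")
print(boundary["recharge"].attrs["units"], boundary["recharge"].shape)
```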

    INSPIRE – What if? Summary report from the What if…? sessions at the 2017 INSPIRE Conference

    Following up on the successful "What if we didn't have INSPIRE?" workshop at the 2016 INSPIRE Conference in Barcelona and the "INSPIRE - What if...?" workshop at the OGC meeting in Delft in March 2017, two "INSPIRE - What if…?" sessions took place at the INSPIRE Conference in Strasbourg on 8 September 2017. This report explains the background to these sessions and provides a summary of the discussions in the break-out groups. Even though the six group discussions focused on different topics, the conclusions converged around the following recommendations:
    - Make INSPIRE easier to use for mainstream ICT professionals and developers
    - Focus on data content and on creating (preferably open) national or pan-European data sets, which are quality-assured and of high value to a broad user community
    - Make INSPIRE more user-centric and user-driven
    - Improve communication and promote INSPIRE's success stories
    - Clarify the roles of the public and private sectors, especially with respect to data offering(s), data integration and value-adding services

    Integrating data and analysis technologies within leading environmental research infrastructures: Challenges and approaches

    When researchers analyze data, it typically requires significant effort in data preparation to make the data analysis-ready. This often involves cleaning, pre-processing, harmonizing, or integrating data from one or multiple sources and placing them into a computational environment in a form suitable for analysis. Research infrastructures and their data repositories host data and make them available to researchers, but rarely offer a computational environment for data analysis. Published data are often persistently identified, but such identifiers resolve to landing pages that must be (manually) navigated to work out how the data can be accessed. This navigation is typically challenging or impossible for machines. This paper surveys existing approaches for improving environmental data access to facilitate more rapid data analyses in computational environments, and thus contribute to a more seamless integration of data and analysis. By analysing current state-of-the-art approaches and solutions being implemented by world-leading environmental research infrastructures, we highlight the existing practices for interfacing data repositories with computational environments and the challenges moving forward. We found that while the level of standardization has improved in recent years, it is still challenging for machines to discover and access data based on persistent identifiers. This is problematic with regard to the emerging requirements for FAIR (Findable, Accessible, Interoperable, and Reusable) data in general, and for the seamless integration of data and analysis in particular. There are a number of promising approaches that would improve the state of the art. A key approach presented here involves software libraries that streamline reading data and metadata into computational environments. We describe this approach in detail for two research infrastructures. We argue that the development and maintenance of specialized libraries for each RI, and for the range of programming languages used in data analysis, does not scale well. Based on this observation, we propose a set of established standards and web practices that, if implemented by environmental research infrastructures, would enable the development of RI- and programming-language-independent software libraries with much reduced effort for library implementation and maintenance, as well as considerably lower learning requirements for users. To catalyse such advancement, we propose a roadmap and key action points for technology harmonization among RIs that we argue will build the foundation for efficient and effective integration of data and analysis. This work was supported by the European Union's Horizon 2020 research and innovation programme under grant agreements No. 824068 (ENVRI-FAIR project) and No. 831558 (FAIRsFAIR project). NEON is a project sponsored by the National Science Foundation (NSF) and managed under cooperative support agreement (EF-1029808) to Battelle.
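    One of the established web practices alluded to above is HTTP content negotiation on persistent identifiers, which lets software retrieve machine-readable metadata instead of a human-oriented landing page. The sketch below shows the general pattern against the DOI resolver; the DOI is a placeholder and the metadata fields printed depend on what the registrar returns.

```python
# Minimal sketch: resolve a dataset DOI to machine-readable metadata via
# HTTP content negotiation, rather than scraping the human landing page.
# The DOI below is a placeholder; substitute a real dataset identifier.
import requests

DOI = "10.0000/placeholder"  # placeholder identifier

response = requests.get(
    f"https://doi.org/{DOI}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},
    timeout=30,
    allow_redirects=True,
)
response.raise_for_status()
metadata = response.json()
print(metadata.get("title"), metadata.get("publisher"))
```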