    Darwin Core: An Evolving Community-Developed Biodiversity Data Standard

    Biodiversity data derive from myriad sources stored in various formats on many distinct hardware and software platforms. An essential step towards understanding global patterns of biodiversity is to provide a standardized view of these heterogeneous data sources to improve interoperability. Fundamental to this advance are definitions of common terms. This paper describes the evolution and development of Darwin Core, a data standard for publishing and integrating biodiversity information. We focus on the categories of terms that define the standard, differences between simple and relational Darwin Core, how the standard has been implemented, and the community processes that are essential for maintenance and growth of the standard. We present case-study extensions of the Darwin Core into new research communities, including metagenomics and genetic resources. We close by showing how Darwin Core records are integrated to create new knowledge products documenting species distributions and changes due to environmental perturbations.
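
    The distinction between simple and relational Darwin Core is easiest to see in a concrete record. Below is a minimal sketch of a Simple Darwin Core occurrence record expressed in Python and serialized as CSV; the term names (occurrenceID, basisOfRecord, scientificName, eventDate, decimalLatitude, decimalLongitude, countryCode) are genuine Darwin Core terms, while the identifier and values are invented for illustration.

```python
import csv
import io

# A Simple Darwin Core occurrence record: one flat row whose column
# names are Darwin Core terms (see http://rs.tdwg.org/dwc/terms/).
# The term names are real Darwin Core terms; the values are invented.
record = {
    "occurrenceID": "urn:example:occ:0001",  # hypothetical identifier
    "basisOfRecord": "HumanObservation",
    "scientificName": "Puma concolor",
    "eventDate": "2015-06-11",
    "decimalLatitude": "43.07",
    "decimalLongitude": "-89.40",
    "countryCode": "US",
}

# Simple Darwin Core is typically exchanged as a flat file with one
# Darwin Core term per column.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=record.keys())
writer.writeheader()
writer.writerow(record)
print(buf.getvalue())
```

    Simple Darwin Core flattens each occurrence to one self-contained row; the relational form instead links separate record types (such as events and occurrences) by identifier, trading simplicity for less duplication.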

    An ontology to standardize research output of nutritional epidemiology: from paper-based standards to linked content

    Background: The use of linked data in the Semantic Web is a promising approach to add value to nutrition research. An ontology, which defines the logical relationships between well-defined taxonomic terms, enables linking and harmonizing research output. To enable the description of domain-specific output in nutritional epidemiology, we propose the Ontology for Nutritional Epidemiology (ONE) according to authoritative guidance for nutritional epidemiology. Methods: Firstly, a scoping review was conducted to identify existing ontology terms for reuse in ONE. Secondly, existing data standards and reporting guidelines for nutritional epidemiology were converted into an ontology. The terms used in the standards were summarized and listed separately in a taxonomic hierarchy. Thirdly, the ontologies of the nutritional epidemiologic standards, reporting guidelines, and the core concepts were gathered in ONE. Three case studies were included to illustrate potential applications: (i) annotation of existing manuscripts and data, (ii) ontology-based inference, and (iii) estimation of reporting completeness in a sample of nine manuscripts. Results: Ontologies for food and nutrition (n = 37), disease and specific population (n = 100), data description (n = 21), research description (n = 35), and supplementary (meta) data description (n = 44) were reviewed and listed. ONE consists of 339 classes: 79 new classes to describe data and 24 new classes to describe the content of manuscripts. Conclusion: ONE is a resource to automate data integration, searching, and browsing, and can be used to assess reporting completeness in nutritional epidemiology.
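
    As a rough illustration of case study (i), manuscript annotation, the sketch below uses rdflib to attach ontology classes to a manuscript IRI. The namespace, class names (NutritionalEpidemiologyStudy, FoodFrequencyQuestionnaire), property names, and manuscript IRI are all placeholders, since the ontology's actual identifiers are not reproduced in the abstract.

```python
from rdflib import Graph, Literal, Namespace, RDF, URIRef

# Hypothetical namespace and class names standing in for the Ontology
# for Nutritional Epidemiology (ONE); the real IRIs are defined by the
# ontology itself and are not reproduced here.
ONE = Namespace("http://example.org/one#")

g = Graph()
g.bind("one", ONE)
paper = URIRef("http://example.org/manuscripts/42")  # invented manuscript IRI

# Annotate the manuscript with placeholder classes and properties
# describing its study design and dietary assessment method.
g.add((paper, RDF.type, ONE.NutritionalEpidemiologyStudy))
g.add((paper, ONE.usesAssessmentMethod, ONE.FoodFrequencyQuestionnaire))
g.add((paper, ONE.reportsOutcome, Literal("type 2 diabetes incidence")))

print(g.serialize(format="turtle"))
```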

    Multimedia Annotation Interoperability Framework

    Multimedia systems typically contain digital documents of mixed media types, which are indexed on the basis of strongly divergent metadata standards. This severely hampers the interoperation of such systems. Therefore, machine understanding of metadata coming from different applications is a basic requirement for the interoperation of distributed multimedia systems. In this document, we present how interoperability among metadata, vocabularies/ontologies, and services is enhanced using Semantic Web technologies. In addition, we provide guidelines for semantic interoperability, illustrated by use cases. Finally, we present an overview of the most commonly used metadata standards and tools, and outline the general research direction for semantic interoperability using Semantic Web technologies.
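
    A common Semantic Web tactic for the interoperability problem described above is to map each application's private fields onto a shared vocabulary. The sketch below does this with rdflib and Dublin Core Terms (a real, widely used metadata vocabulary); the application record, its field names, and the media IRI are invented for illustration.

```python
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DCTERMS

# Two applications may describe the same video with different field
# names; mapping both onto a shared vocabulary (here Dublin Core Terms,
# a real and widely used standard) is one route to interoperability.
app_record = {"vid_title": "Field survey 2019", "vid_creator": "J. Doe"}  # invented

# Illustrative mapping from one application's private field names
# to shared Dublin Core terms.
mapping = {"vid_title": DCTERMS.title, "vid_creator": DCTERMS.creator}

g = Graph()
g.bind("dcterms", DCTERMS)
video = URIRef("http://example.org/media/clip-17")  # hypothetical resource IRI
for field, value in app_record.items():
    g.add((video, mapping[field], Literal(value)))

print(g.serialize(format="turtle"))
```

    Once every source system is mapped onto the shared terms, queries and indexing can run against one vocabulary instead of one schema per application.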

    Data documentation & metadata

    2017 DWH Long-Term Data Management Coordination Workshop Report

    On June 7 and 8, 2017, the Coastal Response Research Center (CRRC)[1], NOAA Office of Response and Restoration (ORR), and NOAA National Marine Fisheries Service (NMFS) Restoration Center (RC) co-sponsored the Deepwater Horizon Oil Spill (DWH) Long Term Data Management (LTDM) workshop at the ORR Gulf of Mexico (GOM) Disaster Response Center (DRC) in Mobile, AL. In the wake of the DWH Natural Resource Damage Assessment (NRDA) settlement, attention has focused on restoration planning, implementation, and monitoring of ongoing DWH-related research. This means that data management, accessibility, and distribution must be coordinated among various federal, state, local, non-governmental organization (NGO), academic, and private sector partners. The scope of DWH far exceeded that of any other spill in the U.S., with an immense amount of data (e.g., 100,000 environmental samples, 15 million publicly available records) gathered during the response and damage assessment phases of the incident, as well as data that continues to be produced from research and restoration efforts. The challenge with this influx of data is checking its quality, documenting its collection, storing it, integrating it into useful products, managing it, and archiving it for long-term use. In addition, data must be available to the public in an easily queried and accessible format. Answering questions regarding the success of the restoration efforts will depend on data generated for years to come. The data sets must be readily comparable, representative, and complete; be collected using cross-cutting field protocols; be as interoperable as possible; meet standards for quality assurance/quality control (QA/QC); and be unhindered by conflicting or ambiguous terminology. During the data management process for the NOAA NRDA for the DWH disaster, NOAA developed a data management warehouse and visualization system that will be used as a long-term repository for accessing and archiving NRDA injury assessment data. This serves as a foundation for the restoration project planning and monitoring data for the next 15 or more years. The main impetus for this workshop was to facilitate public access to the DWH data collected and managed by all entities by developing linkages to, or data exchanges among, applicable GOM data management systems. There were 66 workshop participants (Appendix A) representing a variety of organizations who met at NOAA’s GOM DRC to determine the characteristics of a successful common operating picture for DWH data, to understand the systems currently in place to manage DWH data, and to make the DWH data interoperable among data generators, users, and managers. The external partners for these efforts include, but are not limited to, the RESTORE Council, Gulf of Mexico Research Initiative (GoMRI), Gulf of Mexico Research Initiative Information and Data Cooperative (GRIIDC), the National Academy of Sciences (NAS) Gulf Research Program, Gulf of Mexico Alliance (GOMA), and National Fish and Wildlife Foundation (NFWF).
The workshop objectives were to: foster collaboration among the GOM partners with respect to data management and integration for restoration planning, implementation, and monitoring; identify standards, protocols, and guidance for LTDM being used by these partners for DWH NRDA, restoration, and public health efforts; obtain feedback and identify next steps for the work completed by the Environmental Disasters Data Management (EDDM) Working Groups; and work towards best practices for public distribution of, and access to, these data. The workshop consisted of plenary presentations and breakout sessions. The workshop agenda (Appendix B) was developed by the organizing committee. The presentation topics included: results of a pre-workshop survey, an overview of data generation, the uses of DWH long-term data, an overview of LTDM, an overview of existing LTDM systems, an overview of data management standards/protocols, results from the EDDM working groups, flow diagrams of existing data management systems, and a vision for managing big data. The breakout sessions included discussions of: issues/concerns for data stakeholders (e.g., data users, generators, managers), interoperability, ease of discovery/searchability, data access, data synthesis, data usability, and metadata/data documentation. [1] A list of acronyms is provided on Page 1 of this report.

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    No abstract available

    A Two-Level Information Modelling Translation Methodology and Framework to Achieve Semantic Interoperability in Constrained GeoObservational Sensor Systems

    As geographical observational data capture, storage, and sharing technologies such as in situ remote monitoring systems and spatial data infrastructures (SDIs) evolve, the vision of a Digital Earth, first articulated by Al Gore in 1998, is getting ever closer. However, there are still many challenges and open research questions. For example, data quality, provenance, and heterogeneity remain an issue due to the complexity of geo-spatial data and information representation. Observational data are often inadequately semantically enriched by geo-observational information systems or spatial data infrastructures, and so they often do not fully capture the true meaning of the associated datasets. Furthermore, data models underpinning these information systems are typically too rigid in their data representation to allow for the ever-changing and evolving nature of geo-spatial domain concepts. This impoverished approach to observational data representation reduces the ability of multi-disciplinary practitioners to share information in an interoperable and computable way. The health domain experiences similar challenges with representing complex and evolving domain information concepts. Within any complex domain (such as Earth system science or health), two categories or levels of domain concepts exist: those concepts that remain stable over a long period of time, and those concepts that are prone to change as the domain knowledge evolves and new discoveries are made. Health informaticians have developed a sophisticated two-level modelling systems design approach for electronic health documentation over many years and, with the use of archetypes, have shown how data, information, and knowledge interoperability among heterogeneous systems can be achieved. This research investigates whether two-level modelling can be translated from the health domain to the geo-spatial domain and applied to observing scenarios to achieve semantic interoperability within and between spatial data infrastructures, beyond what is possible with current state-of-the-art approaches. A detailed review of state-of-the-art SDIs, geo-spatial standards, and the two-level modelling methodology was performed. A cross-domain translation methodology was developed, and a proof-of-concept geo-spatial two-level modelling framework was defined and implemented. The Open Geospatial Consortium’s (OGC) Observations & Measurements (O&M) standard was re-profiled to aid investigation of the two-level information modelling approach. An evaluation of the method was undertaken using two specific use-case scenarios. Information modelling was performed using the two-level modelling method to show how existing historical ocean observing datasets can be expressed semantically and harmonized using two-level modelling. Also, the flexibility of the approach was investigated by applying the method to an air quality monitoring scenario using a technologically constrained monitoring sensor system. This work has demonstrated that two-level modelling can be translated to the geo-spatial domain and then further developed to be used within a constrained technological sensor system, using traditional wireless sensor networks, Semantic Web technologies, and Internet of Things-based technologies. Domain-specific evaluation results show that two-level modelling presents a viable approach to achieving semantic interoperability between constrained geo-observational sensor systems and spatial data infrastructures for ocean observing and city-based air quality observing scenarios.
This has been demonstrated through the re-purposing of selected, existing geo-spatial data models and standards. However, it was found that re-using existing standards requires careful ontological analysis per domain concept, and so caution is recommended in assuming the wider applicability of the approach. While the benefits of adopting a two-level information modelling approach to geo-spatial information modelling are potentially great, it was found that translation to a new domain is complex. The complexity of the approach was found to be a barrier to adoption, especially in commercial projects where standards implementation is low on implementation roadmaps and the perceived benefits of standards adherence are low. Arising from this work, a novel set of base software components, methods, and fundamental geo-archetypes has been developed. However, during this work it was not possible to form the required rich community of supporters to fully validate the geo-archetypes. Therefore, the findings of this work are not exhaustive, and the archetype models produced are only indicative. The findings can be used as the basis to encourage further investigation and uptake of two-level modelling within the Earth system science and geo-spatial domains. Ultimately, this work recommends further development and evaluation of the approach, building on the positive results thus far and on the base software artefacts developed to support it.
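
    A rough sketch of the two-level idea, for readers unfamiliar with archetype-based modelling: a small, stable reference model (level one) is constrained by domain-specific archetypes (level two) that can evolve independently. The class and field names below are invented, loosely echoing the O&M observation pattern; they are not the geo-archetypes developed in this work.

```python
from dataclasses import dataclass
from datetime import datetime

# Level 1: a small, stable reference model, loosely following the
# OGC Observations & Measurements pattern (observed property, result,
# sampling time). These names are invented for illustration.
@dataclass
class Observation:
    observed_property: str
    result: float
    unit: str
    phenomenon_time: datetime

# Level 2: an "archetype" that constrains the reference model for one
# domain concept. Domain knowledge lives here and can evolve without
# changing the reference model above.
SEA_TEMPERATURE_ARCHETYPE = {
    "observed_property": "sea_water_temperature",
    "unit": "degC",
    "valid_range": (-2.0, 40.0),  # plausible ocean values, assumed
}

def validate(obs: Observation, archetype: dict) -> bool:
    """Check an observation instance against an archetype's constraints."""
    lo, hi = archetype["valid_range"]
    return (obs.observed_property == archetype["observed_property"]
            and obs.unit == archetype["unit"]
            and lo <= obs.result <= hi)

obs = Observation("sea_water_temperature", 14.2, "degC", datetime(2020, 7, 1))
print(validate(obs, SEA_TEMPERATURE_ARCHETYPE))  # True
```

    The design point is the separation: adding a new domain concept (say, an air quality metric) means authoring a new archetype, not changing the reference model or the software built against it.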