    Data mining and fusion

    A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD

    To make the results of computational scientific research findable, accessible, interoperable and re-usable, they must be decorated with standardised metadata. However, a number of technical and practical challenges make this difficult to achieve. Here, the implementation of a protocol is presented that tags crystal structures with their computed properties without the need for human intervention to curate the data. This protocol leverages the capabilities of AiiDA, an open-source platform to manage and automate scientific computational workflows, and TCOD, an open-access database storing computed materials properties using a well-defined and exhaustive ontology. Based on these, the complete procedure to deposit computed data in the TCOD database is automated. All relevant metadata are extracted from the full provenance information that AiiDA tracks and stores automatically while managing the calculations. Such a protocol also enables reproducibility of scientific data in the field of computational materials science. As a proof of concept, the AiiDA-TCOD interface is used to deposit 170 theoretical structures together with their computed properties and their full provenance graphs, consisting of over 4600 AiiDA nodes.
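
    The idea of extracting metadata from a stored provenance graph can be illustrated with a minimal sketch. It assumes a toy in-memory graph; the node names, attributes and helper functions are hypothetical and do not reflect the AiiDA or TCOD interfaces.

```python
from collections import deque

# Hypothetical toy provenance graph: node id -> (node type, attributes, input ids).
# Illustrative only; this is NOT the AiiDA data model or API.
PROVENANCE = {
    "structure_out": ("data", {"kind": "relaxed structure"}, ["calc_relax"]),
    "calc_relax": ("calculation", {"code": "hypothetical-dft-code"}, ["structure_in", "params"]),
    "structure_in": ("data", {"kind": "initial structure"}, []),
    "params": ("data", {"cutoff": 30}, []),
}


def collect_ancestors(graph, start):
    """Walk breadth-first from `start` back through all inputs; return ids in visit order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in graph[node][2]:
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


def extract_metadata(graph, start):
    """Collect the attributes of every calculation node in the ancestry of `start`."""
    return {
        node: graph[node][1]
        for node in collect_ancestors(graph, start)
        if graph[node][0] == "calculation"
    }


ancestors = collect_ancestors(PROVENANCE, "structure_out")
metadata = extract_metadata(PROVENANCE, "structure_out")
```

    Because the metadata are read from the graph rather than typed in by hand, the record for a deposited structure would follow directly from how it was computed.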

    Semantic Modeling of Analytic-based Relationships with Direct Qualification

    Successfully modeling the state and analytics-based semantic relationships of documents enhances the representation, importance, relevancy, provenience, and priority of the document. These attributes are the core elements that form the machine-based knowledge representation for documents. However, modeling document relationships that can change over time can be inelegant, limited, complex or overly burdensome for semantic technologies. In this paper, we present Direct Qualification (DQ), an approach for modeling any semantically referenced document, concept, or named graph with results from associated applied analytics. The proposed approach supplements the traditional subject-object relationship with a third leg: the qualification of how and why the relationship exists. To illustrate, we show a prototype of an event-based system with a realistic use case for applying DQ to the relevancy analytics of PageRank and Hyperlink-Induced Topic Search (HITS). Comment: Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015).
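
    The "third leg" described above can be sketched as a relationship record that carries the analytic result qualifying it. This is a minimal illustration under stated assumptions: the `QualifiedRelation` class, the `linksTo` predicate, the example link graph and the plain power-iteration PageRank are all hypothetical and are not taken from the paper's prototype.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualifiedRelation:
    """A subject-predicate-object relation plus the analytic that qualifies it."""
    subject: str
    predicate: str
    obj: str
    qualifier: str  # name of the analytic that produced the value
    value: float    # result of that analytic for the object


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict of node -> list of link targets."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                for t in nodes:
                    new[t] += damping * rank[v] / n
        rank = new
    return rank


# Hypothetical link graph between three documents.
LINKS = {"docA": ["docB"], "docB": ["docC"], "docC": ["docA", "docB"]}
ranks = pagerank(LINKS)

# Qualify each link with the PageRank score of the document it points to.
relations = [
    QualifiedRelation(source, "linksTo", target, "PageRank", ranks[target])
    for source, targets in LINKS.items()
    for target in targets
]
```

    Keeping the analytic name and value on the relation itself means a rerun of the analytic can replace the qualification without touching the underlying subject-object link.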

    Towards structured sharing of raw and derived neuroimaging data across existing resources

    Data sharing efforts increasingly contribute to the acceleration of scientific discovery. Neuroimaging data is accumulating in distributed domain-specific databases, but there is currently neither an integrated access mechanism nor an accepted format for the critically important meta-data needed to make use of the combined, available neuroimaging data. In this manuscript, we present work from the Derived Data Working Group, an open-access group sponsored by the Biomedical Informatics Research Network (BIRN) and the International Neuroimaging Coordinating Facility (INCF) focused on practical tools for distributed access to neuroimaging data. The working group develops models and tools facilitating the structured interchange of neuroimaging meta-data and is making progress towards a unified set of tools for such data and meta-data exchange. We report on the key components required for integrated access to raw and derived neuroimaging data, as well as associated meta-data and provenance, across neuroimaging resources. The components include (1) a structured terminology that provides semantic context to data, (2) a formal data model for neuroimaging with robust tracking of data provenance, (3) a web service-based application programming interface (API) that provides a consistent mechanism to access and query the data model, and (4) a provenance library that can be used for the extraction of provenance data by image analysts and imaging software developers. We believe that the framework and set of tools outlined in this manuscript have great potential for solving many of the issues the neuroimaging community faces when sharing raw and derived neuroimaging data across the various existing database systems for the purpose of accelerating scientific discovery.

    Utilising Provenance to Enhance Social Computation

    Postprint

    Architecture for Provenance Systems

    This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles set out in this document may be applied to different technologies.