
    A unified framework for managing provenance information in translational research

    Background: A critical aspect of the NIH Translational Research roadmap, which seeks to accelerate the delivery of "bench-side" discoveries to the patient's "bedside," is the management of the provenance metadata that keeps track of the origin and history of data resources as they traverse the path from the bench to the bedside and back. A comprehensive provenance framework is essential for researchers to verify the quality of data, reproduce scientific results published in peer-reviewed literature, validate scientific process, and associate trust value with data and results. Traditional approaches to provenance management have focused on only partial sections of the translational research life cycle, and they do not incorporate "domain semantics", which is essential to support domain-specific querying and analysis by scientists.
    Results: We identify a common set of challenges in managing provenance information across the pre-publication and post-publication phases of data in the translational research lifecycle. We define the semantic provenance framework (SPF), underpinned by the Provenir upper-level provenance ontology, to address these challenges in the four stages of provenance metadata:
    (a) Provenance collection - during data generation
    (b) Provenance representation - to support interoperability, reasoning, and incorporate domain semantics
    (c) Provenance storage and propagation - to allow efficient storage and seamless propagation of provenance as the data is transferred across applications
    (d) Provenance query - to support queries with increasing complexity over large data size and also support knowledge discovery applications
    We apply the SPF to two exemplar translational research projects, namely the Semantic Problem Solving Environment for Trypanosoma cruzi (T. cruzi SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness.
    Conclusions: The SPF provides a unified framework to effectively manage provenance of translational research data during the pre- and post-publication phases. This framework is underpinned by an upper-level provenance ontology called Provenir that is extended to create domain-specific provenance ontologies to facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis.

    Provenance-aware knowledge representation: A survey of data models and contextualized knowledge graphs

    Expressing machine-interpretable statements in the form of subject-predicate-object triples is a well-established practice for capturing semantics of structured data. However, the standard used for representing these triples, RDF, inherently lacks the mechanism to attach provenance data, which would be crucial to make automatically generated and/or processed data authoritative. This paper is a critical review of data models, annotation frameworks, knowledge organization systems, serialization syntaxes, and algebras that enable provenance-aware RDF statements. The various approaches are assessed in terms of standard compliance, formal semantics, tuple type, vocabulary term usage, blank nodes, provenance granularity, and scalability. This can be used to advance existing solutions and help implementers to select the most suitable approach (or a combination of approaches) for their applications. Moreover, the analysis of the mechanisms and their limitations highlighted in this paper can serve as the basis for novel approaches in RDF-powered applications with increasing provenance needs

    Co-citation analysis of literature in e-science and e-infrastructures

    This is the author accepted manuscript. The final version is available from Wiley via the DOI in this record. Advances in computer networking, storage technologies and high-performance computing are helping global communities of researchers to address increasingly ambitious problems in Science collaboratively. e-Science is the “science of this age”; it is realized through collaborative scientific enquiry which requires the utilization of non-trivial amounts of computing resources and massive data sets. Core to this is the integrated set of technologies collectively known as e-Infrastructures. In this paper, we explore the e-Science and e-Infrastructure knowledge base through co-citation analysis of existing literature. The dataset for this analysis is downloaded from the ISI Web of Science and includes over 12,000 articles. We identify prominent articles, authors and articles with citation bursts. The detection of research clusters and the underlying seminal papers provide further insights. Our analysis is an important source of reference for academics, researchers and students starting research in this field.

    Enabling automatic provenance-based trust assessment of web content


    Extending semantic provenance into the web of data

    In this article, the authors provide an example workflow, and a simple classification of user questions on the workflow's data products, to combine and interchange contextual metadata through a semantic data model and infrastructure. They also analyze their approach's potential to support enhanced semantic provenance applications.
