
    Geospatial Workflows and Trust: a Use Case for Provenance

    At first glance, The Astronomer by Vermeer, Tutankhamun’s burial mask, and a geospatial workflow may appear to have nothing in common. However, a commonality exists: each of these items can have a record of provenance detailing its history. Provenance is a record that shows who did what to an object, where this happened, and how and why these actions took place. In the geospatial domain, provenance can be used to track and analyze the changes data has undergone in a workflow, and can facilitate scientific reproducibility. Collecting provenance from geospatial workflows and finding effective ways to use it is an important application. When using geospatial data in a workflow, it is important to determine whether the data and the workflow are trustworthy. This study examines whether provenance can be collected from a geospatial workflow. Each workflow examined is a use case for a specific type of geospatial problem. The collected provenance is then used to determine workflow trust and content trust for each of the workflows examined. The results show that provenance can be collected from a geospatial workflow in such a way as to be of use to additional applications, such as provenance interchange. From this collected provenance, content trust and workflow trust can be estimated. The simple workflow had a content trust value of 0.83 (trustworthy) and a workflow trust value of 0.44 (untrustworthy). Two additional workflows were also examined for content trust and workflow trust. The methods used to calculate content trust and workflow trust could also be extended to other types of geospatial data and workflows. Future research could include complete automation of the provenance collection and trust calculations, as well as examining additional techniques for deciding trust in relation to workflows.

    Curated Reasoning by Formal Modeling of Provenance

    The core problem addressed in this research is the current inability to repurpose and curate scientific data among interdisciplinary scientists within a research enterprise environment. Explosive growth in sensor technology, as well as the cost of collecting ocean data and airborne measurements, has brought exponential increases in scientific data collection, along with substantial enterprise resources required for that collection. There is currently no framework for efficiently curating this scientific data for repurposing or intergenerational use. Several reasons explain why this problem has eluded solution to date: the competitive requirements for funding and publication; the multiple vocabularies used among scientific disciplines; the number of scientific disciplines and the variation among their workflow processes; the lack of a flexible framework that allows for diversity among vocabularies and data while providing a unifying approach to exploitation; and a lack of affordable computing resources (now largely a thing of the past). Addressing this lack of data sharing among interdisciplinary scientists is exceptionally challenging. It requires combining various vocabularies, maintaining the associated scientific data provenance, minimizing any additional workload placed on the originating data scientist's project and time, protecting publication and credit to reward scientific creativity, and obtaining priority for a long-term goal such as scientific data curation for intergenerational, interdisciplinary scientific problem solving, which likely offers the most potential for the highest-impact discoveries in the future. This research focuses on the core technical problem of formally modeling interdisciplinary scientific data provenance as the enabling and missing component needed to demonstrate the potential of interdisciplinary scientific data repurposing.
This research develops a framework to combine varying vocabularies in a formal manner that allows the provenance information to be used as a key for reasoning, making curation manageable. As a result, this research has pioneered an approach of formally modeling provenance within an interdisciplinary research enterprise, demonstrating that intergenerational curation can be aided at the machine level so that reasoning and repurposing can occur with minimal impact on data collectors and maximum impact for other scientists.