1,953 research outputs found

    Abstracting PROV provenance graphs: A validity-preserving approach

    Data provenance is a structured form of metadata designed to record the activities and datasets involved in data production, as well as their dependency relationships. The PROV data model, released by the W3C in 2013, defines a schema and constraints that together provide a structural and semantic foundation for provenance. This enables the interoperable exchange of provenance between data producers and consumers. When the provenance content is sensitive and subject to disclosure restrictions, however, a principled way of hiding parts of the provenance before communicating it to certain parties is required. In this paper we present a provenance abstraction operator that achieves this goal. It maps a graphical representation of a PROV document PG1 to a new abstract version PG2, ensuring that (i) PG2 is a valid PROV graph, and (ii) the dependencies that appear in PG2 are justified by those that appear in PG1. These two properties ensure that further abstraction of abstract PROV graphs is possible. A guiding principle of the work is that of minimum damage: the resulting graph is altered as little as possible while ensuring that the two properties are maintained. The operator is implemented as part of a user tool, described in a separate paper, that lets owners of sensitive provenance information control the abstraction by specifying an abstraction policy.
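    The operator can be pictured as node grouping over a dependency graph. Below is a minimal sketch of that idea in Python using networkx; the toy graph, node names, and the abstract_group helper are illustrative assumptions, not the paper's actual operator, which additionally enforces the full set of PROV validity constraints.

```python
# A minimal sketch of grouping-style abstraction over a provenance graph.
# The graph, node names, and abstract_group() are hypothetical; the paper's
# operator additionally enforces PROV validity constraints on the result.
import networkx as nx

def abstract_group(pg, group, new_node):
    """Collapse `group` (a set of sensitive nodes) into `new_node`.

    Every edge kept in the result is justified by an edge in the input
    graph, mirroring property (ii) described in the abstract.
    """
    def mapped(n):
        return new_node if n in group else n

    ag = nx.DiGraph()
    for u, v in pg.edges():
        mu, mv = mapped(u), mapped(v)
        if mu != mv:                 # drop edges internal to the group
            ag.add_edge(mu, mv)
    return ag

# Toy PROV-style dependency graph: e2 was generated by activity a1, which used e1.
pg1 = nx.DiGraph([("e2", "a1"), ("a1", "e1")])
pg2 = abstract_group(pg1, {"a1"}, "abstract1")
print(list(pg2.edges()))             # [('e2', 'abstract1'), ('abstract1', 'e1')]
```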

    Reactive Rules for Emergency Management

    The goal of the following survey on Event-Condition-Action (ECA) Rules is to come to a common understanding and intuition on this topic within EMILI. It therefore does not give an academic overview of Event-Condition-Action Rules, which would be valuable only to computer scientists. Instead, the survey introduces Event-Condition-Action Rules and their use for emergency management through real-life examples drawn from the use cases identified in Deliverable 3.1. In this way we hope to address both computer scientists and security experts by showing how Event-Condition-Action Rule technology can help to solve security issues in emergency management. The survey incorporates information from other work packages, particularly from Deliverable D3.1 and its Annexes, D4.1, D2.1, and D6.2, wherever possible.
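    As a concrete illustration, an ECA rule pairs an event pattern with a condition check and an action. The following Python sketch is hypothetical: the event type, threshold, and action are invented stand-ins for the kinds of emergency-management rules the survey discusses, not examples taken from Deliverable 3.1.

```python
# A minimal Event-Condition-Action rule sketch. The event kind, threshold,
# and action are invented for illustration, not drawn from the EMILI use cases.
from dataclasses import dataclass

@dataclass
class Event:
    kind: str
    value: float

def on_smoke_reading(event):               # Event: a sensor reading arrives
    return event.kind == "smoke_density"

def is_dangerous(event):                   # Condition: reading exceeds a limit
    return event.value > 0.7

def trigger_evacuation(event):             # Action: notify operators
    print(f"ALARM: smoke density {event.value:.2f}, start evacuation plan")

RULES = [(on_smoke_reading, is_dangerous, trigger_evacuation)]

def dispatch(event):
    for matches, condition, action in RULES:
        if matches(event) and condition(event):
            action(event)

dispatch(Event("smoke_density", 0.85))     # fires the evacuation action
```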

    Reproducibility and Replicability in Unmanned Aircraft Systems and Geographic Information Science

    Multiple scientific disciplines face a so-called crisis of reproducibility and replicability (R&R), in which the validity of methodologies is questioned due to an inability to confirm experimental results. Trust in information technology (IT)-intensive workflows within geographic information science (GIScience), remote sensing, and photogrammetry depends on solutions to R&R challenges affecting multiple computationally driven disciplines. To date, there have been only very limited efforts to overcome R&R-related issues in remote sensing workflows in general, let alone those tied to disruptive technologies such as unmanned aircraft systems (UAS) and machine learning (ML). To accelerate an understanding of this crisis, a review was conducted to identify the issues preventing R&R in GIScience. Key barriers included: (1) awareness of time and resource requirements, (2) accessibility of provenance, metadata, and version control, (3) conceptualization of geographic problems, and (4) geographic variability between study areas. As a case study, a replication of a GIScience workflow using the YOLOv3 algorithm to identify objects in UAS imagery was attempted. Despite access to the source data and workflow steps, the inaccessibility of provenance and metadata for each small step of the work prevented successful replication. Finally, a novel method for provenance generation was proposed to address these issues. It was found that artificial intelligence (AI) could be used to quickly create robust provenance records for workflows that do not exceed time and resource constraints and that provide the information needed to replicate work. Such information can bolster trust in scientific results and provide access to cutting-edge technology that can improve everyday life.
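    To make the proposal concrete, the sketch below shows the kind of per-step provenance record whose absence blocked replication: it pins the exact input data, the parameters, and the runtime. The field names and hashing choice are assumptions for illustration, not the format proposed in the paper.

```python
# A sketch of a per-step provenance record for an ML workflow step.
# Field names and the hashing choice are assumptions, not the paper's format.
import hashlib, json, platform, time

def provenance_record(step_name, input_bytes, params):
    return {
        "step": step_name,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),  # pins exact input data
        "parameters": params,                                     # e.g. detector thresholds
        "python_version": platform.python_version(),              # runtime environment
    }

record = provenance_record(
    "yolov3_inference",
    b"<raw image tile bytes>",             # stand-in for a UAS image tile
    {"confidence_threshold": 0.5, "nms_iou": 0.45},
)
print(json.dumps(record, indent=2))
```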

    EMBRACE (EMBedding Repositories And Consortial Enhancement) project: final report

    EMBRACE (EMBedding Repositories And Consortial Enhancement) was an 18-month project led by UCL on behalf of the SHERPA-LEAP (London Eprints Access Project) Consortium, a group of 13 University of London institutions with institutional repositories. The project had two strands, technical and strategic. In its technical strand, EMBRACE aimed to implement a number of technical improvements to enhance the functionality of the SHERPA-LEAP repositories. In a concurrent strategic strand, EMBRACE set out to investigate the challenges of embedding repositories of digital assets in institutional strategy to ensure repository sustainability. The technical work of the project resulted in the successful enhancement of the partner repositories, and a cover-page generating tool has been released on an open-source basis. The strategic work delivered two main outputs: a full report by RAND which, drawing on stakeholder interviews, identifies drivers for, and barriers to, repository sustainability; and a supplementary "briefing paper" digest of the main report, concentrating on the interventions that repository managers and champions can take to address the challenges of embedding repositories. Both documents are in the public domain. The Briefing Paper is explicitly designed for adaptation and local customisation by HEIs. The RAND report emphasises the importance of establishing a clear vision for the repository, and of close communication with stakeholders, if a repository is to succeed.

    BlogForever: D3.1 Preservation Strategy Report

    This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what exactly it is that we are trying to preserve. We then present a review of past and present work and highlight why current practices in web archiving do not adequately address the needs of weblog preservation. We make three distinct contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository; b) we provide an automated, community-based approach to identifying significant properties of weblog content and discuss how this affects previous strategies; c) we propose a sustainability plan that draws upon community knowledge through innovative repository design.

    DoubleCheck: Designing Community-based Assessability for Historical Person Identification

    Historical photos are valuable for their cultural and economic significance, but can be difficult to identify accurately due to challenges such as low-quality images, lack of corroborating evidence, and limited research resources. Misidentified photos can have significant negative consequences, including lost economic value, incorrect historical records, and the spread of misinformation that can perpetuate conspiracy theories. Accurately assessing the credibility of a photo identification (ID) may require investigative research, domain knowledge, and consultation with experts. In this paper, we introduce DoubleCheck, a quality assessment framework for verifying historical photo IDs on Civil War Photo Sleuth (CWPS), a popular online platform for identifying American Civil War-era photos using facial recognition and crowdsourcing. DoubleCheck focuses on improving CWPS's user experience and system architecture to display information useful for assessing the quality of historical photo IDs on CWPS. In a mixed-methods evaluation of DoubleCheck, we found that users contributed a wide diversity of sources for photo IDs, which helped facilitate the community's assessment of these IDs through DoubleCheck's provenance visualizations. Further, DoubleCheck's quality assessment badges and visualizations supported users in making accurate assessments of photo IDs, even in cases involving ID conflicts.
    Comment: Accepted to ACM Journal on Computing and Cultural Heritage (JOCCH).
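    As a rough illustration of the assessability idea, the sketch below models a photo ID carrying its contributed sources and a quality badge derived from them. The fields and the badge rule are hypothetical and do not reflect the actual CWPS/DoubleCheck implementation.

```python
# Hypothetical sketch of a photo-ID record with community-contributed sources
# and a derived quality badge; fields and badge rule are invented, not CWPS's.
from dataclasses import dataclass, field

@dataclass
class PhotoID:
    subject: str
    contributor: str
    sources: list = field(default_factory=list)   # provenance of the ID

    def badge(self):
        # Toy rule: more independent sources -> stronger badge.
        n = len(self.sources)
        return "verified" if n >= 2 else "tentative" if n == 1 else "unsourced"

pid = PhotoID("Pvt. John Doe, 5th NY Infantry", "user123")
pid.sources.append("regimental roster, p. 44")
pid.sources.append("family album inscription")
print(pid.badge())                                # verified
```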

    Secure Time-Aware Provenance for Distributed Systems

    Operators of distributed systems often find themselves needing to answer forensic questions in order to perform a variety of managerial tasks, including fault detection, system debugging, accountability enforcement, and attack analysis. In this dissertation, we present Secure Time-Aware Provenance (STAP), a novel approach that provides the fundamental functionality required to answer such forensic questions: the capability to "explain" the existence (or change) of a certain distributed system state at a given time in a potentially adversarial environment. This dissertation makes the following contributions. First, we propose the STAP model to explicitly represent time and state changes. The STAP model allows consistent and complete explanations of system state (and changes) in dynamic environments. Second, we show that it is both possible and practical to efficiently and scalably maintain and query provenance in a distributed fashion, where provenance maintenance and querying are modeled as recursive continuous queries over distributed relations. Third, we present security extensions that allow operators to reliably query provenance information in adversarial environments. Our extensions incorporate tamper-evident properties that guarantee eventual detection of compromised nodes that lie or falsely implicate correct nodes. Finally, the proposed research results in a proof-of-concept prototype, which includes a declarative query language for specifying a range of useful provenance queries, an interactive exploration tool, and a distributed provenance engine for operators to conduct analysis of their distributed systems. We discuss the applicability of this tool in several use cases, including Internet routing, overlay routing, and cloud data processing.
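    The tamper-evident flavor of this work can be illustrated with a hash chain: each provenance entry commits to its predecessor, so a retroactive edit breaks verification. This is a deliberately simplified sketch; STAP's actual machinery, which maintains signed provenance as recursive continuous queries across nodes, is considerably richer.

```python
# Simplified illustration of tamper evidence via hash chaining. STAP's real
# mechanism (signed, distributed provenance maintenance) is far more involved.
import hashlib, json

def append_entry(log, event):
    prev = log[-1]["hash"] if log else "0" * 64
    body = {"event": event, "prev": prev}
    body["hash"] = hashlib.sha256(
        json.dumps({"event": event, "prev": prev}, sort_keys=True).encode()
    ).hexdigest()
    log.append(body)

def verify(log):
    prev = "0" * 64
    for entry in log:
        expected = hashlib.sha256(
            json.dumps({"event": entry["event"], "prev": entry["prev"]},
                       sort_keys=True).encode()
        ).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log = []
append_entry(log, {"node": "r1", "state": "route(10.0.0.0/8, via=r2)"})
append_entry(log, {"node": "r1", "state": "route withdrawn"})
print(verify(log))                      # True
log[0]["event"]["state"] = "forged"
print(verify(log))                      # False -> tampering detected
```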

    Knowledge visualizations: a tool to achieve optimized operational decision making and data integration

    The overabundance of data created by modern information systems (IS) has led to a breakdown in cognitive decision-making. Without authoritative source data, commanders' decision-making processes are hindered as they attempt to paint an accurate shared operational picture (SOP). Further impeding the decision-making process is the lack of proper interface interaction to provide a visualization that aids in the extraction of the most relevant and accurate data. Utilizing a decision support system (DSS) to present visualizations based on OLAP-cube-integrated data allows decision-makers to rapidly glean information and build their situation awareness (SA). This yields a competitive advantage to the organization while in garrison or in combat. Additionally, OLAP cube data integration enables analysis to be performed on an organization's data flows. This analysis is used to identify the critical path of data throughout the organization. Linking a decision-maker to the authoritative data along this critical path eliminates the many decision layers in a hierarchical command structure that can introduce latency or error into the decision-making process. Furthermore, the organization has an integrated SOP from which to rapidly build SA and make effective and efficient decisions.
    http://archive.org/details/knowledgevisuali1094545877
    Outstanding Thesis
    Major, United States Marine Corps; Captain, United States Marine Corps
    Approved for public release; distribution is unlimited.
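    As a toy illustration of OLAP-cube-style integration, the pandas sketch below rolls fact rows indexed by several dimensions up into a single view a decision-maker can scan. The dimensions and figures are invented for illustration only.

```python
# Minimal illustration of OLAP-cube-style roll-up: facts indexed by
# dimensions, aggregated into one scannable view for situation awareness.
# Units, report types, and figures are invented examples.
import pandas as pd

facts = pd.DataFrame({
    "unit":      ["1stBn", "1stBn", "2ndBn", "2ndBn"],
    "report":    ["fuel",  "ammo",  "fuel",  "ammo"],
    "level_pct": [62, 80, 45, 90],
})

# Roll-up along the unit and report dimensions (mean by default).
cube = facts.pivot_table(index="unit", columns="report", values="level_pct")
print(cube)
```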