22 research outputs found

    Provenance-based validation of E-science experiments

    E-Science experiments typically involve many distributed services maintained by different organisations. After an experiment has been executed, it is useful for a scientist to verify that the execution was performed correctly or is compatible with existing experimental criteria or standards. Scientists may also want to review and verify experiments performed by their colleagues. There are no existing frameworks for validating such experiments in today's e-Science systems. Users therefore have to rely on error checking performed by the services, or adopt other ad hoc methods. This paper introduces a platform-independent framework for validating workflow executions. The validation relies on reasoning over the documented provenance of experiment results and semantic descriptions of services advertised in a registry. This validation process ensures that experiments are performed correctly, and thus that the results generated are meaningful. The framework is tested in a bioinformatics application that performs protein compressibility analysis.
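    The core idea described in this abstract, checking documented provenance against service descriptions from a registry, could be sketched roughly as follows. All names here (the registry contents, the provenance records, the service interfaces) are invented for illustration and are not the paper's actual framework or vocabulary:

    ```python
    # Hypothetical sketch: validate each recorded step of an experiment's
    # provenance against the input/output interface each service advertises
    # in a registry. An empty result list means the run conforms.

    registry = {
        # service name -> interface advertised in the registry (illustrative)
        "compress": {"inputs": {"sequence"}, "outputs": {"compressed"}},
        "ratio":    {"inputs": {"compressed", "sequence"}, "outputs": {"score"}},
    }

    provenance = [
        # each record documents one service invocation in the experiment
        {"service": "compress", "inputs": {"sequence"}, "outputs": {"compressed"}},
        {"service": "ratio", "inputs": {"compressed", "sequence"}, "outputs": {"score"}},
    ]

    def validate(provenance, registry):
        """Return a list of violations; an empty list means the run conforms."""
        problems = []
        for step in provenance:
            spec = registry.get(step["service"])
            if spec is None:
                problems.append(f"unknown service: {step['service']}")
            elif step["inputs"] != spec["inputs"] or step["outputs"] != spec["outputs"]:
                problems.append(f"interface mismatch in {step['service']}")
        return problems
    ```

    In the paper's setting the "reasoning" is semantic rather than a simple set comparison, but the shape is the same: the provenance record is the evidence, the registry description is the norm it is checked against.
    
    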

    Issues for the sharing and re-use of scientific workflows

    In this paper, we outline preliminary findings from an ongoing study we have been conducting over the past 18 months of researchers’ use of myExperiment, a Web 2.0-based repository with a focus on social networking around shared research artefacts such as workflows. We present evidence of myExperiment users’ workflow sharing and re-use practices, motivations, concerns and potential barriers. The paper concludes with a discussion of the implications of these findings for community formation, diffusion of innovations, emerging drivers and incentives for research practice, and IT systems design.

    Problem-solving methods for understanding process executions

    Problem-solving methods are high-level, domain-independent, reusable knowledge templates that support the development of knowledge-intensive applications. The authors show how to use them to bolster subject-matter experts' understanding of process execution by implementing such methods in the Knowledge-Oriented Provenance Environment.

    Provenance-aware knowledge representation: A survey of data models and contextualized knowledge graphs

    Expressing machine-interpretable statements in the form of subject-predicate-object triples is a well-established practice for capturing the semantics of structured data. However, the standard used for representing these triples, RDF, inherently lacks a mechanism to attach provenance data, which would be crucial to make automatically generated and/or processed data authoritative. This paper is a critical review of data models, annotation frameworks, knowledge organization systems, serialization syntaxes, and algebras that enable provenance-aware RDF statements. The various approaches are assessed in terms of standard compliance, formal semantics, tuple type, vocabulary term usage, blank nodes, provenance granularity, and scalability. This can be used to advance existing solutions and help implementers select the most suitable approach (or a combination of approaches) for their applications. Moreover, the analysis of the mechanisms and their limitations highlighted in this paper can serve as the basis for novel approaches in RDF-powered applications with increasing provenance needs.
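    One of the approaches this survey covers, named graphs, can be illustrated without any RDF library: each triple is stored as a quad whose fourth element names a graph, and provenance statements are attached to the graph name itself. The identifiers and PROV-style terms below are invented for illustration:

    ```python
    # Minimal pure-Python sketch of the named-graph approach to
    # provenance-aware RDF: triples become quads, and provenance is
    # attached at graph granularity rather than per triple.

    quads = set()

    def add(subject, predicate, obj, graph):
        quads.add((subject, predicate, obj, graph))

    # domain data, grouped into a named graph
    add("ex:Sensor42", "ex:reportedTemp", "21.5", "ex:batch1")
    add("ex:Sensor43", "ex:reportedTemp", "19.8", "ex:batch1")

    # provenance attached once, to the graph name (PROV-O-style terms)
    add("ex:batch1", "prov:wasGeneratedBy", "ex:ingestJob7", "ex:provGraph")
    add("ex:batch1", "prov:generatedAtTime", "2024-01-01T00:00:00Z", "ex:provGraph")

    def triples_in(graph):
        """All triples asserted in the given named graph."""
        return {(s, p, o) for (s, p, o, g) in quads if g == graph}
    ```

    The trade-off the survey's "provenance granularity" criterion refers to is visible here: graph-level annotation is compact, but statement-level approaches (e.g. reification or RDF-star) are needed when each triple has distinct provenance.
    
    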

    A provenance metadata model integrating ISO geospatial lineage and the OGC WPS: conceptual model and implementation

    Nowadays, there are still some gaps in the description of provenance metadata. These gaps prevent the capture of comprehensive provenance, useful for reuse and reproducibility. In addition, the lack of automated tools for capturing provenance hinders the broad generation and compilation of provenance information. This work presents a provenance engine (PE) that captures and represents provenance information using a combination of the Web Processing Service (WPS) standard and the ISO 19115 geospatial lineage model. The PE, developed within the MiraMon GIS & RS software, automatically records detailed information about sources and processes. The PE also includes a metadata editor that shows a graphical representation of the provenance and allows users to complement provenance information by adding missing processes or deleting redundant process steps or sources, thus building a consistent geospatial workflow. One use case is presented to demonstrate the usefulness and effectiveness of the PE: the generation of a radiometric pseudo-invariant areas bench for the Iberian Peninsula. This remote-sensing use case shows how provenance can be automatically captured, even in a complex, non-sequential flow, and its essential role in automation and replication tasks when working with very large amounts of geospatial data.
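    The automatic-capture idea the abstract describes, recording a lineage entry as a side effect of every processing operation, could be sketched as below. This is not the MiraMon PE or the ISO 19115 schema; the record fields and file names are illustrative assumptions:

    ```python
    # Illustrative sketch of lineage capture: each operation appends an
    # ISO-19115-style process step with its sources, so a workflow graph
    # accumulates automatically as processing runs.

    lineage = {"sources": [], "process_steps": []}

    def run_process(name, sources, output):
        # ... actual geoprocessing would happen here ...
        lineage["process_steps"].append(
            {"description": name, "sources": list(sources), "output": output}
        )
        for s in sources:
            if s not in lineage["sources"]:
                lineage["sources"].append(s)
        return output

    # a non-sequential flow: the second step consumes an intermediate
    # product together with an independent source
    img = run_process("radiometric correction", ["scene.tif"], "corrected.tif")
    run_process("pseudo-invariant area extraction", [img, "mask.tif"], "pia.tif")
    ```

    Because intermediate outputs reappear as sources of later steps, the accumulated record is a graph rather than a linear chain, which is what makes replication of non-sequential flows possible.
    
    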

    An Architecture for Provenance Systems

    This document covers the logical and process architectures of provenance systems. The logical architecture identifies key roles and their interactions, whereas the process architecture discusses distribution and security. A fundamental aspect of our presentation is its technology-independent nature, which makes it reusable: the principles exposed in this document may be applied to different technologies.

    Provenance documentation to enable explainable and trustworthy AI: A literature review

    Recently, artificial intelligence (AI) and machine learning (ML) models have demonstrated remarkable progress, with applications developed in various domains. It is also increasingly discussed that AI and ML models and applications should be transparent, explainable, and trustworthy. Accordingly, the field of Explainable AI (XAI) is expanding rapidly. XAI holds substantial promise for improving trust and transparency in AI-based systems by explaining how complex models such as deep neural networks (DNNs) produce their outcomes. Moreover, many researchers and practitioners consider that using provenance to explain these complex models will help improve transparency in AI-based systems. In this paper, we conduct a systematic literature review of provenance, XAI, and trustworthy AI (TAI) to explain the fundamental concepts and illustrate the potential of using provenance as a medium to help accomplish explainability in AI-based systems. Moreover, we discuss the patterns of recent developments in this area and offer a vision for research in the near future. We hope this literature review will serve as a starting point for scholars and practitioners interested in learning about essential components of provenance, XAI, and TAI.

    A benefits-management approach applied to digital preservation processes: a case study

    Get PDF
    In a time when scientific research uses ever more computational resources, with large volumes of data and complex relations between them, concerns arise about the integrity of the scientific data lifecycle and its processes. Constant collaboration across different projects and the reuse of data from previous experiments make long-term digital preservation tools and processes essential to ensuring the success of future projects. The implementation of such tools requires significant investments in information systems and technologies (IS/IT), and their impact is felt throughout the organization, from the strategic and financial levels to the operational one. In this context, it is crucial to ensure alignment between the objectives of these investments and the organizational needs, so that the benefits derived from these investments are actually realized. The present work aims to identify the benefits of implementing digital preservation tools and processes in a public-sector scientific institution, using a benefits-management approach.