
    Sciunits: Reusable Research Objects

    Science is conducted collaboratively, often requiring knowledge sharing about computational experiments. When experiments include only datasets, they can be shared using Uniform Resource Identifiers (URIs) or Digital Object Identifiers (DOIs). An experiment, however, seldom includes only datasets, but more often includes software, its past execution, provenance, and associated documentation. The Research Object has recently emerged as a comprehensive and systematic method for aggregation and identification of diverse elements of computational experiments. While a necessary method, mere aggregation is not sufficient for the sharing of computational experiments. Other users must be able to easily recompute on these shared research objects. In this paper, we present the sciunit, a reusable research object in which aggregated content is recomputable. We describe a Git-like client that efficiently creates, stores, and repeats sciunits. We show through analysis that sciunits repeat computational experiments with minimal storage and processing overhead. Finally, we provide an overview of sharing and reproducible cyberinfrastructure based on sciunits gaining adoption in the domain of geosciences.

    Provenance-based trust for grid computing: Position Paper

    Current evolutions of Internet technology such as Web Services, ebXML, peer-to-peer and Grid computing all point to the development of large-scale open networks of diverse computing systems interacting with one another to perform tasks. Grid systems (and Web Services) are exemplary in this respect and are perhaps some of the first large-scale open computing systems to see widespread use, making them an important testing ground for problems in trust management which are likely to arise. From this perspective, today's grid architectures suffer from limitations, such as lack of a mechanism to trace results and lack of infrastructure to build up trust networks. These are important concerns in open grids, in which "community resources" are owned and managed by multiple stakeholders, and are dynamically organised in virtual organisations. Provenance enables users to trace how a particular result has been arrived at by identifying the individual services and the aggregation of services that produced such a particular output. Against this background, we present a research agenda to design, conceive and implement an industrial-strength open provenance architecture for grid systems. We motivate its use with three complex grid applications, namely aerospace engineering, organ transplant management and bioinformatics. Industrial-strength provenance support includes a scalable and secure architecture, an open proposal for standardising the protocols and data structures, a set of tools for configuring and using the provenance architecture, an open source reference implementation, and a deployment and validation in an industrial context. The provision of such facilities will enrich grid capabilities by including new functionalities required for solving complex problems such as provenance data to provide complete audit trails of process execution and third-party analysis and auditing. As a result, we anticipate that a larger uptake of grid technology is likely to occur, since unprecedented possibilities will be offered to users and will give them a competitive edge.

    Sharing and Preserving Computational Analyses for Posterity with encapsulator

    Open data and open-source software may be part of the solution to science's "reproducibility crisis", but they are insufficient to guarantee reproducibility. Requiring minimal end-user expertise, encapsulator creates a "time capsule" with reproducible code in a self-contained computational environment. encapsulator provides end-users with a fully-featured desktop environment for reproducible research. Comment: 11 pages, 6 figures