
    Provenance-based trust for grid computing: Position Paper

    Current evolutions of Internet technology, such as Web Services, ebXML, peer-to-peer and Grid computing, all point to the development of large-scale open networks of diverse computing systems interacting with one another to perform tasks. Grid systems (and Web Services) are exemplary in this respect and are perhaps some of the first large-scale open computing systems to see widespread use, making them an important testing ground for the trust-management problems that are likely to arise. From this perspective, today's grid architectures suffer from limitations such as the lack of a mechanism to trace results and the lack of infrastructure for building up trust networks. These are important concerns in open grids, in which "community resources" are owned and managed by multiple stakeholders and are dynamically organised in virtual organisations. Provenance enables users to trace how a particular result was arrived at by identifying the individual services, and the aggregation of services, that produced that output. Against this background, we present a research agenda to design, conceive and implement an industrial-strength open provenance architecture for grid systems. We motivate its use with three complex grid applications, namely aerospace engineering, organ transplant management and bioinformatics. Industrial-strength provenance support includes a scalable and secure architecture, an open proposal for standardising the protocols and data structures, a set of tools for configuring and using the provenance architecture, an open source reference implementation, and a deployment and validation in an industrial context. The provision of such facilities will enrich grid capabilities with new functionalities required for solving complex problems, such as complete audit trails of process execution and support for third-party analysis and auditing. As a result, we anticipate a larger uptake of grid technology, since unprecedented possibilities will be offered to users, giving them a competitive edge.

    Distinguishing Provenance Equivalence of Earth Science Data

    Reproducibility of scientific research relies on accurate and precise citation of data and of the provenance of that data. Earth science data are often the result of applying complex data transformation and analysis workflows to vast quantities of data. Provenance information about data processing is used for a variety of purposes, including understanding the process and auditing as well as reproducibility. Certain provenance information is essential for producing scientifically equivalent data. Capturing and representing that provenance information, and assigning identifiers suitable for precisely distinguishing data granules and datasets, is needed for accurate comparisons. This paper discusses scientific equivalence and the provenance essential for scientific reproducibility. We use the example of an operational Earth science data processing system to illustrate the application of cascading digital signatures, or hash chains, to precisely identify sets of granules and to serve as provenance equivalence identifiers that distinguish data made in an equivalent manner.
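
    The cascading-hash technique can be sketched briefly. The snippet below is a minimal illustration assuming SHA-256 and hypothetical helper names (granule_id, chained_set_id); it shows the general hash-chain idea rather than the paper's actual scheme. Each granule identifier covers the data plus the provenance fields deemed essential for scientific equivalence, and the set identifier folds those granule identifiers into a cascading hash, so data produced in an equivalent manner reduces to the same identifier.

```python
import hashlib

def granule_id(data_bytes, essential_provenance):
    """Identify a granule by hashing its content together with the
    provenance fields considered essential for scientific equivalence."""
    h = hashlib.sha256(data_bytes)
    for key in sorted(essential_provenance):
        h.update(f"{key}={essential_provenance[key]}".encode())
    return h.hexdigest()

def chained_set_id(granule_ids):
    """Fold an ordered sequence of granule identifiers into a single
    cascading (chained) hash; equivalent production runs yield equal values."""
    chain = ""
    for gid in granule_ids:
        chain = hashlib.sha256((chain + gid).encode()).hexdigest()
    return chain

# Example: two runs applying the same essential processing to the same
# inputs yield the same set identifier, regardless of when they ran.
gid = granule_id(b"swath-data", {"algorithm": "L2-v3.1", "calibration": "c07"})
print(chained_set_id([gid]))
```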

    Beyond OAIS: towards a reliable and consistent digital preservation implementation framework

    Current work in digital preservation (DP) is dominated by the "Open Archival Information System" (OAIS) reference framework specified by the international standard ISO 14721:2003. This is a useful aid to understanding the concepts, main functional components and basic data flows within a DP system, but it does not give specific guidance on implementation-level issues. In this paper we suggest that there is a need for a reference architecture which goes beyond OAIS to address such implementation-level issues: to specify minimum requirements in respect of the policies, processes, and metadata required to measure and validate repository trustworthiness with respect to the authenticity, integrity, renderability, meaning, and retrievability of the digital materials preserved. The suggestion is not that a particular way of implementing OAIS be specified, but rather that general guidelines on implementation are required if the term 'OAIS-compliant' is to be meaningful in the sense of giving an assurance of attaining and maintaining an operationally adequate or better level of long-term reliability, consistency, and cross-compatibility in implemented DP systems that is measurable, verifiable, manageable, and (as far as possible) future-proofed.

    Open data and the academy: an evaluation of CKAN for research data management

    This paper offers a full and critical evaluation of the open source CKAN software (http://ckan.org) for use as a Research Data Management (RDM) tool within a university environment. It presents a case study of CKAN's implementation and use at the University of Lincoln, UK, and highlights its strengths and current weaknesses as an institutional Research Data Management tool. The author draws on his prior experience of implementing a mixed-media Digital Asset Management system (DAM), an Institutional Repository (IR) and an institutional Web Content Management System (CMS) to offer an outline proposal for how CKAN can be used effectively for data analysis, storage and publishing in academia. This will be of interest to researchers, data librarians, and developers who are responsible for the implementation of institutional RDM infrastructure. This paper is presented as part of the dissemination activities of the Jisc-funded Orbital project (http://orbital.blogs.lincoln.ac.uk).

    Audit and Certification of Digital Repositories: Creating a Mandate for the Digital Curation Centre (DCC)

    The article examines the issues surrounding the audit and certification of digital repositories in light of the work that the RLG/NARA Task Force did to draw up guidelines and the need for these guidelines to be validated.

    DBKnot: A Transparent and Seamless, Pluggable Tamper Evident Database

    Database integrity is crucial to organizations that rely on databases of important data, yet such databases are vulnerable to internal fraud. Tampering by malicious internal employees with high technical authorization over the infrastructure, or by accounts compromised by external attackers, is one of the important attack vectors. This thesis addresses this challenge for a class of problems in which data is append-only and immutable. Examples of settings where data does not change are a) financial institutions (banks, accounting systems, stock markets, etc.), b) registries and notary systems where important data is kept but is never subject to change, and c) system logs that must be kept intact for performance and forensic inspection if needed. The target of the approach is implementation seamlessness, with little or no change required in existing systems. Transaction tracking for tamper detection is done by utilizing a common hashtable that serially and cumulatively hashes transactions together, while an external time-stamper and signer signs these linkages. This allows transactions to be tracked without any of the organization's data leaving its premises for a third party, which also reduces the performance impact of tracking. It is achieved by adding a tracking layer and embedding it inside the data workflow while keeping it as non-invasive as possible. DBKnot implements these features a) natively in databases or b) embedded inside Object Relational Mapping (ORM) frameworks, and c) outlines a direction for implementing them as a stand-alone microservice reverse proxy. A prototype ORM and database layer has been developed and tested for seamlessness of integration and ease of use. Additionally, different optimization models that introduce pipelining parallelism in the hashing/signing process have been tested to measure their impact on performance. Stock-market information was used for experimentation with DBKnot, and the initial results showed slightly less than a 100% increase in transaction time when using the most basic, sequential, synchronous version of DBKnot. Signing and hashing overhead per record does not increase significantly as the amount of data grows. A number of alternative optimizations to the design were tested and resulted in a significant increase in performance.
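
    A minimal sketch of the chained hashing with periodic external signing is given below, assuming Python and SHA-256; the class and parameter names are hypothetical, and the external_sign callable is a placeholder for the third-party time-stamping/signing service rather than DBKnot's actual interface. Each appended record is hashed together with the previous running hash, so modifying or removing any earlier record invalidates every later hash, and only the running hash (never the record data) has to leave the organization for signing; batching the signatures (sign_every) is one point where the hashing/signing work could be pipelined.

```python
import hashlib

class TamperEvidentLog:
    """Chained hashing over append-only records (a sketch, not DBKnot itself).
    Each running hash covers the previous one, so tampering with any earlier
    record changes every later hash value."""

    def __init__(self, external_sign=None, sign_every=100):
        self.running_hash = hashlib.sha256(b"genesis").hexdigest()
        self.external_sign = external_sign   # placeholder third-party time-stamper/signer
        self.sign_every = sign_every         # batch size: request one signature per N records
        self.count = 0
        self.anchors = []                    # signed checkpoints anchoring the chain

    def append(self, record_bytes):
        # Hash the new record together with the previous running hash.
        self.running_hash = hashlib.sha256(
            self.running_hash.encode() + record_bytes
        ).hexdigest()
        self.count += 1
        if self.external_sign and self.count % self.sign_every == 0:
            # Only the running hash leaves the premises, never the record data.
            self.anchors.append(self.external_sign(self.running_hash))
        return self.running_hash

# Usage: append-only inserts flow through the log; verification replays the
# chain over the stored records and compares against the signed anchors.
log = TamperEvidentLog(external_sign=lambda digest: ("signed", digest), sign_every=2)
log.append(b"INSERT trade 1")
log.append(b"INSERT trade 2")
```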

    Extending Capability and Implementing a Web Interface for the XALT Software Monitoring Tool

    As high performance computing centers evolve in terms of hardware, software, and user base, the act of monitoring and managing such systems requires specialized tools. The tool discussed in this thesis is XALT, a collaborative effort between the National Institute for Computational Sciences and the Texas Advanced Computing Center. XALT is designed to track link-time and job-level information for applications that are compiled and executed on any Linux cluster, workstation, or high-end supercomputer. The key objectives of this work are to extend the existing functionality of XALT and to implement a real-time web portal for easily visualizing the tracked data. A prototype is developed to track function calls resolved by external libraries, which aids software management. The web portal generates reports and metrics that improve efficiency and effectiveness for an extensive community of stakeholders, including users, support organizations, and development teams. In addition, we discuss use cases of interest to center support staff and researchers, such as identifying users based on given counters and generating provenance reports. This work details the opportunities and challenges in pushing XALT further towards becoming a complete package.
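
    As a generic illustration of identifying which function calls must be resolved by external libraries (this is not XALT's implementation), the sketch below lists the undefined symbols of a compiled object or executable using the binutils nm tool; these are exactly the symbols that external libraries supply at link time.

```python
import subprocess

def undefined_symbols(binary_path):
    """Return the undefined symbols reported by `nm -u` for an object file
    or executable; these must be resolved by external libraries."""
    out = subprocess.run(["nm", "-u", binary_path],
                         capture_output=True, text=True, check=True).stdout
    # Each non-empty line ends with the symbol name, e.g. "  U printf".
    return [line.split()[-1] for line in out.splitlines() if line.strip()]

# Example: print the externally resolved symbols of a hypothetical a.out.
print(undefined_symbols("a.out"))
```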

    A model for digital preservation repository risk relationships

    The paper introduces the Preserved Object and Repository Risk Ontology (PORRO), a model that relates preservation functionality with associated risks and opportunities for their mitigation. Building on work undertaken in a range of EU and UK funded research projects (including the Digital Curation Centre, DigitalPreservationEurope and DELOS), this ontology illustrates relationships between fundamental digital library goals and their parameters; associated rights and responsibilities; practical activities and resources involved in their accomplishment; and risks facing digital libraries and their collections. Its purpose is to facilitate a comprehensive understanding of risk causality and to illustrate opportunities for mitigation and avoidance. The ontology reflects evidence accumulated from a series of institutional audits and evaluations, including a specific subset of digital libraries in the DELOS project which led to the definition of a digital library preservation risk profile. Its applicability is intended to be widespread, and its coverage is expected to evolve to reflect developments within the community. Attendees will gain an understanding of the model and learn how they can utilize this online resource to inform their own risk management activities.

    100 Percent Population Testing and Concerns of Auditors With Limited Liability Exposure: Evidence From Retail Investors

    This study investigates whether the type of evidence collected during an audit reduces retail investors' concerns about CPA firms whose liability exposure is limited during an audit engagement. This study is motivated by calls for further research on the benefits and effectiveness realized by the use of big data during the audit engagement. Retail investors who responded to a 2 x 2 experiment did not perceive CPA firms' independence to be impaired, regardless of the audit methodology used to gather audit evidence (100 percent population testing versus traditional sampling). These results are useful to accounting lawmakers who previously expressed concern that a reduction of liability would impair auditors' judgement during the audit. Similarly, these results may assist accounting lawmakers in deciding whether or how to change auditing standards to reflect the benefits of big data in auditing. This study was approved by the Institutional Review Board.