
    Towards Automatic Capturing of Manual Data Processing Provenance

    Often data processing is not implemented by a workflow system or an integration application but is performed manually by humans along the lines of a more or less specified procedure. Collecting provenance information during manual data processing cannot be automated. Further, manual collection of provenance information is error-prone and time-consuming. Therefore, we propose to infer provenance information based on the read and write access of users. The derived provenance information is complete, but has a low precision. Therefore, we further propose introducing organizational guidelines in order to improve the precision of the inferred provenance information.

    Understanding personal data as a space - learning from dataspaces to create linked personal data

    In this paper we argue that the space of personal data is a dataspace as defined by Franklin et al. We define a personal dataspace as the space of all personal data belonging to a user, and we describe the logical components of the dataspace. We describe a Personal Dataspace Support Platform (PDSP) as a set of services to provide a unified view over the user’s data, and to enable new and more complex workflows over it. We show how a PDSP differs from a DSSP, and how a PDSP can be realized using Web protocols and Linked APIs.

    A model and framework for reliable build systems

    Reliable and fast builds are essential for rapid turnaround during development and testing. Popular existing build systems rely on correct manual specification of build dependencies, which can lead to invalid build outputs and nondeterminism. We outline the challenges of developing reliable build systems and explore the design space for their implementation, with a focus on non-distributed, incremental, parallel build systems. We define a general model for resources accessed by build tasks and show its correspondence to the implementation technique of minimum information libraries, APIs that return no information that the application doesn't plan to use. We also summarize preliminary experimental results from several prototype build managers.

    Leveraging HTC for UK eScience with very large Condor pools: demand for transforming untapped power into results

    We provide an insight into the demand from the UK eScience community for very large High Throughput Computing resources and provide an example of such a resource in current production use: the 930-node eMinerals Condor pool at UCL. We demonstrate the significant benefits this resource has provided to UK eScientists via quickly and easily realising results throughout a range of problem areas. We demonstrate the value added by the pool to UCL I.S infrastructure and provide a case for the expansion of very large Condor resources within the UK eScience Grid infrastructure. We provide examples of the technical and administrative difficulties faced when scaling up to institutional Condor pools, and propose the introduction of a UK Condor/HTC working group to co-ordinate the mid to long term UK eScience Condor development, deployment and support requirements, starting with the inaugural UK Condor Week in October 2004.

    Automation of the Continuous Integration (CI) - Continuous Delivery/Deployment (CD) Software Development

    Continuous Integration (CI) is a practice in software development where developers periodically merge code changes into a central shared repository, after which automated builds and tests are run. CI entails an automation component (the target of this project) and a cultural one, as developers have to learn to integrate code periodically. The main goal of CI is to reduce the time to feedback in the software integration process, allowing bugs to be located and fixed more easily and quickly, thus enhancing software quality while reducing the time to validate and publish new software. Traditional software development, in which teams of developers worked on the same project in isolation, often led to problems integrating the resulting code. Due to this isolation, the project was not deliverable until all its parts had been integrated, which was tedious and generated errors. Continuous Integration (CI) emerged as a practice to solve the problems of the traditional methodology, with the aim of improving the quality of the code. This thesis sets out what Continuous Integration is and how it is achieved, the principles that make it as effective as possible, and the processes that follow as a consequence, in order to introduce the context of its objective: the creation of a system that automates the start-up and set-up of an environment in which the methodology of continuous integration can be applied.