185 research outputs found

    Energy-aware simulation of workflow execution in High Throughput Computing systems

    Workflows offer great potential for enacting correlated jobs in an automated manner. This is especially desirable when workflows are large or there is a need to run a workflow multiple times. Much research has been conducted into reducing the makespan of workflows and maximising the utilisation of the resources they run on, while some existing research investigates how to reduce the energy consumption of workflows on dedicated resources. We extend the HTC-Sim simulation framework to support workflows, allowing us to evaluate the impact of different scheduling strategies on the overheads and energy consumption of workflows run on non-dedicated systems. We evaluate a number of scheduling strategies from the literature in an environment where (workflow) jobs can be evicted by higher-priority users.
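    The cost of eviction that this abstract alludes to can be sketched in a few lines. The following is a minimal, hypothetical Python sketch, not the HTC-Sim API: the Job class, simulate function, eviction probability, and power figures are all illustrative assumptions. It shows how wall-clock time and wasted energy accrue when a running job is evicted by a higher-priority user and must restart from scratch.

    # Hypothetical sketch of eviction-aware energy accounting (not HTC-Sim).
    import random

    random.seed(42)

    class Job:
        def __init__(self, name, runtime, power_watts):
            self.name = name
            self.runtime = runtime          # seconds of work required
            self.power_watts = power_watts  # power draw while running

    def simulate(jobs, eviction_prob=0.1, step=60):
        """Run jobs sequentially; in each time step a higher-priority user
        may evict the job, losing all progress and wasting the energy spent."""
        total_energy = 0.0  # joules
        wall_clock = 0      # seconds
        for job in jobs:
            remaining = job.runtime
            while remaining > 0:
                slice_ = min(step, remaining)
                wall_clock += slice_
                total_energy += job.power_watts * slice_
                if random.random() < eviction_prob:
                    remaining = job.runtime  # evicted: restart from scratch
                else:
                    remaining -= slice_
        return wall_clock, total_energy

    makespan, energy = simulate([Job("stage-in", 300, 80), Job("compute", 1200, 150)])
    print(f"makespan: {makespan}s, energy: {energy / 3.6e6:.3f} kWh")

    A scheduling strategy can then be compared against another simply by re-running the simulation with a different job ordering or eviction model and comparing the returned makespan and energy figures.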

    How much data do I need? A case study on medical data

    The collection of data to train a Deep Learning network is costly in terms of effort and resources. In many cases, especially in a medical context, it may have detrimental impacts, such as requiring invasive medical procedures or processes which could themselves cause medical harm. However, Deep Learning is seen as a data-hungry method. Here, we look at two commonly held adages: i) more data gives better results, and ii) transfer learning will aid you when you don't have enough data. These are widely assumed to be true and used as evidence for choosing how to solve a problem when Deep Learning is involved. We evaluate six medical datasets and six general datasets, training a ResNet18 network on varying subsets of these datasets to evaluate 'more data gives better results'. We take eleven of these datasets as the sources for Transfer Learning on subsets of the twelfth dataset -- Chest -- in order to determine whether Transfer Learning is universally beneficial. We go further to see whether multi-stage Transfer Learning provides a consistent benefit. Our analysis shows that the real situation is more complex than these simple adages -- more data can lead to diminishing returns, and an incorrect choice of dataset for transfer learning can lead to worse performance, with datasets which we would consider highly similar to the Chest dataset giving worse results than datasets which are more dissimilar. Multi-stage transfer learning likewise reveals complex relationships between datasets.
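    As a rough illustration of the subset-training protocol described above, the following hedged PyTorch sketch trains a ResNet18 on increasing fractions of a dataset. It assumes torch and torchvision are installed; torchvision's FakeData stands in for the medical datasets, and the fractions, optimiser, and single epoch are illustrative assumptions, not the paper's actual protocol.

    # Hypothetical sketch: train ResNet18 on varying subset sizes.
    import torch
    from torch.utils.data import DataLoader, Subset
    from torchvision import datasets, models, transforms

    full = datasets.FakeData(size=256, num_classes=2,
                             transform=transforms.ToTensor())  # stand-in dataset

    for frac in (0.1, 0.25, 0.5, 1.0):
        n = int(len(full) * frac)
        subset = Subset(full, torch.randperm(len(full))[:n].tolist())
        loader = DataLoader(subset, batch_size=32, shuffle=True)

        # weights="IMAGENET1K_V1" instead of None would give the pre-trained
        # starting point used for the transfer-learning comparison.
        model = models.resnet18(weights=None)
        model.fc = torch.nn.Linear(model.fc.in_features, 2)
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        loss_fn = torch.nn.CrossEntropyLoss()

        model.train()
        for x, y in loader:  # one epoch per fraction, for brevity
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        print(f"trained on {n} samples")

    Plotting held-out accuracy against n for each dataset is what exposes the diminishing-returns behaviour the abstract reports; swapping the initial weights for those of a model trained on another dataset gives the (multi-stage) transfer-learning variants.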

    Stochastic Workflow Scheduling with QoS Guarantees in Grid Computing Environments

    Get PDF
    Grid computing infrastructures embody a cost-effective computing paradigm that virtualises heterogeneous system resources to meet the dynamic needs of critical business and scientific applications. These applications range from batch processes and long-running tasks to more real-time and even transactional applications. Grid schedulers aim to make efficient use of Grid resources in a cost-effective way, while satisfying the Quality-of-Service requirements of the applications. Scheduling in such a large-scale, dynamic and distributed environment is a complex undertaking. In this paper, we propose an approach to Grid scheduling which abstracts over the details of individual applications and aims to provide a globally optimal schedule, while having the ability to dynamically adjust to varying workloads.
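    One plausible reading of QoS-guaranteed stochastic scheduling is sketched below under stated assumptions: task runtimes on each resource are modelled as Gaussian, the probability of meeting a deadline is estimated by Monte Carlo, and the cheapest resource satisfying the QoS target is selected. The resource table, distributions, and schedule function are hypothetical illustrations, not the paper's algorithm.

    # Hypothetical sketch of QoS-aware stochastic resource selection.
    import random

    random.seed(0)

    RESOURCES = {  # name -> (mean runtime s, std dev s, cost per job)
        "fast-node": (100.0, 10.0, 5.0),
        "slow-node": (160.0, 40.0, 1.0),
    }

    def p_meets_deadline(mean, std, deadline, trials=10_000):
        """Monte Carlo estimate of P(runtime <= deadline) for a Gaussian runtime."""
        hits = sum(random.gauss(mean, std) <= deadline for _ in range(trials))
        return hits / trials

    def schedule(deadline, qos=0.95):
        """Return the cheapest resource whose deadline-hit probability meets qos."""
        feasible = []
        for name, (mean, std, cost) in RESOURCES.items():
            p = p_meets_deadline(mean, std, deadline)
            if p >= qos:
                feasible.append((cost, name, p))
        return min(feasible) if feasible else None  # cheapest feasible choice

    print(schedule(deadline=180.0))

    Dynamic adjustment to varying workloads would amount to re-estimating the runtime distributions from observed executions and re-running the selection as conditions change.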