
    Probabilistic and dynamic optimization of job partitioning on a grid infrastructure

    Production grids offer the potential for parallel execution of very large numbers of tasks, but they also introduce a high overhead that significantly impacts the execution of short tasks. In this work, we present a strategy to optimize the partitioning of jobs on a grid infrastructure. The method accounts for the variability of a multi-user, large-scale production environment and the difficulty of modelling it, and is based on probabilistic estimations of the grid overhead. We first study analytically modelled environments and then show results on a real grid infrastructure. We demonstrate that this method yields a significant speed-up and a substantial reduction in the number of submitted tasks compared with a blind maximal partitioning strategy.
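    The abstract does not give the estimation procedure; the sketch below only illustrates the idea, assuming a hypothetical set of previously measured per-task overheads: estimate the expected makespan for each candidate partition size by Monte-Carlo sampling of the overhead, and keep the size that minimises it.

```python
import random

def expected_makespan(total_work, n_tasks, overhead_samples, trials=200):
    """Monte-Carlo estimate of makespan when total_work is split into n_tasks,
    each task paying a random grid overhead drawn from observed samples."""
    per_task = total_work / n_tasks
    total = 0.0
    for _ in range(trials):
        # Tasks run in parallel, so the makespan is the slowest task.
        total += max(random.choice(overhead_samples) + per_task
                     for _ in range(n_tasks))
    return total / trials

def best_partition(total_work, overhead_samples, max_tasks=128):
    """Pick the partition size minimising the expected-makespan estimate."""
    return min(range(1, max_tasks + 1),
               key=lambda n: expected_makespan(total_work, n, overhead_samples))

if __name__ == "__main__":
    # Hypothetical overhead samples (seconds) from previous submissions.
    overheads = [30, 45, 60, 120, 300, 40, 55, 90]
    print("suggested number of tasks:", best_partition(3600.0, overheads))
```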

    A Simulated Annealing Method to Cover Dynamic Load Balancing in Grid Environment

    High-performance scheduling is critical to achieving application performance on the computational grid, and new scheduling algorithms are needed to address the concerns arising in the grid environment. One of the main phases of grid scheduling is load balancing; a high-performance method for the load balancing problem is therefore essential to obtain satisfactory high-performance scheduling. This paper presents SAGE, a new high-performance method that addresses the dynamic load balancing problem by means of a simulated annealing algorithm. Although this problem has been tackled with several different approaches, only one of them is based on simulated annealing. Preliminary results show that SAGE not only finds a good solution to the problem (effectiveness) but does so in a reasonable amount of time (efficiency).
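    SAGE's internals are not given in the abstract; the following is a generic simulated-annealing sketch of the underlying assignment problem (map tasks to nodes so that the heaviest node's load is minimised). All names, moves and parameter values are illustrative, not taken from the paper.

```python
import math
import random

def max_load(assignment, task_costs, n_nodes):
    """Load-imbalance objective: the heaviest node's total cost."""
    loads = [0.0] * n_nodes
    for task, node in enumerate(assignment):
        loads[node] += task_costs[task]
    return max(loads)

def anneal(task_costs, n_nodes, steps=20000, t0=10.0, cooling=0.9995):
    """Simulated annealing over task-to-node assignments."""
    assignment = [random.randrange(n_nodes) for _ in task_costs]
    best = list(assignment)
    temp = t0
    for _ in range(steps):
        task = random.randrange(len(task_costs))
        old_node = assignment[task]
        before = max_load(assignment, task_costs, n_nodes)
        assignment[task] = random.randrange(n_nodes)
        after = max_load(assignment, task_costs, n_nodes)
        # Accept improvements always, worsenings with Boltzmann probability.
        if after > before and random.random() >= math.exp((before - after) / temp):
            assignment[task] = old_node
        elif max_load(assignment, task_costs, n_nodes) < max_load(best, task_costs, n_nodes):
            best = list(assignment)
        temp *= cooling
    return best

if __name__ == "__main__":
    costs = [random.uniform(1, 10) for _ in range(50)]
    solution = anneal(costs, n_nodes=5)
    print("max node load:", round(max_load(solution, costs, 5), 2))
```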

    An Approach to Model Resources Rationalisation in Hybrid Clouds through Users Activity Characterisation

    In recent years, strategies such as server consolidation by means of virtualisation techniques have helped the managers of large Information Technology (IT) infrastructures to limit, where possible, the use of hardware resources in order to provide reliable services and to reduce the Total Cost of Ownership (TCO) of such infrastructures. Moreover, with the advent of Cloud computing, resource usage rationalisation can also be pursued for user applications, provided this is compatible with the Quality of Service (QoS) that must be guaranteed. In this perspective, modern datacenters are “elastic”, i.e., able to shrink or enlarge the number of local physical or virtual resources from private/public Clouds. Furthermore, many large computing environments are integrated into distributed computing environments such as grid and cloud infrastructures. In this document, we report advances in the realisation of a utility, named Adaptive Scheduling Controller (ASC), which interacts with the datacenter resource manager to allow an effective and efficient usage of resources, also by means of user job classification. Here, we focus both on the data mining algorithms that classify user activity and on the mathematical formalisation of the functional used by ASC to find the most suitable configuration for the datacenter’s resource manager. The presented case study concerns the SCoPE infrastructure, which has a twofold role: local computing resource provider for the University of Naples Federico II and remote resource provider for both the Italian Grid Infrastructure (IGI) and the European Grid Infrastructure (EGI) Federated Cloud.
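    The specific data mining algorithms are not detailed in the abstract; as an illustration only, a k-means clustering of jobs by runtime and requested cores (feature choice and values are assumptions, not from the paper) shows the kind of user-activity classification that could feed a resource-manager configuration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical accounting records: (wall-clock seconds, requested cores) per job.
jobs = np.array([
    [120, 1], [150, 1], [90, 1],        # short serial jobs
    [7200, 8], [6800, 8], [7500, 16],   # long parallel jobs
    [1800, 4], [2100, 4], [1600, 2],    # medium jobs
], dtype=float)

# Normalise features so runtime does not dominate the distance metric.
scaled = (jobs - jobs.mean(axis=0)) / jobs.std(axis=0)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

# A scheduling controller could map each class to a queue or partition policy.
for cluster in range(3):
    members = jobs[labels == cluster]
    print(f"class {cluster}: mean runtime {members[:, 0].mean():.0f}s, "
          f"mean cores {members[:, 1].mean():.1f}")
```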

    A Service-Oriented Architecture enabling dynamic services grouping for optimizing distributed workflows execution

    In this paper, we describe a Service-Oriented Architecture that allows the optimization of the execution of service workflows. We discuss the advantages of the service-oriented approach with regard to the enactment of scientific applications on a grid infrastructure. Based on the development of a generic Web-Services wrapper, we show how the flexibility of our architecture enables dynamic service grouping to optimize application execution time. We demonstrate performance results on a real medical imaging application. On a production grid infrastructure, the proposed optimization yields a significant speed-up (from 1.2 to 2.9) compared to a traditional execution.
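    As a rough, assumed model of why grouping helps (not the paper's measurement method): each grid submission pays a fixed overhead, so grouping several consecutive services into one submission pays that overhead once instead of once per service. The service times and overhead below are invented for illustration.

```python
def workflow_time(service_times, overhead, group_size):
    """Rough model: every grid submission pays a fixed overhead; grouping
    group_size consecutive services into one submission pays it only once."""
    total = 0.0
    for i in range(0, len(service_times), group_size):
        total += overhead + sum(service_times[i:i + group_size])
    return total

if __name__ == "__main__":
    # Hypothetical pipeline of six sequential services (seconds), 300 s overhead.
    services = [60, 45, 120, 30, 90, 75]
    ungrouped = workflow_time(services, overhead=300, group_size=1)
    grouped = workflow_time(services, overhead=300, group_size=3)
    print("speed-up from grouping:", round(ungrouped / grouped, 2))
```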

    Autonomous grid scheduling using probabilistic job runtime scheduling

    Computational Grids are evolving into a global, service-oriented architecture – a universal platform for delivering future computational services to a range of applications of varying complexity and resource requirements. The thesis focuses on developing a new scheduling model for general-purpose, utility clusters based on the concept of user-requested job completion deadlines. In such a system, a user would be able to request each job to finish by a certain deadline, and possibly at a certain monetary cost. Implementing deadline scheduling depends on the ability to predict the execution time of each queued job, and on an adaptive scheduling algorithm able to use those predictions to maximise deadline adherence. The thesis proposes novel solutions to these two problems and documents their implementation in a largely autonomous and self-managing way. The starting point of the work is an extensive analysis of a representative Grid workload, revealing consistent workflow patterns, usage cycles and correlations between job execution times and the job properties commonly collected by the Grid middleware for accounting purposes. An automated approach is proposed to identify these dependencies and use them to partition the highly variable workload into subsets of more consistent and predictable behaviour. A range of time-series forecasting models, applied in this context for the first time, were used to model job execution times as a function of their historical behaviour and associated properties. Based on the resulting runtime predictions, a novel scheduling algorithm estimates the latest job start time necessary to meet the requested deadline and sorts the queue accordingly to minimise the amount of deadline overrun. The proposed approach was tested using an actual job trace collected from a production Grid facility. The best-performing execution time predictor (the auto-regressive moving average method), coupled with workload partitioning based on three simultaneous job properties, returned a median absolute percentage error centroid of only 4.75%. This level of prediction accuracy enabled the proposed deadline scheduling method to reduce the average deadline overrun time ten-fold compared to the benchmark batch scheduler. Overall, the thesis demonstrates that deadline scheduling of computational jobs on the Grid is achievable using statistical forecasting of job execution times based on historical information. The proposed approach is easily implementable, substantially self-managing and better matched to the human workflow, making it well suited for implementation in the utility Grids of the future.
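    A compact sketch of the latest-start-time idea described above, with a simple moving average standing in for the thesis's auto-regressive moving average predictor; the job names, deadlines and runtime histories are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Job:
    name: str
    deadline: float                                      # time by which the job must finish
    history: List[float] = field(default_factory=list)   # past runtimes (seconds)

def predict_runtime(job: Job, window: int = 5) -> float:
    """Moving-average stand-in for the thesis's ARMA runtime predictor."""
    recent = job.history[-window:]
    return sum(recent) / len(recent) if recent else 0.0

def latest_start(job: Job) -> float:
    """Latest time the job can start and still be expected to meet its deadline."""
    return job.deadline - predict_runtime(job)

def deadline_order(queue: List[Job]) -> List[Job]:
    """Sort the queue so the job with the most urgent latest start time runs first."""
    return sorted(queue, key=latest_start)

if __name__ == "__main__":
    queue = [
        Job("render", deadline=3600, history=[900, 950, 1000]),
        Job("stats", deadline=1800, history=[300, 320, 280]),
        Job("sim", deadline=7200, history=[5000, 5200]),
    ]
    for job in deadline_order(queue):
        print(job.name, "latest start at t =", round(latest_start(job)))
```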