17 research outputs found

    Grid-job scheduling with reservations and preemption

    Computational grids make it possible to exploit resources across multiple clusters when grid jobs are decomposed into tasks and allocated across clusters. Grid-job tasks are often scheduled as workflows that require synchronization, and advance reservation makes it easy to guarantee predictable resource provisioning for these jobs. However, advance reservation for grid jobs creates roadblocks and fragmentation, which adversely affect system utilization and response times for local jobs. We provide a solution that incorporates relaxed reservations and uses a modified version of the standard grid-scheduling algorithm, HEFT, to obtain flexibility in placing reservations for workflow grid jobs. Furthermore, we deploy the relaxed reservation with modified HEFT as an extension of the preemption-based job scheduling framework, the SCOJO-PECT job scheduler. In SCOJO-PECT, relaxed reservations serve the additional purpose of permitting scheduler optimizations that shift the overall schedule forward. In addition, a propagation heuristic is used to alleviate the workflow job makespan extension caused by the slack of relaxed reservations. Our solution aims to decrease the fragmentation caused by grid jobs, so that local jobs and system utilization are not compromised while grid jobs still have reasonable response times.
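
As a rough illustration of the ranking step in standard HEFT (not the modified version the abstract describes, whose details are not given here), the sketch below computes the upward rank of each task in a small workflow and orders tasks by it. The task names, costs, and the `avg_cost` and `succ` tables are hypothetical values chosen for the example.

```python
from functools import lru_cache

# Hypothetical 4-task workflow (all numbers are made up for illustration).
# avg_cost: average computation cost of each task across resources.
# succ: for each task, its successors and the average communication cost.
avg_cost = {"A": 10.0, "B": 6.0, "C": 8.0, "D": 4.0}
succ = {
    "A": {"B": 2.0, "C": 3.0},
    "B": {"D": 1.0},
    "C": {"D": 5.0},
    "D": {},
}


@lru_cache(maxsize=None)
def upward_rank(task):
    """rank_u(t) = w(t) + max over successors s of (c(t, s) + rank_u(s))."""
    tail = max(
        (comm + upward_rank(s) for s, comm in succ[task].items()),
        default=0.0,  # exit tasks have no successors
    )
    return avg_cost[task] + tail


def heft_order():
    """Tasks sorted by decreasing upward rank, the order HEFT schedules them in."""
    return sorted(avg_cost, key=upward_rank, reverse=True)


if __name__ == "__main__":
    for t in heft_order():
        print(t, upward_rank(t))
```

Standard HEFT would then place each task, in this order, on the resource that gives it the earliest finish time; a relaxed reservation would presumably add slack around that placement, but that step is not sketched here.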

    Towards ServMark, an Architecture for Testing Grid Services

    Technical University of Delft, Technical Report ServMark-2006-002, July 2006. Grid computing provides a natural way to aggregate resources from different administrative domains for building large-scale distributed environments. The Web Services paradigm proposes a way by which virtual services can be seamlessly integrated into global-scale solutions to complex problems. While the usage of Grid technology ranges from academia and research to the business world and production, two issues must be considered: that the promised functionality can be accurately quantified and that the performance can be evaluated by well-defined means. Without adequate functionality demonstrators, systems cannot be tuned or adequately configured, and Web services cannot be stressed adequately in a production environment. Without performance evaluation systems, the system design and procurement processes are hampered, and the performance of Web Services in production cannot be assessed. In this paper, we present ServMark, a carefully researched tool for Grid performance evaluation. While we acknowledge that a lot of ground must be covered to fulfill the requirements of a system for testing Grid environments and Web (and Grid) Services, we believe that ServMark addresses the minimal set of critical issues.

    Group-based optimization for parallel job scheduling in clusters via heuristic search

    Job scheduling for parallel processing typically makes scheduling decisions on a per-job basis because jobs arrive dynamically. Such decision making provides limited options for finding globally best schedules. Most research uses off-line optimization, which is not realistic. We propose an optimization based on limited-size dynamic job grouping per priority class. We apply heuristic, domain-knowledge-based high-level search and branch-and-bound methods to heavy workload traces to capture good schedules. Special plan-based conservative backfilling and shifting policies are used to augment the search. Our objective is to minimize average relative response times for the long and medium job classes while keeping utilization high. The scheduling algorithm is extended from the SCOJO-PECT coarse-grain preemptive time-sharing scheduler. The proposed scheduler was evaluated using real traces and the Lublin-Feitelson synthetic workload model, and compared with the conservative SCOJO-PECT scheduler. The results are promising: the average relative response times improved by 18-32% while the loss of utilization was contained within 2%.

    Integrating multiple clusters for compute-intensive applications

    Multicluster grids provide one promising solution to satisfying the growing computational demands of compute-intensive applications. However, it is challenging to seamlessly integrate all participating clusters in different domains into a single virtual computational platform. To fully utilize the capabilities of multicluster grids, computer scientists need to join the participating autonomous systems together practically and efficiently to execute grid-enabled applications. Driven by several compute-intensive applications, this thesis develops a multicluster grid management toolkit called Pelecanus to bridge the gap between the user's needs and the system's heterogeneity. Application scientists will be able to conduct very large-scale executions across multiple clusters with transparent QoS assurance. A novel model called DA-TC (Dynamic Assignment with Task Containers) is developed and integrated into Pelecanus. This model uses the concept of a task container, which allows resource allocation to be decoupled from resource binding. It employs static load balancing for task container distribution and dynamic load balancing for task assignment. In this manner, the slowest resources become useful rather than being bottlenecks. A cluster abstraction is implemented, which not only provides cluster information for the DA-TC execution model but can also be used as a standalone toolkit to monitor and evaluate the clusters' functionality and performance. The performance of the proposed DA-TC model is evaluated both theoretically and experimentally. Results demonstrate the importance of reducing queuing time in decreasing the total turnaround time for an application. Experiments were conducted to understand the performance of various aspects of the DA-TC model. They showed that the model could significantly reduce turnaround time and increase resource utilization for the targeted application scenarios.
Four applications are implemented as case studies to determine the applicability of the DA-TC model. In each case the turnaround time is greatly reduced, which demonstrates that the DA-TC model is efficient in assisting application scientists in conducting their research. In addition, virtual resources were integrated into the DA-TC model for application execution. Experiments show that the execution model proposed in this thesis can work seamlessly with multiple hybrid grid/cloud resources to achieve reduced turnaround time.
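
The idea of decoupling resource allocation from task binding can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the Pelecanus implementation: queuing delays are ignored, each resource hosts one container, and the function names and the `speeds` parameter are hypothetical. It compares binding tasks to resources up front with letting each container pull the next task when it becomes idle.

```python
import heapq


def static_makespan(task_costs, speeds):
    """Bind tasks to resources up front (round-robin), then run them."""
    finish = [0.0] * len(speeds)
    for i, cost in enumerate(task_costs):
        r = i % len(speeds)
        finish[r] += cost / speeds[r]
    return max(finish)


def container_makespan(task_costs, speeds):
    """Each container pulls the next unassigned task as soon as it is idle."""
    # Heap of (time the container becomes free, container index).
    heap = [(0.0, r) for r in range(len(speeds))]
    heapq.heapify(heap)
    end = 0.0
    for cost in task_costs:
        free_at, r = heapq.heappop(heap)
        done = free_at + cost / speeds[r]
        end = max(end, done)
        heapq.heappush(heap, (done, r))
    return end


if __name__ == "__main__":
    tasks = [1.0] * 12       # twelve equal-sized tasks (made-up workload)
    speeds = [4.0, 1.0]      # one fast and one slow resource
    print(static_makespan(tasks, speeds))
    print(container_makespan(tasks, speeds))
```

In this toy setting the slow resource still contributes work under dynamic assignment, but it no longer dictates the finish time the way it does when half of the tasks are bound to it in advance.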

    An Inter-Cloud Meta-Scheduling (ICMS) simulation framework: architecture and evaluation

    Inter-cloud is an approach that facilitates scalable resource provisioning across multiple cloud infrastructures. In this paper, we focus on the performance optimization of Infrastructure as a Service (IaaS) using the meta-scheduling paradigm to achieve improved job scheduling across multiple clouds. We propose a novel inter-cloud job scheduling framework and implement policies to optimize the performance of participating clouds. The framework, named Inter-Cloud Meta-Scheduling (ICMS), is based on a novel message exchange mechanism that allows optimization of job scheduling metrics. The resulting system offers improved flexibility, robustness and decentralization. We implemented a toolkit named “Simulating the Inter-Cloud” (SimIC) to design and implement the different inter-cloud entities and policies in the ICMS framework. An experimental analysis of job executions in the inter-cloud is presented, with performance reported for a number of parameters such as job execution, makespan, and turnaround times. The results show that the overall performance of individual clouds, for the selected parameters and configuration, improves when they are brought together under the proposed ICMS framework.

    The Inter-cloud meta-scheduling

    Inter-cloud is a recently emerging approach that expands cloud elasticity. By facilitating an adaptable setting, it aims to realize scalable resource provisioning that enables a diversity of cloud user requirements to be handled efficiently. This study's contribution is the inter-cloud performance optimization of job executions using meta-scheduling concepts. This includes the development of the inter-cloud meta-scheduling (ICMS) framework, the ICMS optimal schemes and the SimIC toolkit. The ICMS model is an architectural strategy for managing and scheduling user services in virtualized, dynamically inter-linked clouds. This is achieved by developing a model that includes a set of algorithms, namely the Service-Request, Service-Distribution, Service-Availability and Service-Allocation algorithms. These, along with the resource management optimal schemes, provide the novel functionality of the ICMS: message exchange implements the job distribution method, VM deployment provides the VM management features, and the local resource management system handles the management of the local cloud schedulers. The resulting system offers great flexibility by providing a lightweight resource management methodology while handling the heterogeneity of different clouds through advanced service level agreement coordination. Experimental results show that the proposed ICMS model improves the performance of service distribution for a variety of criteria, such as service execution times, makespan, turnaround times, utilization levels and energy consumption rates, for various inter-cloud entities, e.g. users, hosts and VMs. For example, ICMS optimizes the performance of a non-meta-brokering inter-cloud by 3%, while ICMS with full optimal schemes achieves 9% optimization for the same configurations.
The whole experimental platform is implemented in the inter-cloud simulation toolkit (SimIC) developed by the author, which is a discrete event simulation framework.

    Scheduling and Dynamic Management of Applications over Grids

    The work presented in this thesis concerns the scheduling of applications in computational Grids. We study how to better manage jobs in a grid middleware in order to improve the performance of the platform. Our solutions are designed to work at the middleware layer, which leaves the underlying architecture unmodified. First, we propose a reallocation mechanism to dynamically tackle errors that occur during scheduling. When submitting a job to a parallel computer, it is often necessary to provide a runtime estimate so that the machine can compute a schedule. However, estimates are inherently inaccurate, so scheduling decisions are based on incorrect data and are continually called into question. The reallocation mechanism we propose tackles this problem by moving waiting jobs between several parallel machines in order to reduce the scheduling errors due to inaccurate runtime estimates. The second focus of the thesis is the scheduling of a climatology application on the Grid. To provide the best possible performance, we modeled the application as a Directed Acyclic Graph (DAG) and then proposed specific scheduling heuristics. To execute the application on the Grid, the middleware uses its knowledge of the application to find the best schedule.

    Workload characterization, modeling, and prediction in grid Computing

    Workloads play an important role in experimental performance studies of computer systems. This thesis presents a comprehensive characterization of real workloads on production clusters and Grids. A variety of correlation structures and rich scaling behavior are identified in workload attributes such as job arrivals and run times, including pseudo-periodicity, long range dependence, and strong temporal locality. Based on the analytic results, workload models are developed to fit the real data. For job arrivals, three different kinds of autocorrelations are investigated. For short to middle range dependent data, Markov modulated Poisson processes (MMPP) are good models because they can capture correlations between interarrival times while remaining analytically tractable. For long range dependent and multifractal processes, the multifractal wavelet model (MWM) is able to reconstruct the scaling behavior, and it provides a coherent wavelet framework for analysis and synthesis. Pseudo-periodicity is a special kind of autocorrelation, and it can be modeled by a matching pursuit approach. For workload attributes such as run time, a new model is proposed that can fit not only the marginal distribution but also second order statistics such as the autocorrelation function (ACF). The development of workload models enables simulation studies of Grid scheduling strategies. Using the synthetic traces, the performance impact of workload correlations on Grid scheduling is quantitatively evaluated. The results indicate that autocorrelations in workload attributes can cause performance degradation; in some situations the difference can be up to several orders of magnitude. The larger the autocorrelation, the worse the performance, and this is shown at both the cluster and the Grid level. This study shows the importance of realistic workload models in performance evaluation studies.
Regarding performance predictions, this thesis treats the targeted resources as a "black box" and takes a statistical approach. It is shown that statistical learning based methods, after a well-thought-out and fine-tuned design, are able to deliver good accuracy and performance.
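
For readers unfamiliar with the autocorrelation function mentioned in the abstract above, the following minimal sketch computes the standard sample ACF of a series at a given lag. It is a textbook estimator, not the thesis's model, and the run-time values in the usage example are made up.

```python
def autocorrelation(xs, lag):
    """Sample autocorrelation of the sequence xs at the given lag.

    Uses the common biased estimator: lagged covariance divided by the
    total sum of squared deviations, so the value at lag 0 is 1.
    """
    n = len(xs)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(xs)")
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag))
    return cov / var


if __name__ == "__main__":
    # Hypothetical job run times in seconds: similar jobs arrive in bursts,
    # so neighbouring values tend to resemble each other.
    run_times = [30, 32, 31, 600, 620, 610, 29, 33, 605, 615]
    for k in range(4):
        print(k, round(autocorrelation(run_times, k), 3))
```

A series whose ACF stays well above zero for small lags has the kind of temporal locality the abstract refers to, while a shuffled copy of the same values keeps the marginal distribution but loses that correlation.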