27 research outputs found
Communication-aware job placement policies for the KOALA grid scheduler
In multicluster systems, and more generally in grids, parallel applications may require co-allocation, i.e., the simultaneous allocation of resources such as processors in multiple clusters. Although co-allocation enables the allocation of more processors than are available in a single cluster, depending on the applications' communication characteristics, it has the potential disadvantage of increased execution times due to relatively slow wide-area communication. In this paper, we present two job placement policies, the Cluster Minimization and the Flexible Cluster Minimization policies, which take into account the wide-area communication overhead when co-allocating applications across the clusters. We have implemented these policies in our grid scheduler, called KOALA, in order to serve different job request types. To assess the performance of the policies, we perform experiments in a real multicluster testbed using communication-intensive parallel applications.
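The idea of keeping a co-allocated job on as few clusters as possible can be sketched as follows. This is only an illustration under stated assumptions, not KOALA's implementation: the function name `place_min_clusters`, the cluster names, and the greedy "most idle processors first" rule are all hypothetical choices made for this example, and the job is assumed to be splittable into arbitrary processor counts.

```python
from typing import Dict, Optional


def place_min_clusters(idle: Dict[str, int], requested: int) -> Optional[Dict[str, int]]:
    """Greedy placement that tries to use as few clusters as possible.

    `idle` maps a (hypothetical) cluster name to its idle processor count.
    Returns a mapping cluster -> processors taken, or None if the request
    cannot be satisfied with the currently idle processors.
    """
    if requested <= 0:
        return {}
    if sum(idle.values()) < requested:
        return None  # not enough idle processors across all clusters

    placement: Dict[str, int] = {}
    remaining = requested
    # Visit clusters from most to fewest idle processors (ties broken by name),
    # so that wide-area communication involves the fewest possible clusters.
    for name, free in sorted(idle.items(), key=lambda kv: (-kv[1], kv[0])):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            placement[name] = take
            remaining -= take
    return placement


if __name__ == "__main__":
    idle_processors = {"A": 16, "B": 40, "C": 24}  # made-up testbed state
    print(place_min_clusters(idle_processors, 56))
```

With the made-up state above, a 56-processor request is split over clusters B and C only, while a 30-processor request fits entirely in B and therefore needs no wide-area communication.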
Grid-job scheduling with reservations and preemption
Computational grids make it possible to exploit grid resources across multiple clusters when grid jobs are decomposed into tasks and allocated across clusters. Grid-job tasks are often scheduled in the form of workflows that require synchronization, and advance reservation makes it easy to guarantee predictable resource provisioning for these jobs. However, advance reservation for grid jobs creates roadblocks and fragmentation, which adversely affect the system utilization and response times for local jobs. We provide a solution that incorporates relaxed reservations and uses a modified version of the standard grid-scheduling algorithm, HEFT, to obtain flexibility in placing reservations for workflow grid jobs. Furthermore, we deploy the relaxed reservation with modified HEFT as an extension of the preemption-based job scheduling framework, the SCOJO-PECT job scheduler. In SCOJO-PECT, relaxed reservations serve the additional purpose of permitting scheduler optimizations that shift the overall schedule forward. In addition, a propagation heuristic is used to alleviate the extension of the workflow job makespan caused by the slack of relaxed reservations. Our solution aims at decreasing the fragmentation caused by grid jobs, so that local jobs and system utilization are not compromised, while grid jobs still have reasonable response times.
Survey and Analysis of Production Distributed Computing Infrastructures
This report has two objectives. First, we describe a set of the production
distributed infrastructures currently available, so that the reader has a basic
understanding of them. This includes explaining why each infrastructure was
created and made available and how it has succeeded and failed. The set is not
complete, but we believe it is representative.
Second, we describe the infrastructures in terms of their use, which is a
combination of how they were designed to be used and how users have found ways
to use them. Applications are often designed and created with specific
infrastructures in mind, with both an appreciation of the existing capabilities
provided by those infrastructures and an anticipation of their future
capabilities. Here, the infrastructures we discuss were often designed and
created with specific applications in mind, or at least specific types of
applications. The reader should understand how the interplay between the
infrastructure providers and the users leads to such usages, which we call
usage modalities. These usage modalities are really abstractions that exist
between the infrastructures and the applications; they influence the
infrastructures by representing the applications, and they influence the
applications by representing the infrastructures.
Topology-Aware Job Mapping
A Resource and Job Management System (RJMS) is a crucial system software part of the HPC stack. It is responsible for efficiently delivering computing power to applications in supercomputing environments. Its main intelligence relies on resource selection techniques to find the most suitable resources on which to schedule the users' jobs. This paper introduces a new method that takes into account the network topology of the machine and the application's communication pattern to determine the best choice among the available nodes of the platform. To validate our approach, we integrate this algorithm as a plugin for Slurm, a well-known and widespread RJMS. We assess our plugin with different optimization schemes by comparing it with the default topology-aware Slurm algorithm, using both emulation and simulation of a large-scale platform and by carrying out experiments in a real cluster. We show that transparently taking into account a job's communication pattern and the topology allows for relevant performance gains.
The Inter-cloud meta-scheduling
Inter-cloud is a recently emerging approach that expands cloud elasticity. By facilitating an adaptable setting, it aims at scalable resource provisioning that enables a diversity of cloud user requirements to be handled efficiently. This study's contribution is in the inter-cloud performance optimization of job executions using meta-scheduling concepts. This includes the development of the inter-cloud meta-scheduling (ICMS) framework, the ICMS optimal schemes and the SimIC toolkit. The ICMS model is an architectural strategy for managing and scheduling user services in virtualized, dynamically inter-linked clouds. This is achieved by the development of a model that includes a set of algorithms, namely the Service-Request, Service-Distribution, Service-Availability and Service-Allocation algorithms. These, along with the resource management optimal schemes, offer the novel functionalities of the ICMS: the message exchanging implements the job distribution method, the VM deployment offers the VM management features, and the local resource management system details the management of the local cloud schedulers. The resulting system offers great flexibility by facilitating a lightweight resource management methodology while at the same time handling the heterogeneity of different clouds through advanced service level agreement coordination. Experimental results show that the proposed ICMS model improves the performance of service distribution for a variety of criteria, such as service execution times, makespan, turnaround times, utilization levels and energy consumption rates, for various inter-cloud entities, e.g. users, hosts and VMs. For example, ICMS optimizes the performance of a non-meta-brokering inter-cloud by 3%, while ICMS with full optimal schemes achieves 9% optimization for the same configurations. The whole experimental platform is implemented in the inter-cloud simulation toolkit (SimIC) developed by the author, which is a discrete event simulation framework.
Energy-efficient Nature-Inspired techniques in Cloud computing datacenters
Cloud computing is a systematic delivery of computing resources as services to the consumers via the Internet. Infrastructure
as a Service (IaaS) is the capability provided to the consumer by enabling smarter access to the processing, storage,
networks, and other fundamental computing resources, where the consumer can deploy and run arbitrary software including
operating systems and applications. The resources are sometimes available in the form of Virtual Machines (VMs). Cloud
services are provided to the consumers based on demand, and are billed accordingly. Usually, the VMs run in various
datacenters, which comprise many computing resources that consume large amounts of energy, resulting in hazardous levels of carbon
emissions into the atmosphere. Several researchers have proposed various energy-efficient methods for reducing the energy
consumption in datacenters. One such family of solutions is the Nature-Inspired algorithms. To this end, this paper presents a
comprehensive review of the state-of-the-art Nature-Inspired algorithms suggested for solving the energy issues in Cloud
datacenters. A taxonomy is followed that focuses on three key dimensions in the literature: virtualization, consolidation,
and energy-awareness. A qualitative review of each technique is carried out considering its key goal, method, advantages, and
limitations. The Nature-Inspired algorithms are compared based on their features to indicate their utilization of resources
and their level of energy-efficiency. Finally, potential research directions are identified in energy optimization in datacenters.
This review enables researchers and professionals in Cloud computing datacenters to understand the evolution of the literature
and to explore better energy-efficient methods for Cloud computing datacenters.
Energy-efficient resource allocation scheme based on enhanced flower pollination algorithm for cloud computing data center
Cloud Computing (CC) has rapidly emerged as a successful paradigm for providing ICT infrastructure. Efficient and environmentally friendly resource allocation mechanisms, responsible for allocating Cloud data center resources to execute user applications in the form of requests, are undoubtedly required. One of the promising Nature-Inspired techniques for addressing virtualization, consolidation and energy-awareness problems is the Flower Pollination Algorithm (FPA). However, FPA suffers from entrapment in local optima, and its static control parameters cannot maintain a balance between local and global search, which can also lead to high energy consumption and inadequate resource utilization. This research developed an enhanced FPA-based energy-efficient resource allocation scheme for Cloud data centers that provides efficient resource utilization and energy efficiency with a lower probability of Service Level Agreement (SLA) violations. Firstly, an Enhanced Flower Pollination Algorithm for Energy-Efficient Virtual Machine Placement (EFPA-EEVMP) was developed. In this algorithm, a Dynamic Switching Probability (DSP) strategy was adopted to balance the local and global search in FPA, which is used to minimize the energy consumption and maximize resource utilization. Secondly, a Multi-Objective Hybrid Flower Pollination Resource Consolidation (MOH-FPRC) algorithm was developed. In this algorithm, Local Neighborhood Search (LNS) and Pareto optimisation strategies were combined with a clustering algorithm to avoid local trapping and to address Cloud service providers' conflicting objectives, such as energy consumption and SLA violation. Lastly, an Energy-Aware Multi-Cloud Flower Pollination Optimization (EAM-FPO) scheme was developed for distributed Multi-Cloud data center environments. In this scheme, Power Usage Effectiveness (PUE) and a migration controller were utilised to obtain the optimal solution in a larger search space of the CC environment. The scheme was tested on the MultiRecCloudSim simulator. Results of the simulation were compared with OEMACS, ACS-VMC, and EA-DP. The scheme improved data center energy consumption by 20.5%, resource utilization by 23.9%, and SLA violation by 13.5%. The combined algorithms reduced entrapment and maintained a balance between local and global search. Therefore, based on the findings, the developed scheme has proven to be efficient in minimizing energy consumption while at the same time improving data center resource allocation with minimum SLA violation.
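The role of a dynamic switching probability in FPA can be illustrated with a small sketch. It is not the EFPA-EEVMP algorithm from the abstract above: the linear decay schedule, the parameter values (`p_start`, `p_end`, population size), the function names, and the toy sphere objective (standing in for an energy or utilization cost) are all assumptions made for this example. The global step uses a Lévy flight toward the current best solution and the local step uses the difference of two random population members, as in the standard FPA.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one Levy-distributed step using Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    # Small offset guards against division by zero when v is exactly 0.
    return u / (abs(v) + 1e-12) ** (1 / beta)


def switch_probability(t, t_max, p_start=0.8, p_end=0.2):
    """Assumed schedule: decay linearly from p_start to p_end over t_max steps."""
    return p_start - (p_start - p_end) * t / t_max


def fpa_minimize(f, dim, bounds, pop=20, iters=200, seed=0):
    """Minimize f over a box using FPA with a time-varying switch probability."""
    rng = random.Random(seed)
    lo, hi = bounds
    flowers = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    fitness = [f(x) for x in flowers]
    best_i = min(range(pop), key=fitness.__getitem__)
    best, best_fit = flowers[best_i][:], fitness[best_i]

    for t in range(iters):
        p = switch_probability(t, iters)  # early: mostly global, late: mostly local
        for i in range(pop):
            if rng.random() < p:
                # Global pollination: Levy flight toward the best solution so far.
                cand = [x + levy_step(rng) * (b - x) for x, b in zip(flowers[i], best)]
            else:
                # Local pollination: move along the difference of two random flowers.
                j, k = rng.sample(range(pop), 2)
                eps = rng.random()
                cand = [x + eps * (a - b)
                        for x, a, b in zip(flowers[i], flowers[j], flowers[k])]
            cand = [min(hi, max(lo, c)) for c in cand]  # clip to the search box
            c_fit = f(cand)
            if c_fit < fitness[i]:  # greedy acceptance
                flowers[i], fitness[i] = cand, c_fit
                if c_fit < best_fit:
                    best, best_fit = cand[:], c_fit
    return best, best_fit


def sphere(x):
    """Toy objective used only for illustration."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(fpa_minimize(sphere, dim=2, bounds=(-5.0, 5.0)))
```

Starting with a high switch probability favours exploration through global Lévy flights, and lowering it over time shifts the search toward local refinement; a static value would fix that trade-off for the whole run.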
An interoperable and self-adaptive approach for SLA-based service virtualization in heterogeneous Cloud environments
Cloud computing is a newly emerged computing infrastructure that builds on the latest achievements of diverse research areas, such as Grid computing, Service-oriented computing, business process management and virtualization. An important characteristic of Cloud-based services is the provision of non-functional guarantees in the form of Service Level Agreements (SLAs), such as guarantees on execution time or price. However, due to system malfunctions, changing workload conditions, and hardware and software failures, established SLAs can be violated. In order to avoid costly SLA violations, flexible and adaptive SLA attainment strategies are needed. In this paper we present a self-manageable architecture for SLA-based service virtualization that provides a way to ease interoperable service executions in a diverse, heterogeneous, distributed and virtualized world of services. We demonstrate that the combination of negotiation, brokering and deployment using SLA-aware extensions and autonomic computing principles is required for achieving reliable and efficient service operation in distributed environments. © 2012 Elsevier B.V. All rights reserved.
Methodology for malleable applications on distributed memory systems
(English) The dominant programming approach for scientific and industrial computing on clusters is MPI+X. While there are a variety of approaches within the node, denoted by the "X", Message Passing Interface (MPI) is the standard for programming multiple nodes with distributed memory. This thesis argues that the OmpSs-2 tasking model can be extended beyond the node to naturally support distributed memory, with three benefits:
First, at small to medium scale the tasking model is a simpler and more productive alternative to MPI. It eliminates the need to distribute the data explicitly and convert all dependencies into explicit message passing. It also avoids the complexity of hybrid programming using MPI+X.
Second, the ability to offload parts of the computation among the nodes enables the runtime to automatically balance the loads in a full-scale MPI+X program. This approach does not require a cost model, and it is able to transparently balance the computational loads across the whole program, on all its nodes.
Third, because the runtime handles all low-level aspects of data distribution and communication, it can change the resource allocation dynamically, in a way that is transparent to the application.
This thesis describes the design, development and evaluation of OmpSs-2@Cluster, a programming model and runtime system that extends the OmpSs-2 model to allow a virtually unmodified OmpSs-2 program to run across multiple distributed memory nodes. For well-balanced applications it provides similar performance to MPI+OpenMP on up to 16 nodes, and it improves performance by up to 2x for irregular and unbalanced applications like Cholesky factorization.
This work also extended OmpSs-2@Cluster for interoperability with MPI and Barcelona Supercomputing Center (BSC)'s state-of-the-art Dynamic Load Balance (DLB) library in order to dynamically balance MPI+OmpSs-2 applications by transparently offloading tasks among nodes. This approach reduces the execution time of a microscale solid mechanics application by 46% on 64 nodes, and on a synthetic benchmark it is within 10% of perfect load balancing on up to 8 nodes.
Finally, the runtime was extended to transparently support malleability for pure OmpSs-2@Cluster programs and interoperate with the Resources Management System (RMS). The only change to the application is to explicitly call an API function to control the addition or removal of nodes. In this regard we additionally provide the runtime with the ability to semi-transparently save and recover part of the application status to perform checkpoint and restart. Such a feature hides the complexity of data redistribution and parallel IO from the user while allowing the program to recover and continue previous executions. Our work is a starting point for future research on fault tolerance.
In summary, OmpSs-2@Cluster expands the OmpSs-2 programming model to encompass distributed memory clusters. It allows an existing OmpSs-2 program, with few if any changes, to run across multiple nodes. OmpSs-2@Cluster supports transparent multi-node dynamic load balancing for MPI+OmpSs-2 programs, and enables semi-transparent malleability for OmpSs-2@Cluster programs. The runtime system has a high level of stability and performance, and it opens several avenues for future work.
(Español) The dominant programming model for clusters in both science and industry is currently MPI+X. Although there are a variety of alternatives for programming within a node (denoted by the "X"), the standard for programming multiple nodes with distributed memory remains Message Passing Interface (MPI). This thesis proposes extending the task-based programming model OmpSs-2 to work on distributed memory systems, highlighting three main benefits: First, at small and medium scale, a task-based model is simpler and more productive than MPI, and it eliminates the need to distribute the data explicitly and convert all dependencies into messages. It also avoids the complexity of hybrid MPI+X programming. Second, the ability to send parts of the computation between nodes allows the library to balance the workload in large-scale MPI+X programs. This approach does not need a cost model and makes it possible to balance loads across the whole program and all nodes. Third, because the library handles all aspects of data distribution and transfer, the resources used by the application can be modified dynamically and transparently.
This thesis describes the design, development and evaluation of OmpSs-2@Cluster, a programming model and library that extends OmpSs-2, allowing existing OmpSs-2 programs to run on multiple nodes with practically no modification. For balanced applications, this model provides performance similar to MPI+OpenMP on up to 16 nodes, and it doubles the performance of irregular or unbalanced applications such as Cholesky factorization. This work includes the extension of OmpSs-2@Cluster to interoperate with MPI and the Dynamic Load Balancing (DLB) library developed at the Barcelona Supercomputing Center (BSC). This makes it possible to balance MPI+OmpSs-2 applications through the transparent transfer of tasks between nodes. This approach reduces the execution time of a micro-scale solid mechanics application by 46% on 64 nodes; in some experiments on up to 8 nodes, the load was balanced to within 10% of perfect balance. Finally, another extension of the library was implemented to perform malleability operations in OmpSs-2@Cluster programs and to interact with the Resource Management System (RMS). The only change required in the application is an explicit call to an interface function that controls the addition or removal of nodes. In addition, functionality was added to save and recover part of the application state semi-transparently, in order to perform checkpoint-restart operations. This functionality hides from the user the complexity of data redistribution and parallel read-write operations, while allowing the program to recover and continue previous executions. This is a starting point for future research on fault tolerance. In summary, OmpSs-2@Cluster extends the OmpSs-2 programming model to cover distributed memory systems.
The model allows OmpSs-2 programs to run on multiple nodes with practically no modification. OmpSs-2@Cluster also enables dynamic load balancing in hybrid MPI+OmpSs-2 applications running on several nodes and can perform semi-transparent malleability in pure OmpSs-2@Cluster programs. The library has high levels of performance and stability and opens several paths for future work.
Computer architecture