Evaluation of Reallocation Heuristics for Moldable Tasks in Computational Dedicated and non Dedicated Grids
Grid services often consist of remote sequential or rigid parallel application executions. However, moldable parallel applications, linear algebra solvers for example, are of great interest but require dynamic tuning, which mostly has to be done interactively if performance is needed. Thus, their grid execution depends on a remote and transparent submission to a possibly different batch scheduler on each site, and requires automatic tuning of the job according to the local load. In this report we study the benefits of having a middleware able to automatically submit and reallocate requests from one site to another when it is also able to configure the services by tuning their number of processors and their walltime. In this context, we evaluate the benefits of such mechanisms on four multi-cluster Grid setups, where the platform is composed of several clusters that are either heterogeneous or homogeneous, dedicated or non-dedicated. Different scenarios are explored using simulations of real cluster traces from different origins. Results show that a simple scheduling heuristic is good and often the best. Indeed, it is faster and thus can take more jobs into account while having a small execution time. Moreover, users can expect more jobs finishing sooner and a gain on the average job response time between 10% and 40% in most cases if this reallocation mechanism combined with auto-tuning capabilities is implemented in a Grid framework. The implementation and maintenance of this heuristic coupled with the migration mechanism in a Grid middleware is also simpler because fewer transfers are involved.
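As a rough illustration of the auto-tuning idea in this abstract (choosing a processor count and a walltime for a moldable job according to the local load), the sketch below picks the allocation with the earliest estimated finish time. Everything in it is an assumption made for illustration: the Amdahl's-law runtime model, the `est_wait` table of estimated queue waits per processor count, and the function names are hypothetical and are not taken from the report.

```python
from typing import Dict, Optional, Tuple


def amdahl_runtime(seq_time: float, parallel_fraction: float, procs: int) -> float:
    """Estimated runtime on `procs` processors under Amdahl's law (assumed model)."""
    return seq_time * ((1.0 - parallel_fraction) + parallel_fraction / procs)


def pick_allocation(
    seq_time: float,
    parallel_fraction: float,
    est_wait: Dict[int, float],
) -> Optional[Tuple[int, float, float]]:
    """Return (procs, walltime, finish_time) with the earliest estimated finish.

    `est_wait` maps a candidate processor count to the estimated queue wait
    on the local cluster for a request of that size (hypothetical input).
    """
    best: Optional[Tuple[int, float, float]] = None
    for procs, wait in sorted(est_wait.items()):
        walltime = amdahl_runtime(seq_time, parallel_fraction, procs)
        finish = wait + walltime
        if best is None or finish < best[2]:
            best = (procs, walltime, finish)
    return best


# Hypothetical load: larger requests wait longer in the queue.
choice = pick_allocation(100.0, 0.9, {1: 0.0, 4: 5.0, 16: 60.0})
print(choice)
```

With these made-up numbers the mid-sized request wins: 16 processors would run faster, but the longer estimated wait outweighs the gain.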
Reliable Provisioning of Spot Instances for Compute-intensive Applications
Cloud computing providers are now offering their unused resources for leasing
in the spot market, which has been considered the first step towards a
full-fledged market economy for computational resources. Spot instances are
virtual machines (VMs) available at lower prices than their standard on-demand
counterparts. These VMs will run for as long as the current price is lower than
the maximum bid price users are willing to pay per hour. Spot instances have
been increasingly used for executing compute-intensive applications. Despite
their apparent economic advantage, the intermittent nature of biddable
resources means that application execution times may be prolonged, or
applications may not finish at all. This paper proposes a resource allocation strategy that addresses the
problem of running compute-intensive jobs on a pool of intermittent virtual
machines, while also aiming to run applications in a fast and economical way.
To mitigate potential unavailability periods, a multifaceted fault-aware
resource provisioning policy is proposed. Our solution employs price and
runtime estimation mechanisms, as well as three fault tolerance techniques,
namely checkpointing, task duplication and migration. We evaluate our
strategies using trace-driven simulations, which take as input real price
variation traces, as well as an application trace from the Parallel Workload
Archive. Our results demonstrate the effectiveness of executing applications on
spot instances, respecting QoS constraints, despite occasional failures. Comment: 8 pages, 4 figures
"Virtual malleability" applied to MPI jobs to improve their execution in a multiprogrammed environment
This work focuses on the scheduling of MPI jobs executing on shared-memory multiprocessors (SMPs). The objective was to obtain the best response time in multiprogrammed multiprocessor systems using batch systems, assuming all jobs have the same priority. To that end, the benefits of supporting malleability in MPI jobs to reduce fragmentation, and consequently improve system performance, were studied. The contributions of this work can be summarized as follows:
· Virtual malleability: a mechanism where a job is assigned a dynamic processor partition in which the number of processes is greater than the number of processors. The partition size is modified at runtime according to external requirements such as the system load, by varying the multiprogramming level, which makes the job contend for resources with itself. In addition, a mechanism decides at runtime whether to apply local or global process queues to an application, depending on the load balance among its processes.
· A job scheduling policy that takes decisions such as how many processes to start with and the maximum multiprogramming degree, based on the type and number of applications running and queued. Moreover, as soon as a job finishes and there are queued jobs, the algorithm analyzes whether it is better to start another job immediately or to wait until more resources are available.
· A new alternative to backfilling strategies for the problem of the execution time window expiring. Virtual malleability is applied to the backfilled job, reducing its partition size without aborting or suspending it as in traditional backfilling.
The evaluation of this thesis was done using a practical approach. All the proposals were implemented, modifying the three scheduling levels: queuing system, processor scheduler and runtime library.
The impact of the contributions was studied under several types of workloads, varying machine utilization, the communication and balance degree of the applications, multiprogramming level, and job size. Results showed that it is possible to offer malleability for MPI jobs. An application obtained better performance when contending for resources with itself than with other applications, especially in workloads with high machine utilization. Load imbalance was taken into account, and better performance was obtained by applying the right queue type to each application independently.
The proposed job scheduling policy exploited virtual malleability by choosing some parameters at the beginning of execution, such as the number of processes and the maximum multiprogramming level. It performed well under bursty workloads with low to medium machine utilization. However, as the load increased, virtual malleability was not enough: when the machine is heavily loaded, jobs that have been shrunk are not able to expand, so they must execute the whole time with a partition smaller than the job size, which degrades performance. At that point the job scheduling policy concentrated on moldability only.
Fragmentation was also alleviated by applying backfilling techniques to the job scheduling algorithm. Virtual malleability proved to be an interesting improvement for the window expiring problem. Backfilled jobs, even on a smaller partition, can continue execution, reducing the memory swapping generated by aborts and suspensions. In this way the queueing system does not have to reinsert the backfilled job in the queue and re-execute it in the future.
Postprint (published version)
ReSHAPE: A Framework for Dynamic Resizing and Scheduling of Homogeneous Applications in a Parallel Environment
Applications in science and engineering often require huge computational
resources for solving problems within a reasonable time frame. Parallel
supercomputers provide the computational infrastructure for solving such
problems. A traditional application scheduler running on a parallel cluster
only supports static scheduling where the number of processors allocated to an
application remains fixed throughout the lifetime of execution of the job. Due
to the unpredictability in job arrival times and varying resource requirements,
static scheduling can result in idle system resources thereby decreasing the
overall system throughput. In this paper we present a prototype framework
called ReSHAPE, which supports dynamic resizing of parallel MPI applications
executed on distributed memory platforms. The framework includes a scheduler
that supports resizing of applications, an API to enable applications to
interact with the scheduler, and a library that makes resizing viable.
Applications executed using the ReSHAPE scheduler framework can expand to take
advantage of additional free processors or can shrink to accommodate a high
priority application, without getting suspended. In our research, we have
mainly focused on structured applications that have two-dimensional data arrays
distributed across a two-dimensional processor grid. The resize library
includes algorithms for processor selection and processor mapping. Experimental
results show that the ReSHAPE framework can improve individual job turn-around
time and overall system throughput. Comment: 15 pages, 10 figures, 5 tables. Submitted to International Conference
on Parallel Processing (ICPP'07)
Scheduling Task-parallel Applications in Dynamically Asymmetric Environments
Shared resource interference is observed by applications as dynamic
performance asymmetry. Prior art has developed approaches to reduce the impact
of performance asymmetry mainly at the operating system and architectural
levels. In this work, we study how application-level scheduling techniques can
leverage moldability (i.e. flexibility to work as either single-threaded or
multithreaded task) and explicit knowledge on task criticality to handle
scenarios in which system performance is not only unknown but also changing
over time. Our proposed task scheduler dynamically learns the performance
characteristics of the underlying platform and uses this knowledge to devise
better schedules aware of dynamic performance asymmetry, hence reducing the
impact of interference. Our evaluation shows that both criticality-aware
scheduling and parallelism tuning are effective schemes to address interference
in both shared and distributed memory applications. Comment: Published in ICPP Workshops '2
ERASE: Energy Efficient Task Mapping and Resource Management for Work Stealing Runtimes
Parallel applications often rely on work stealing schedulers in combination with fine-grained tasking to achieve high performance and scalability. However, reducing the total energy consumption in the context of work stealing runtimes is still challenging, particularly when using asymmetric architectures with different types of CPU cores. A common approach for energy savings involves dynamic voltage and frequency scaling (DVFS) wherein throttling is carried out based on factors like task parallelism, stealing relations, and task criticality. This article makes the following observations: (i) leveraging DVFS on a per-task basis is impractical when using fine-grained tasking and in environments with cluster/chip-level DVFS; (ii) task moldability, wherein a single task can execute on multiple threads/cores via work-sharing, can help to reduce energy consumption; and (iii) mismatch between tasks and assigned resources (i.e., core type and number of cores) can detrimentally impact energy consumption. In this article, we propose EneRgy Aware SchedulEr (ERASE), an intra-application task scheduler on top of work stealing runtimes that aims to reduce the total energy consumption of parallel applications. It achieves energy savings by guiding scheduling decisions based on per-task energy consumption predictions of different resource configurations. In addition, ERASE is capable of adapting to both given static frequency settings and externally controlled DVFS. Overall, ERASE achieves up to 31% energy savings and improves performance by 44% on average, compared to the state-of-the-art DVFS-based schedulers
Optimisation of LHCb Applications for Multi- and Manycore Job Submission
Nowadays, the Worldwide LHC Computing Grid mainly consists of multi- and manycore processors. The thesis investigates how such resources can be used more efficiently, using the LHCb experiment as an example. It analyses how to improve software in terms of memory requirements and concurrency. The research involves the implementation of a moldable job scheduler and a supervised learning algorithm which helps to better predict LHCb workloads.
Holistic Slowdown Driven Scheduling and Resource Management for Malleable Jobs
In job scheduling, the concept of malleability has been explored for many
years. Research shows that malleability improves system performance, but
its use in HPC never became widespread. The causes are the difficulty
in developing malleable applications, and the lack of support and integration
across the different layers of the HPC software stack. However, in recent years,
malleability in job scheduling has become more critical because of the
increasing complexity of hardware and workloads. In this context, using nodes
in exclusive mode, as traditional HPC jobs do, is not always the most efficient
solution: those applications were highly tuned for static allocations but
offered zero flexibility for dynamic execution. This paper
proposes a new holistic, dynamic job scheduling policy, Slowdown Driven
(SD-Policy), which exploits the malleability of applications as the key
technology to reduce the average slowdown and response time of jobs. SD-Policy
is based on backfill and node sharing. It applies malleability to running jobs
to make room for jobs that will run with a reduced set of resources, only when
the estimated slowdown improves over the static approach. We implemented
SD-Policy in SLURM and evaluated it in a real production environment, and with
a simulator using workloads of up to 198K jobs. Results show better resource
utilization, with reductions in makespan, response time, slowdown, and energy
consumption of up to 7%, 50%, 70%, and 6%, respectively, for the evaluated
workloads.
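The decision rule in this abstract (shrink resources so a waiting job can start early, but only when the estimated slowdown improves over the static approach) can be illustrated with a small sketch. It assumes slowdown is response time divided by the runtime on the requested allocation, and that runtime scales linearly with the number of nodes; both assumptions, and all function names, are hypothetical and not taken from SD-Policy or SLURM. The sketch also ignores the cost imposed on the running job that gives up nodes.

```python
def slowdown(wait: float, run: float, reference_run: float) -> float:
    """Response time (wait + run) divided by the runtime on the requested allocation."""
    return (wait + run) / reference_run


def start_on_reduced_nodes(
    wait_static: float,
    run_static: float,
    requested_nodes: int,
    reduced_nodes: int,
) -> bool:
    """True if starting now on fewer nodes gives a lower estimated slowdown.

    Static option: wait `wait_static`, then run `run_static` on all requested nodes.
    Malleable option: start immediately on `reduced_nodes`; the runtime is assumed
    to scale linearly with the node count (an illustrative assumption).
    """
    run_reduced = run_static * requested_nodes / reduced_nodes
    static_sd = slowdown(wait_static, run_static, run_static)
    malleable_sd = slowdown(0.0, run_reduced, run_static)
    return malleable_sd < static_sd


# Long queue wait: running on half the nodes right away is estimated to be better.
print(start_on_reduced_nodes(100.0, 50.0, 8, 4))
# Short queue wait: waiting for the full allocation is estimated to be better.
print(start_on_reduced_nodes(10.0, 50.0, 8, 4))
```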
Provisioning Spot Market Cloud Resources to Create Cost-effective Virtual Clusters
Infrastructure-as-a-Service providers are offering their unused resources in
the form of variable-priced virtual machines (VMs), known as "spot instances",
at prices significantly lower than their standard fixed-priced resources. To
lease spot instances, users specify a maximum price they are willing to pay per
hour and VMs will run only when the current price is lower than the user's bid.
This paper proposes a resource allocation policy that addresses the problem of
running deadline-constrained compute-intensive jobs on a pool composed
solely of spot instances, while exploiting variations in price and performance
to run applications in a fast and economical way. Our policy relies on job
runtime estimations to decide which types of VMs are best suited to run each job
and when jobs should run. Several estimation methods are evaluated and
compared, using trace-based simulations, which take real price variation traces
obtained from Amazon Web Services as input, as well as an application trace
from the Parallel Workload Archive. Results demonstrate the effectiveness of
running computational jobs on spot instances, at a fraction (up to 60% lower)
of the price they would normally cost on fixed-price resources. Comment: 14 pages, 4 figures, 11th International Conference on Algorithms and
Architectures for Parallel Processing (ICA3PP-11); Lecture Notes in Computer
Science, Vol. 7016, 201
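The bidding rule described in this abstract (a VM runs only while the current spot price is lower than the user's bid) can be sketched as follows. The price trace format, the function name `running_intervals`, and the sample prices are hypothetical and chosen for illustration; they are not the paper's allocation policy and not real Amazon Web Services data.

```python
from typing import List, Tuple


def running_intervals(
    prices: List[Tuple[float, float]],
    bid: float,
    end_time: float,
) -> List[Tuple[float, float]]:
    """Return the (start, stop) intervals during which a spot VM would run.

    `prices` is a time-sorted list of (time_in_hours, price); each price holds
    until the next entry, and the last one holds until `end_time`.
    The VM runs only while the price is strictly lower than `bid`.
    """
    intervals: List[Tuple[float, float]] = []
    for i, (start, price) in enumerate(prices):
        stop = prices[i + 1][0] if i + 1 < len(prices) else end_time
        if price < bid:
            if intervals and intervals[-1][1] == start:
                # Extend the previous interval when availability is contiguous.
                intervals[-1] = (intervals[-1][0], stop)
            else:
                intervals.append((start, stop))
    return intervals


# Hypothetical trace: the price spikes above the bid between hours 2 and 3.
trace = [(0.0, 0.10), (2.0, 0.30), (3.0, 0.12), (5.0, 0.11)]
print(running_intervals(trace, bid=0.20, end_time=6.0))
```

The gap between the returned intervals is the kind of unavailability period that a provisioning policy has to plan around.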