9 research outputs found
Budget Constrained Execution of Multiple Bag-of-Tasks Applications on the Cloud
Optimising the execution of Bag-of-Tasks (BoT) applications on the cloud is a
hard problem due to the trade- offs between performance and monetary cost. The
problem can be further complicated when multiple BoT applications need to be
executed. In this paper, we propose and implement a heuristic algorithm that
schedules tasks of multiple applications onto different cloud virtual machines
in order to maximise performance while satisfying a given budget constraint.
Current approaches are limited in task scheduling since they place a limit on
the number of cloud resources that can be employed by the applications.
However, in the proposed algorithm there are no such limits, and in comparison
with other approaches, the algorithm on average achieves an improved
performance of 10%. The experimental results also highlight that the algorithm
yields consistent performance even with low budget constraints which cannot be
achieved by competing approaches.Comment: 8th IEEE International Conference on Cloud Computing (CLOUD 2015
Task Scheduling on the Cloud with Hard Constraints
Scheduling Bag-of-Tasks (BoT) applications on the cloud can be more
challenging than grid and cluster environ- ments. This is because a user may
have a budgetary constraint or a deadline for executing the BoT application in
order to keep the overall execution costs low. The research in this paper is
motivated to investigate task scheduling on the cloud, given two hard
constraints based on a user-defined budget and a deadline. A heuristic
algorithm is proposed and implemented to satisfy the hard constraints for
executing the BoT application in a cost effective manner. The proposed
algorithm is evaluated using four scenarios that are based on the trade-off
between performance and the cost of using different cloud resource types. The
experimental evaluation confirms the feasibility of the algorithm in satisfying
the constraints. The key observation is that multiple resource types can be a
better alternative to using a single type of resource.Comment: Visionary Track of the IEEE 11th World Congress on Services (IEEE
SERVICES 2015
Iterative-improvement-based heuristics for adaptive scheduling of tasks sharing files on heterogeneous master-slave environments
The scheduling of independent but file-sharing tasks on heterogeneous master-slave platforms has recently found important applications in Grid environments. The scheduling heuristics recently proposed for this problem are all constructive in nature and based on a common greedy criterion which depends on the momentary completion time values of the tasks. We show that this greedy decision criterion has shortcomings in exploiting the file-sharing interaction among tasks since completion time values are inadequate to extract the global view of this interaction. We propose a three-phase scheduling approach which involves initial task assignment, refinement, and execution ordering phases. For the refinement phase, we model the target application as a hypergraph and, with an elegant hypergraph-partitioning-like formulation, we propose using iterative-improvement-based heuristics for refining the task assignments according to two novel objective functions. Unlike the turnaround time, which is the actual schedule cost, the smoothness of proposed objective functions enables the use of iterative-improvement-based heuristics successfully since their effectiveness and efficiency depend on the smoothness of the objective function. Experimental results on a wide range of synthetically generated heterogeneous master-slave frameworks show that the proposed three-phase scheduling approach performs much better than the greedy constructive approach
Iterative-improvement-based heuristics for adaptive scheduling of tasks sharing files on heterogeneous master-slave environments
Cataloged from PDF version of article.Kaya, KamerM.S
Recommended from our members
Development and application of the dynamic system doctor to nuclear reactor probabilistic risk assessments.
This LDRD project has produced a tool that makes probabilistic risk assessments (PRAs) of nuclear reactors - analyses which are very resource intensive - more efficient. PRAs of nuclear reactors are being increasingly relied on by the United States Nuclear Regulatory Commission (U.S.N.R.C.) for licensing decisions for current and advanced reactors. Yet, PRAs are produced much as they were 20 years ago. The work here applied a modern systems analysis technique to the accident progression analysis portion of the PRA; the technique was a system-independent multi-task computer driver routine. Initially, the objective of the work was to fuse the accident progression event tree (APET) portion of a PRA to the dynamic system doctor (DSD) created by Ohio State University. Instead, during the initial efforts, it was found that the DSD could be linked directly to a detailed accident progression phenomenological simulation code - the type on which APET construction and analysis relies, albeit indirectly - and thereby directly create and analyze the APET. The expanded DSD computational architecture and infrastructure that was created during this effort is called ADAPT (Analysis of Dynamic Accident Progression Trees). ADAPT is a system software infrastructure that supports execution and analysis of multiple dynamic event-tree simulations on distributed environments. A simulator abstraction layer was developed, and a generic driver was implemented for executing simulators on a distributed environment. As a demonstration of the use of the methodological tool, ADAPT was applied to quantify the likelihood of competing accident progression pathways occurring for a particular accident scenario in a particular reactor type using MELCOR, an integrated severe accident analysis code developed at Sandia. (ADAPT was intentionally created with flexibility, however, and is not limited to interacting with only one code. With minor coding changes to input files, ADAPT can be linked to other such codes.) The results of this demonstration indicate that the approach can significantly reduce the resources required for Level 2 PRAs. From the phenomenological viewpoint, ADAPT can also treat the associated epistemic and aleatory uncertainties. This methodology can also be used for analyses of other complex systems. Any complex system can be analyzed using ADAPT if the workings of that system can be displayed as an event tree, there is a computer code that simulates how those events could progress, and that simulator code has switches to turn on and off system events, phenomena, etc. Using and applying ADAPT to particular problems is not human independent. While the human resources for the creation and analysis of the accident progression are significantly decreased, knowledgeable analysts are still necessary for a given project to apply ADAPT successfully. This research and development effort has met its original goals and then exceeded them
Résolution triangulaire de systèmes linéaires creux de grande taille dans un contexte parallèle multifrontal et hors-mémoire
Nous nous intéressons à la résolution de systèmes linéaires creux de très grande taille par des méthodes directes de factorisation. Dans ce contexte, la taille de la matrice des facteurs constitue un des facteurs limitants principaux pour l'utilisation de méthodes directes de résolution. Nous supposons donc que la matrice des facteurs est de trop grande taille pour être rangée dans la mémoire principale du multiprocesseur et qu'elle a donc été écrite sur les disques locaux (hors-mémoire : OOC) d'une machine multiprocesseurs durant l'étape de factorisation. Nous nous intéressons à l'étude et au développement de techniques efficaces pour la phase de résolution après une factorization multifrontale creuse. La phase de résolution, souvent négligée dans les travaux sur les méthodes directes de résolution directe creuse, constitue alors un point critique de la performance de nombreuses applications scientifiques, souvent même plus critique que l'étape de factorisation. Cette thèse se compose de deux parties. Dans la première partie nous nous proposons des algorithmes pour améliorer la performance de la résolution hors-mémoire. Dans la deuxième partie nous pousuivons ce travail en montrant comment exploiter la nature creuse des seconds membres pour réduire le volume de données accédées en mémoire. Dans la première partie de cette thèse nous introduisons deux approches de lecture des données sur le disque dur. Nous montrons ensuite que dans un environnement parallèle le séquencement des tâches peut fortement influencer la performance. Nous prouvons qu'un ordonnancement contraint des tâches peut être introduit; qu'il n'introduit pas d'interblocage entre processus et qu'il permet d'améliorer les performances. Nous conduisons nos expériences sur des problèmes industriels de grande taille (plus de 8 Millions d'inconnues) et utilisons une version hors-mémoire d'un code multifrontal creux appelé MUMPS (solveur multifrontal parallèle). Dans la deuxième partie de ce travail nous nous intéressons au cas de seconds membres creux multiples. Ce problème apparaît dans des applications en electromagnétisme et en assimilation de données et résulte du besoin de calculer l'espace propre d'une matrice fortement déficiente, du calcul d'éléments de l'inverse de la matrice associée aux équations normales pour les moindres carrés linéaires ou encore du traitement de matrices fortement réductibles en programmation linéaire. Nous décrivons un algorithme efficace de réduction du volume d'Entrées/Sorties sur le disque lors d'une résolution hors-mémoire. Plus généralement nous montrons comment le caractère creux des seconds -membres peut être exploité pour réduire le nombre d'opérations et le nombre d'accès à la mémoire lors de l'étape de résolution. Le travail présenté dans cette thèse a été partiellement financé par le projet SOLSTICE de l'ANR (ANR-06-CIS6-010). ABSTRACT : We consider the solution of very large systems of linear equations with direct multifrontal methods. In this context the size of the factors is an important limitation for the use of sparse direct solvers. We will thus assume that the factors have been written on the local disks of our target multiprocessor machine during parallel factorization. Our main focus is the study and the design of efficient approaches for the forward and backward substitution phases after a sparse multifrontal factorization. These phases involve sparse triangular solution and have often been neglected in previous works on sparse direct factorization. In many applications, however, the time for the solution can be the main bottleneck for the performance. This thesis consists of two parts. The focus of the first part is on optimizing the out-of-core performance of the solution phase. The focus of the second part is to further improve the performance by exploiting the sparsity of the right-hand side vectors. In the first part, we describe and compare two approaches to access data from the hard disk. We then show that in a parallel environment the task scheduling can strongly influence the performance. We prove that a constraint ordering of the tasks is possible; it does not introduce any deadlock and it improves the performance. Experiments on large real test problems (more than 8 million unknowns) using an out-of-core version of a sparse multifrontal code called MUMPS (MUltifrontal Massively Parallel Solver) are used to analyse the behaviour of our algorithms. In the second part, we are interested in applications with sparse multiple right-hand sides, particularly those with single nonzero entries. The motivating applications arise in electromagnetism and data assimilation. In such applications, we need either to compute the null space of a highly rank deficient matrix or to compute entries in the inverse of a matrix associated with the normal equations of linear least-squares problems. We cast both of these problems as linear systems with multiple right-hand side vectors, each containing a single nonzero entry. We describe, implement and comment on efficient algorithms to reduce the input-output cost during an outof- core execution. We show how the sparsity of the right-hand side can be exploited to limit both the number of operations and the amount of data accessed. The work presented in this thesis has been partially supported by SOLSTICE ANR project (ANR-06-CIS6-010)