7 research outputs found

    Non-clairvoyant Scheduling of Multiple Bag-of-Tasks Applications

    The bag-of-tasks application model, albeit simple, arises in many application domains and has received a lot of attention in the scheduling literature. Previous works propose either theoretically sound solutions that rely on unrealistic assumptions, or ad-hoc heuristics with no performance guarantees. This work attempts to bridge this gap through the design of non-clairvoyant heuristics based on solid theoretical foundations. The performance achieved by these heuristics is studied via simulations, with a view to comparing them both to previously proposed solutions and to theoretical upper bounds on achievable performance. An interesting theoretical result of this work is that a straightforward on-demand heuristic delivers asymptotically optimal performance when either the communications or the computations can be neglected.

    A user-centric execution environment for CineGrid workloads

    The abundance and heterogeneity of available IT resources, together with the ability to dynamically scale applications, pose significant usability issues to users. Without understanding the performance profile of the available resources, users are unable to scale their applications efficiently to meet performance objectives. High-quality media collaborations, like CineGrid, are one example of such diverse environments, where users can leverage dynamic infrastructures to move and process large amounts of data. This paper describes our user-centric approach to executing high-quality media processing workloads over dynamic infrastructures. Our main contribution is the CGtoolkit environment, an integrated system that helps users cope with the infrastructure complexity and the large data sets specific to the digital cinema domain.

    Fair scheduling of bag-of-tasks applications using distributed Lagrangian optimization

    Large-scale distributed systems typically comprise hundreds to millions of entities (applications, users, companies, universities) that have only a partial view of resources (computers, communication links). How to share such resources fairly and efficiently between entities in a distributed way has thus become a critical question. Although not all applications are suitable for execution on large-scale distributed computing platforms, Bag-of-Tasks (BoT) applications are ideal candidates. Hence a large fraction of the jobs in workloads imposed on Grids consists of sequential applications submitted in the form of BoTs. Up until now, mainly simple mechanisms have been used to ensure a fair sharing of resources among these applications. Although these mechanisms have proved efficient for CPU-bound applications, they are known to be ineffective in the presence of network-bound applications. A possible answer relies on Lagrangian optimization and distributed gradient descent. Under certain conditions, the resource sharing problem can be formulated as a global optimization problem, which can be solved by a distributed, self-stabilizing supply-and-demand algorithm. In the last decade, this technique has been applied to design various network protocols (variants of TCP, multi-path network protocols, wireless network protocols) and even distributed algorithms for smart grids. In this article, we explain how to use this technique to fairly schedule concurrent BoT applications with arbitrary communication-to-computation ratios on a Grid. Yet application heterogeneity raises severe convergence and stability issues that did not appear in the previous contexts and need to be addressed by non-trivial modifications. The effectiveness of our proposal is assessed through an extensive set of complex and realistic simulations.
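
The supply-and-demand idea mentioned in this abstract can be illustrated with a textbook dual gradient iteration. The sketch below is not the authors' algorithm: it assumes a logarithmic (proportionally fair) utility per application, a hypothetical `usage` matrix giving how much of each resource one unit of throughput consumes, and a fixed step size. It deliberately omits the modifications the abstract says are needed for convergence and stability under heterogeneity.

```python
def dual_gradient_shares(usage, capacity, step=0.01, iters=5000):
    """Toy price-based sharing (assumed model, not the paper's method).

    usage[a][r]: amount of resource r consumed per unit of throughput of app a.
    capacity[r]: capacity of resource r.
    Returns (rates, prices) after `iters` iterations.
    """
    n_apps, n_res = len(usage), len(capacity)
    prices = [1.0] * n_res
    rates = [0.0] * n_apps
    for _ in range(iters):
        # Demand side: each app maximizes log(x) - x * cost on its own,
        # which gives x = 1 / cost, where cost is the priced resource usage.
        rates = [
            1.0 / max(sum(p * u for p, u in zip(prices, row)), 1e-9)
            for row in usage
        ]
        # Supply side: each resource raises its price when overloaded and
        # lowers it when underused (projected gradient step on the dual).
        for r in range(n_res):
            load = sum(rates[a] * usage[a][r] for a in range(n_apps))
            prices[r] = max(0.0, prices[r] + step * (load - capacity[r]))
    return rates, prices


# Two identical apps competing for one resource of capacity 10.
rates, prices = dual_gradient_shares([[1.0], [1.0]], [10.0])
```

Each application only needs the prices of the resources it uses, and each resource only needs its own load, which is what makes this style of iteration distributable. The step size here was chosen by hand for the toy instance; a poorly chosen step can oscillate or diverge.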

    Resource-Constrained Scheduling of Stochastic Tasks With Unknown Probability Distribution

    This work introduces scheduling strategies to maximize the expected number of independent tasks that can be executed on a cloud platform within a given budget and under a deadline constraint. Task execution times are not known before execution; instead, the only information available to the scheduler is that they obey some (unknown) probability distribution. The scheduler needs to acquire some information before deciding on a cutting threshold: instead of allowing all tasks to run until completion, one may want to interrupt long-running tasks at some point. The scheduler may interrupt a running task at any time and launch a new one, but the budget already spent on the interrupted task is lost. In addition, the cutting threshold may be reevaluated as new information is acquired when the execution progresses further. This work presents several strategies to determine a good cutting threshold, and to decide when to re-evaluate it. In particular, we use the Kaplan-Meier estimator to account for tasks that are still running when making a decision. The efficiency of our strategies is assessed through an extensive set of simulations with various budget and deadline values, and ranging over 14 probability distributions.
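
To illustrate why the Kaplan-Meier estimator suits this setting, here is a minimal sketch of the standard product-limit estimate, in which tasks that are still running are treated as right-censored observations. The function name and input layout are assumptions made for illustration. The abstract does not say how the authors turn the estimate into a cutting threshold, so that step is not shown.

```python
from collections import Counter


def kaplan_meier(durations, completed):
    """Product-limit estimate of S(t) = P(execution time > t).

    durations[i]: time observed so far for task i.
    completed[i]: True if task i finished at durations[i],
                  False if it is still running (right-censored).
    Returns a list of (t, S(t)) at each distinct completion time.
    """
    finished_at = Counter(t for t, done in zip(durations, completed) if done)
    observed_at = Counter(durations)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(observed_at):
        d = finished_at.get(t, 0)
        if d:
            # Fraction of the tasks still under observation that survive past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Finished and censored tasks observed for exactly t leave the risk set.
        at_risk -= observed_at[t]
    return curve


# Five tasks: three finished (at 2, 3 and 5), two still running (at 3 and 8).
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

Ignoring the running tasks would bias the estimate toward short execution times. Treating them as censored keeps them in the risk set up to the time they have run so far.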

    A Framework for Approximate Optimization of BoT Application Deployment in Hybrid Cloud Environment

    We adopt a systematic approach to investigate the efficiency of near-optimal deployment of large-scale, CPU-intensive Bag-of-Tasks (BoT) applications running on cloud resources with non-proportional cost-to-performance ratios. Our analytical solutions apply whether the running time of the given application is known or unknown. They try to optimize the user's utility by choosing the most desirable trade-off between the makespan and the total incurred expense. We propose a schema that provides a near-optimal deployment of a BoT application according to the user's preferences. Our approach is to provide the user with a set of Pareto-optimal solutions, from which she may select one of the possible scheduling points based on her internal utility function. Our framework can also cope with uncertainty in the tasks' execution times, using two methods. First, we present an estimation method based on Monte Carlo sampling, called the AA algorithm, which uses the minimum possible number of samples to predict the average task running time. Second, assuming access to a code analyzer, code profiling, or estimation tools, we present a hybrid method that evaluates the accuracy of each estimation tool over certain time intervals in order to improve resource allocation decisions. We propose approximate deployment strategies that run on a hybrid cloud. In essence, the proposed strategies first determine either an estimated or an exact optimal schema based on the information provided by the user and on environmental parameters. Then, we exploit dynamic methods to assign tasks to resources so as to come as close as possible to the optimal schema, using two methods: a fast yet simple method based on the First Fit Decreasing algorithm, and a more complex approach based on an approximate solution of the problem transformed into a subset sum problem. Extensive experimental results obtained on a hybrid cloud platform confirm that our framework can deliver a near-optimal solution respecting the user's utility function.
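
For readers unfamiliar with First Fit Decreasing, the following is a minimal sketch of the classic heuristic. The abstract does not describe how the framework adapts it. The sketch assumes that each "bin" stands for one resource with a fixed time budget (`capacity`) and that each item is a task running time; both are hypothetical mappings chosen for illustration.

```python
def first_fit_decreasing(task_times, capacity):
    """Pack tasks into as few equal-capacity bins as the heuristic finds.

    task_times: iterable of task running times (assumed known or estimated).
    capacity: time budget of one bin (assumed: one resource's time slot).
    Returns a list of bins, each a list of task times.
    """
    bins = []   # tasks assigned to each bin
    loads = []  # current total time in each bin
    for t in sorted(task_times, reverse=True):  # longest tasks first
        if t > capacity:
            raise ValueError(f"task of length {t} exceeds capacity {capacity}")
        for i, load in enumerate(loads):
            if load + t <= capacity:  # first bin with enough room
                bins[i].append(t)
                loads[i] += t
                break
        else:
            # No existing bin fits: open a new one.
            bins.append([t])
            loads.append(t)
    return bins


packing = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
# packing == [[8, 2], [4, 4, 1, 1]]
```

Sorting in decreasing order places the hardest-to-fit tasks first, which is what makes the heuristic both fast and usually close to the optimal number of bins.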