54,990 research outputs found
UDRF: Multi-resource Fairness for Complex Jobs with Placement Constraints
In this paper, we study the problem of multi- resource fairness in systems running complex jobs that consist of multiple interconnected tasks. A job is considered finished when all its corresponding tasks have been executed in the system. Tasks can have different resource requirements. Because of special demands on particular hardware or software, tasks may have placement constraints limiting the type of machines they can run on. We develop User-Dependence Dominant Resource Fairness (UDRF), a generalized version of max-min fairness that combines graph theory and the notion of dominant re- source shares to ensure multi-resource fairness between complex workflows. UDRF satisfies several desirable properties including strategy proofness, which ensures that users do not benefit from misreporting their true resource demands. We propose an offline algorithm that computes optimal UDRF allocation. But optimality comes at a cost, especially for systems where schedulers need to make thousands of online scheduling decisions per second. Therefore, we develop a lightweight online algorithm that closely approximates UDRF. Besides that, we propose a simple mechanism to decentralize the UDRF scheduling process across multiple schedulers. Large-scale simulations driven by Google cluster-usage traces show that UDRF achieves better resource utilization and throughput compared to the current state-of-the-art in fair resource allocation
SELFISHMIGRATE: A Scalable Algorithm for Non-clairvoyantly Scheduling Heterogeneous Processors
We consider the classical problem of minimizing the total weighted flow-time
for unrelated machines in the online \emph{non-clairvoyant} setting. In this
problem, a set of jobs arrive over time to be scheduled on a set of
machines. Each job has processing length , weight , and is
processed at a rate of when scheduled on machine . The online
scheduler knows the values of and upon arrival of the job,
but is not aware of the quantity . We present the {\em first} online
algorithm that is {\em scalable} ((1+\eps)-speed
-competitive for any constant \eps > 0) for the
total weighted flow-time objective. No non-trivial results were known for this
setting, except for the most basic case of identical machines. Our result
resolves a major open problem in online scheduling theory. Moreover, we also
show that no job needs more than a logarithmic number of migrations. We further
extend our result and give a scalable algorithm for the objective of minimizing
total weighted flow-time plus energy cost for the case of unrelated machines
and obtain a scalable algorithm. The key algorithmic idea is to let jobs
migrate selfishly until they converge to an equilibrium. Towards this end, we
define a game where each job's utility which is closely tied to the
instantaneous increase in the objective the job is responsible for, and each
machine declares a policy that assigns priorities to jobs based on when they
migrate to it, and the execution speeds. This has a spirit similar to
coordination mechanisms that attempt to achieve near optimum welfare in the
presence of selfish agents (jobs). To the best our knowledge, this is the first
work that demonstrates the usefulness of ideas from coordination mechanisms and
Nash equilibria for designing and analyzing online algorithms
Metascheduling of HPC Jobs in Day-Ahead Electricity Markets
High performance grid computing is a key enabler of large scale collaborative
computational science. With the promise of exascale computing, high performance
grid systems are expected to incur electricity bills that grow super-linearly
over time. In order to achieve cost effectiveness in these systems, it is
essential for the scheduling algorithms to exploit electricity price
variations, both in space and time, that are prevalent in the dynamic
electricity price markets. In this paper, we present a metascheduling algorithm
to optimize the placement of jobs in a compute grid which consumes electricity
from the day-ahead wholesale market. We formulate the scheduling problem as a
Minimum Cost Maximum Flow problem and leverage queue waiting time and
electricity price predictions to accurately estimate the cost of job execution
at a system. Using trace based simulation with real and synthetic workload
traces, and real electricity price data sets, we demonstrate our approach on
two currently operational grids, XSEDE and NorduGrid. Our experimental setup
collectively constitute more than 433K processors spread across 58 compute
systems in 17 geographically distributed locations. Experiments show that our
approach simultaneously optimizes the total electricity cost and the average
response time of the grid, without being unfair to users of the local batch
systems.Comment: Appears in IEEE Transactions on Parallel and Distributed System
- β¦