19 research outputs found
Asymptotically Optimal Approximation Algorithms for Coflow Scheduling
Many modern datacenter applications involve large-scale computations composed
of multiple data flows that need to be completed over a shared set of
distributed resources. Such a computation completes when all of its flows
complete. A useful abstraction for modeling such scenarios is a {\em coflow},
which is a collection of flows (e.g., tasks, packets, data transmissions) that
all share the same performance goal.
In this paper, we present the first approximation algorithms for scheduling
coflows over general network topologies with the objective of minimizing total
weighted completion time. We consider two different models for coflows based on
the nature of individual flows: circuits, and packets. We design
constant-factor polynomial-time approximation algorithms for scheduling
packet-based coflows with or without given flow paths, and circuit-based
coflows with given flow paths. Furthermore, we give an -approximation polynomial time algorithm for scheduling circuit-based
coflows where flow paths are not given (here is the number of network
edges).
We obtain our results by developing a general framework for coflow schedules,
based on interval-indexed linear programs, which may extend to other coflow
models and objective functions and may also yield improved approximation bounds
for specific network scenarios. We also present an experimental evaluation of
our approach for circuit-based coflows that show a performance improvement of
at least 22% on average over competing heuristics.Comment: Fixed minor typo
Integrality Gap of Time-Indexed Linear Programming Relaxation for Coflow Scheduling
Coflow is a set of related parallel data flows in a network. The goal of the coflow scheduling is to process all the demands of the given coflows while minimizing the weighted completion time. It is known that the coflow scheduling problem admits several polynomial-time 5-approximation algorithms that compute solutions by rounding linear programming (LP) relaxations of the problem. In this paper, we investigate the time-indexed LP relaxation for coflow scheduling. We show that the integrality gap of the time-indexed LP relaxation is at most 4. We also show that yet another polynomial-time 5-approximation algorithm can be obtained by rounding the solutions to the time-indexed LP relaxation
Matroid Coflow Scheduling
We consider the matroid coflow scheduling problem, where each job is comprised of a set of flows and the family of sets that can be scheduled at any time form a matroid. Our main result is a polynomial-time algorithm that yields a 2-approximation for the objective of minimizing the weighted completion time. This result is tight assuming P != NP. As a by-product we also obtain the first (2+epsilon)-approximation algorithm for the preemptive concurrent open shop scheduling problem
Recommended from our members
Resource Allocation In Large-Scale Distributed Systems
The focus of this dissertation is design and analysis of scheduling algorithms for distributed computer systems, i.e., data centers. Today’s data centers can contain thousands of servers and typically use a multi-tier switch network to provide connectivity among the servers. Data centers are the host for execution of various data-parallel applications. As an abstraction, a job in a data center can be thought of as a group of interdependent tasks, each with various requirements which need to be scheduled for execution on the servers and the data flows between the tasks that need to be scheduled in the switch network. In this thesis, we study both flow and task scheduling problems under the features of modern parallel computing frameworks.For the flow scheduling problem, we study three models.
The first model considers a general network topology where flows among the various source-destination pairs of servers are generated dynamically over time. The goal is to assign the end-to-end data flows among the available paths in order to efficiently balance the load in the network. We propose a myopic algorithm that is computationally efficient and prove that it asymptotically minimizes the total network cost using a convex optimization model, fluid limit and Lyapunov analysis. We further propose randomized versions of our myopic algorithm.
The second model consider the case that there is dependence among flows. Specifically, a coflow is defined as a collection of parallel flows whose completion time is determined by the completion time of the last flow in the collection. Our main result is a 5-approximation deterministic algorithm that schedule coflows in polynomial time so as to minimize the total weighted completion times. The key ingredient of our approach is an improved linear program formulation for sorting the coflows followed by a simple list scheduling policy.
Lastly, we study scheduling coflows of multi-stage jobs to minimize the jobs’ total weighted completion times. Each job is represented by a DAG (Directed Acyclic Graph) among its coflows that captures the dependencies among the coflows. We define g(m) = log(m)/log(log(m)) and h(m, μ) = log(mμ)/(log(log(mμ)), where m is number of servers, μ is the maximum number of coflows in a job. We develop two algorithms with approximation ratios O(√μg(m)) and O(√μg(m)h(m, μ)) for jobs with general DAGs and rooted trees, respectively. The algorithms rely on random delaying and merging optimal schedules of the coflows in the jobs’ DAG, followed by enforcing dependency among coflows and the links’ capacity constraints.
For the task scheduling problem, we study two models. We consider a setting where each job consists of a set of parallel tasks that need to be processed on different servers, and the job is completed once all its tasks finish processing. In the first model, each job is associated with a utility which is a decreasing function of its completion time. The objective is to schedule tasks in a way that achieves max-min fairness for jobs’ utilities. We first show a strong result regarding NP-hardness of this problem. We then proceed to define two notions of approximation solutions and develop scheduling algorithms that provide guarantees under these approximation notions, using dynamic programming and random perturbation of tasks’ processing times. In the second model, we further assume that processing times of tasks can be server dependent and a server can process (pack) multiple tasks at the same time subject to its capacity. We then propose three algorithms with approximation ratios of 4, (6 + ε), and 24 for different cases where preemption and migration of tasks among the servers are or are not allowed. Our algorithms use a combination of linear program relaxation and greedy packing techniques.
To demonstrate the gains in practice, we evaluate all the proposed algorithms and compare their performances with the prior approaches through extensive simulations using real and synthesized traffic traces. We hope this work inspires improvements to existing job management and scheduling in distributed computer systems
Scalable Verification of GNN-based Job Schedulers
Recently, Graph Neural Networks (GNNs) have been applied for scheduling jobs
over clusters, achieving better performance than hand-crafted heuristics.
Despite their impressive performance, concerns remain over whether these
GNN-based job schedulers meet users' expectations about other important
properties, such as strategy-proofness, sharing incentive, and stability. In
this work, we consider formal verification of GNN-based job schedulers. We
address several domain-specific challenges such as networks that are deeper and
specifications that are richer than those encountered when verifying image and
NLP classifiers. We develop vegas, the first general framework for verifying
both single-step and multi-step properties of these schedulers based on
carefully designed algorithms that combine abstractions, refinements, solvers,
and proof transfer. Our experimental results show that vegas achieves
significant speed-up when verifying important properties of a state-of-the-art
GNN-based scheduler compared to previous methods.Comment: Condensed version published at OOPSLA'2