28 research outputs found

Weighted Scheduling of Time-Sensitive Coflows

Author: Brun Olivier
De Pellegrini Francesco
El-Azouzi Rachid
Luu Quang-Trung
Prabhu Balakrishna J.
Richier Cédric
Publication date: 22/10/2023

Datacenter networks commonly facilitate the transmission of data in distributed computing frameworks through coflows, which are collections of parallel flows associated with a common task. Most existing research has concentrated on scheduling coflows to minimize the time required for their completion, i.e., to optimize the average dispatch rate of coflows in the network fabric. Nevertheless, modern applications often produce coflows that are specifically intended for online services and mission-crucial computational tasks, necessitating adherence to specific deadlines for their completion. In this paper, we introduce WDCoflow, a new algorithm to maximize the weighted number of coflows that complete before their deadline. By combining a dynamic programming algorithm with parallel inequalities, our heuristic solution performs coflow admission control and coflow prioritization at once, imposing a $\sigma$-order on the set of coflows. With extensive simulation, we demonstrate the effectiveness of our algorithm, which allows up to $3\times$ more coflows to meet their deadline in comparison with the best state-of-the-art (SoA) solution, namely $\mathtt{CS\text{-}MHA}$. Furthermore, when weights are used to differentiate coflow classes, WDCoflow is able to improve the admission per class up to $4\times$, while increasing the average weighted coflow admission rate.
Comment: Submitted to IEEE Transactions on Cloud Computing. Parts of this work have been presented at IFIP Networking 202

arXiv.org e-Print Archive
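
The objective named in the abstract above, the weighted number of coflows that complete before their deadline, can be made concrete with a minimal sketch. This is not the WDCoflow algorithm; it only evaluates the objective for a schedule that is already given. The `Coflow` class, its fields, and all numbers are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Coflow:
    name: str
    weight: float        # importance of the coflow (hypothetical value)
    deadline: float      # time by which every flow must have finished
    finish_times: list   # finish time of each constituent flow


def completion_time(c):
    # A coflow completes only when its slowest flow completes.
    return max(c.finish_times)


def weighted_on_time(coflows):
    # Sum the weights of the coflows whose completion time meets the deadline.
    return sum(c.weight for c in coflows if completion_time(c) <= c.deadline)


coflows = [
    Coflow("A", weight=2.0, deadline=5.0, finish_times=[3.0, 4.5]),
    Coflow("B", weight=1.0, deadline=4.0, finish_times=[2.0, 6.0]),
    Coflow("C", weight=3.0, deadline=8.0, finish_times=[8.0]),
]

print(weighted_on_time(coflows))  # 5.0 (A and C are on time, B is late)
```

Coflow B misses its deadline because one of its two flows finishes late, even though the other finishes early; this all-or-nothing behavior is what distinguishes coflow deadlines from per-flow deadlines.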

Asymptotically Optimal Approximation Algorithms for Coflow Scheduling

Author: Ahuja R. K.
Al-Fares M.
Al-Fares Mohammad
Peis B.
Zaharia M.
Zhao Y.
Publication date: 08/03/2018

Many modern datacenter applications involve large-scale computations composed of multiple data flows that need to be completed over a shared set of distributed resources. Such a computation completes when all of its flows complete. A useful abstraction for modeling such scenarios is a coflow, which is a collection of flows (e.g., tasks, packets, data transmissions) that all share the same performance goal. In this paper, we present the first approximation algorithms for scheduling coflows over general network topologies with the objective of minimizing total weighted completion time. We consider two different models for coflows based on the nature of individual flows: circuits and packets. We design constant-factor polynomial-time approximation algorithms for scheduling packet-based coflows with or without given flow paths, and circuit-based coflows with given flow paths. Furthermore, we give an $O(\log n/\log \log n)$-approximation polynomial-time algorithm for scheduling circuit-based coflows where flow paths are not given (here $n$ is the number of network edges). We obtain our results by developing a general framework for coflow schedules, based on interval-indexed linear programs, which may extend to other coflow models and objective functions and may also yield improved approximation bounds for specific network scenarios. We also present an experimental evaluation of our approach for circuit-based coflows that shows a performance improvement of at least 22% on average over competing heuristics.
Comment: Fixed minor typo

arXiv.org e-Print Archive

Crossref
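
The objective in the entry above, total weighted completion time, can be illustrated with a toy model. The sketch below assumes a single unit-capacity link that serves coflows one after another, which is far simpler than the general topologies the paper treats, and it does not reproduce the paper's LP-based algorithms. The function name, sizes, and weights are hypothetical.

```python
def total_weighted_completion_time(order, size, weight):
    # Serve coflows back to back on one unit-capacity link (toy assumption).
    clock, total = 0.0, 0.0
    for name in order:
        clock += size[name]            # coflow is done once all its data is sent
        total += weight[name] * clock  # weight times completion time
    return total


size = {"A": 4.0, "B": 1.0, "C": 2.0}    # hypothetical data volumes
weight = {"A": 1.0, "B": 3.0, "C": 1.0}  # hypothetical weights

arrival_order = ["A", "B", "C"]
# Smith's rule: decreasing weight/size ratio, optimal on a single machine.
smith_order = sorted(size, key=lambda n: weight[n] / size[n], reverse=True)

print(total_weighted_completion_time(arrival_order, size, weight))  # 26.0
print(total_weighted_completion_time(smith_order, size, weight))    # 13.0
```

On a single link the ordering alone halves the objective in this example. In a real fabric the flows of one coflow contend on many ports at once, so a simple ratio rule is no longer sufficient.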

ON SCHEDULING AND COMMUNICATION ISSUES IN DATA CENTERS

Author: Yang Sheng
Publication date: 01/01/2020

The proliferation of datacenters to handle the rapidly growing amount of data being managed in the cloud necessitates the design, management and effective utilization of the thousands of machines that constitute a data center. Many modern big data applications require access to a large number of machines and datasets for training neural nets or for other big data processing. In this thesis, we present research challenges and progress along two fronts. The first challenge addresses the need to schedule communication between machines in a much more effective manner, as several running applications compete for network bandwidth. We address a basic question known as coflow scheduling to optimize the weighted average completion time of tasks that are running across different machines in a datacenter and to effectively handle their communication needs. Sometimes, we are forced to distribute a task among multiple datacenters due to cost or legal reasons. For this case, we also study a related model that addresses communication needs of tasks that process data on multiple data centers and handles communication requirements of such tasks across a wide area network with possibly widely varying bandwidth and network structures across different pairs of machines. The second challenge is from a cloud user's perspective: since access to resources such as those provided by Amazon AWS can be expensive at scale, cloud computing providers often sell underutilized resources at a significant discount via a spot instance market. However, these instances are not dedicated, and while they offer a cheaper alternative, there is a chance that the user's job will be interrupted to make room for higher-priority tasks. Certain non-critical applications are not significantly impacted by delays due to interruptions, and we develop an initial framework to study some basic scheduling questions under this circumstance.
In all of these topics, the problems we study are NP-hard and our focus is on developing good approximation algorithms. In addition, while we attack these problems from a theoretical perspective, all the algorithms developed in this thesis are practical and efficient, and can be easily deployed in practice; some are already deployed

Digital Repository at the University of Maryland

Scheduling Coflows for Minimizing the Total Weighted Completion Time in Heterogeneous Parallel Networks

Author: Chen Chi-Yeh
Publication date: 16/04/2022

Coflow is a network abstraction used to represent communication patterns in data centers. The coflow scheduling problem in large data centers is one of the most important $NP$-hard problems. Many previous studies on coflow scheduling mainly focus on the single-core model. However, with the growth of data centers, this single-core model is no longer sufficient. This paper considers the coflow scheduling problem in heterogeneous parallel networks. The heterogeneous parallel network is an architecture based on multiple network cores running in parallel. In this paper, two polynomial-time approximation algorithms are developed for scheduling divisible and indivisible coflows in heterogeneous parallel networks, respectively. Both algorithms achieve an approximation ratio of $O(\log m/ \log \log m)$ with arbitrary release times.
Comment: arXiv admin note: text overlap with arXiv:2204.0265

arXiv.org e-Print Archive

Scheduling Coflows for Minimizing the Total Weighted Completion Time in Identical Parallel Networks

Author: Chen Chi-Yeh
Publication date: 06/04/2022

Coflow is a recently proposed network abstraction to capture communication patterns in data centers. The coflow scheduling problem in large data centers is one of the most important $NP$-hard problems. Previous research on coflow scheduling focused mainly on the single-switch model. However, with recent technological developments, this single-core model is no longer sufficient. This paper considers the coflow scheduling problem in identical parallel networks. The identical parallel network is an architecture based on multiple network cores running in parallel. A coflow can be considered divisible or indivisible. Different flows in a divisible coflow can be transmitted through different network cores. Considering the divisible coflow scheduling problem, we propose a $(6-\frac{2}{m})$-approximation algorithm with arbitrary release times, and a $(5-\frac{2}{m})$-approximation without release times, where $m$ is the number of network cores. On the other hand, when the coflow is indivisible, we propose a $(7-\frac{2}{m})$-approximation algorithm with arbitrary release times, and a $(6-\frac{2}{m})$-approximation without release times.

arXiv.org e-Print Archive

Matroid Coflow Scheduling

Author: Im Sungjin
Moseley Benjamin
Pruhs Kirk
Purohit Manish
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)
Publication date: 01/01/2019

We consider the matroid coflow scheduling problem, where each job is comprised of a set of flows and the family of sets that can be scheduled at any time forms a matroid. Our main result is a polynomial-time algorithm that yields a 2-approximation for the objective of minimizing the weighted completion time. This result is tight assuming P != NP. As a by-product we also obtain the first (2+epsilon)-approximation algorithm for the preemptive concurrent open shop scheduling problem.

Dagstuhl Research Online Publication Server