
4 research outputs found

Determining the optimal redistribution

Author: Julien Herrmann
Thomas Hérault
Loris Marchal
Yves Robert
Publication venue: HAL CCSD
Publication date: 18/03/2014

The classical redistribution problem aims at optimally scheduling communications when moving from an initial data distribution $\mathcal{D}_{ini}$ to a target distribution $\mathcal{D}_{tar}$, where each processor $P_i$ will host a subset $P(i)$ of data items. However, modern computing platforms are equipped with a powerful interconnection switch, and the cost of a given communication is (almost) independent of the location of its sender and receiver. This leads to generalizing the redistribution problem as follows: find the optimal permutation $\sigma$ of processors such that $P_i$ will host the set $P(\sigma(i))$, and for which the cost of the redistribution is minimal. This report studies the complexity of this generalized problem. We provide optimal algorithms and evaluate their gain over classical redistribution through simulations. We also show the NP-hardness of the problem of finding the optimal data partition and processor permutation (defined by new subsets $P(\sigma(i))$) that minimize the cost of redistribution followed by a simple computation kernel.

HAL-ENS-LYON

INRIA a CCSD electronic archive server

Hal-Diderot

Assessing the cost of redistribution followed by a computational kernel: Complexity and performance results

Author: George Bosilca
Jack Dongarra
Julien Herrmann
Loris Marchal
Thomas Hérault
Yves Robert
Publication venue: Elsevier BV

Crossref

Recommended from our members

Resource Allocation In Large-Scale Distributed Systems

Author: Shafiee Mehrnoosh
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2021

The focus of this dissertation is the design and analysis of scheduling algorithms for distributed computer systems, i.e., data centers. Today’s data centers can contain thousands of servers and typically use a multi-tier switch network to provide connectivity among the servers. Data centers host the execution of various data-parallel applications. As an abstraction, a job in a data center can be thought of as a group of interdependent tasks, each with various requirements, that need to be scheduled for execution on the servers, together with the data flows between the tasks, which need to be scheduled in the switch network. In this thesis, we study both flow and task scheduling problems under the features of modern parallel computing frameworks.
For the flow scheduling problem, we study three models. The first model considers a general network topology where flows among the various source-destination pairs of servers are generated dynamically over time. The goal is to assign the end-to-end data flows among the available paths in order to efficiently balance the load in the network. We propose a myopic algorithm that is computationally efficient and prove that it asymptotically minimizes the total network cost using a convex optimization model, fluid limit and Lyapunov analysis. We further propose randomized versions of our myopic algorithm. The second model considers the case where there is dependence among flows. Specifically, a coflow is defined as a collection of parallel flows whose completion time is determined by the completion time of the last flow in the collection. Our main result is a deterministic 5-approximation algorithm that schedules coflows in polynomial time so as to minimize the total weighted completion time. The key ingredient of our approach is an improved linear program formulation for sorting the coflows, followed by a simple list scheduling policy.
Lastly, we study scheduling coflows of multi-stage jobs to minimize the jobs’ total weighted completion time. Each job is represented by a DAG (Directed Acyclic Graph) among its coflows that captures the dependencies among the coflows. We define g(m) = log(m)/log(log(m)) and h(m, μ) = log(mμ)/log(log(mμ)), where m is the number of servers and μ is the maximum number of coflows in a job. We develop two algorithms with approximation ratios O(√μ g(m)) and O(√μ g(m) h(m, μ)) for jobs with general DAGs and rooted trees, respectively. The algorithms rely on random delaying and merging of optimal schedules of the coflows in the jobs’ DAG, followed by enforcing dependency among coflows and the links’ capacity constraints.
For the task scheduling problem, we study two models. We consider a setting where each job consists of a set of parallel tasks that need to be processed on different servers, and the job is completed once all its tasks finish processing. In the first model, each job is associated with a utility which is a decreasing function of its completion time. The objective is to schedule tasks in a way that achieves max-min fairness for jobs’ utilities. We first show a strong result regarding NP-hardness of this problem. We then define two notions of approximate solutions and develop scheduling algorithms that provide guarantees under these notions, using dynamic programming and random perturbation of tasks’ processing times. In the second model, we further assume that processing times of tasks can be server dependent and that a server can process (pack) multiple tasks at the same time, subject to its capacity. We then propose three algorithms with approximation ratios of 4, (6 + ε), and 24 for different cases where preemption and migration of tasks among the servers are or are not allowed. Our algorithms use a combination of linear program relaxation and greedy packing techniques.
To demonstrate the gains in practice, we evaluate all the proposed algorithms and compare their performance with prior approaches through extensive simulations using real and synthesized traffic traces. We hope this work inspires improvements to existing job management and scheduling in distributed computer systems.
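
The coflow objective and the quantities g(m) and h(m, μ) defined in the abstract can be written out as a short sketch. The function names, the sample data, and the choice of natural logarithms are assumptions made for illustration (the abstract does not state the log base), and this evaluates a given schedule rather than implementing any of the dissertation's algorithms.

```python
import math


def coflow_completion_time(flow_finish_times):
    """A coflow completes when its last flow completes."""
    return max(flow_finish_times)


def total_weighted_completion_time(coflows):
    """Sum of weight * completion time over (weight, flow_finish_times) pairs."""
    return sum(w * coflow_completion_time(finish) for w, finish in coflows)


def g(m):
    """g(m) = log(m) / log(log(m)); natural log assumed, needs m > e."""
    return math.log(m) / math.log(math.log(m))


def h(m, mu):
    """h(m, mu) = log(m * mu) / log(log(m * mu)); needs m * mu > e."""
    return math.log(m * mu) / math.log(math.log(m * mu))


# Hypothetical schedule: two coflows given as (weight, finish times of their flows).
coflows = [(2.0, [1.0, 3.0, 2.0]), (1.0, [4.0, 5.0])]
objective = total_weighted_completion_time(coflows)  # 2.0 * 3.0 + 1.0 * 5.0
```

Note that h(m, μ) is g evaluated at the product mμ, so the second approximation ratio grows with both the number of servers and the number of coflows per job.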

Columbia University Academic Commons