16 research outputs found
Parallel Balanced Allocations: The Heavily Loaded Case
We study parallel algorithms for the classical balls-into-bins problem, in
which balls acting in parallel as separate agents are placed into bins.
Algorithms operate in synchronous rounds, in each of which balls and bins
exchange messages once. The goal is to minimize the maximal load over all bins
using a small number of rounds and few messages.
While the case of m = n balls has been extensively studied, little is known
about the heavily loaded case. In this work, we consider parallel algorithms
for this somewhat neglected regime of m ≫ n. The naive solution of
allocating each ball to a bin chosen uniformly and independently at random
results in maximal load m/n + Θ(√((m/n) log n)) (for m ≥ n log n) w.h.p. In contrast, for the sequential setting Berenbrink et al. (SIAM J.
Comput 2006) showed that letting each ball join the least loaded bin of two
randomly selected bins reduces the maximal load to w.h.p.
To date, no parallel variant of such a result is known.
We present a simple parallel threshold algorithm that obtains a maximal load
of w.h.p. within rounds. The algorithm
is symmetric (balls and bins all "look the same"), and balls send
messages in expectation per round. The additive term of log* n in the
complexity is known to be tight for such algorithms (Lenzen and Wattenhofer
Distributed Computing 2016). We also prove that our analysis is tight, i.e.,
algorithms of the type we provide must run for rounds w.h.p.
Finally, we give a simple asymmetric algorithm (i.e., balls are aware of a
common labeling of the bins) that achieves a maximal load of in a
constant number of rounds w.h.p. Again, balls send only a single message per
round, and bins receive messages w.h.p.
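The contrast drawn above between the naive single-choice allocation and the sequential two-choice rule can be observed empirically. The following is a minimal Python sketch of those two sequential baselines only; it is not the parallel threshold algorithm of the paper. The function names and the parameters n = 100, m = 10,000 are illustrative assumptions.

```python
import random

def one_choice(m, n, rng):
    """Place each of m balls into a bin chosen uniformly at random."""
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1
    return loads

def two_choice(m, n, rng):
    """Sequentially place each ball into the less loaded of two random bins."""
    loads = [0] * n
    for _ in range(m):
        i, j = rng.randrange(n), rng.randrange(n)
        loads[i if loads[i] <= loads[j] else j] += 1
    return loads

def gap(loads):
    """Difference between the maximum load and the average load m/n."""
    return max(loads) - sum(loads) / len(loads)

rng = random.Random(0)   # fixed seed (assumption) for reproducibility
n, m = 100, 10_000       # heavily loaded regime: m much larger than n
loads_one = one_choice(m, n, rng)
loads_two = two_choice(m, n, rng)
print("one-choice gap:", gap(loads_one))
print("two-choice gap:", gap(loads_two))
```

In runs of this kind the two-choice gap is typically a small constant, while the one-choice gap grows with m/n.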
Parallel Load Balancing on Constrained Client-Server Topologies
We study parallel Load Balancing protocols for a client-server
distributed model defined as follows.
There is a set C of clients and a set S of servers where each
client has (at most) a constant number of requests that must be assigned to
some server. The client set and the server set are connected to each other via
a fixed bipartite graph: the requests of client v can only be sent to the
servers in its neighborhood. The goal is to assign every client request
so as to minimize the maximum load of the servers.
In this setting, efficient parallel protocols are available only for dense
topologies. In particular, a simple symmetric, non-adaptive protocol achieving
constant maximum load has been recently introduced by Becchetti et al.
[BCNPT18] for regular dense bipartite graphs. The parallel completion time
is O(log n) and the overall work is O(n), w.h.p.
Motivated by proximity constraints arising in some client-server systems, we
devise a simple variant of Becchetti et al.'s protocol [BCNPT18] and we
analyse it over almost-regular bipartite graphs where nodes may have
neighborhoods of small size. In detail, we prove that, w.h.p., this new version
has a cost equivalent to that of Becchetti et al.'s protocol (in terms of
maximum load, completion time, and work complexity, respectively) on every
almost-regular bipartite graph with degree .
Our analysis significantly departs from that in [BCNPT18] for the
original protocol and requires coping with non-trivial stochastic-dependence
issues in the random choices of the algorithmic process, which are due to the
worst-case, sparse topology of the underlying graph.
An Improved Drift Theorem for Balanced Allocations
In the balanced allocations framework, there are m jobs (balls) to be
allocated to n servers (bins). The goal is to minimize the gap, the
difference between the maximum and the average load.
Peres, Talwar and Wieder (RSA 2015) used the hyperbolic cosine potential
function to analyze a large family of allocation processes including the
(1+β)-process and graphical balanced allocations. The key ingredient was
to prove that the potential drops in every step, i.e., a drift inequality.
In this work we improve the drift inequality so that (i) it is asymptotically
tighter, (ii) it assumes weaker preconditions, (iii) it applies to
processes allocating to more than one bin in a single step, and (iv) it applies to
processes allocating a varying number of balls depending on the sampled bin.
Our applications include the processes of (RSA 2015), but also several new
processes, and we believe that our techniques may lead to further results in
future work.
Comment: This paper refines and extends the content on the drift theorem and
applications in arXiv:2203.13902. It consists of 38 pages, 7 figures, 1 table.
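The gap and the hyperbolic cosine potential mentioned in this abstract can be made concrete with a small simulation. The sketch below assumes the usual form of the potential, the sum over bins of exp(α·y_i) + exp(−α·y_i), where y_i is the load of bin i minus the average load, and uses the (1+β)-process of Peres, Talwar and Wieder as the example process (two choices with probability β, one choice otherwise). The function names and the values of α, β, n and m are illustrative assumptions, not taken from the paper.

```python
import math
import random

def one_plus_beta_step(loads, beta, rng):
    """Allocate one ball: two-choice with probability beta, else one-choice."""
    n = len(loads)
    i = rng.randrange(n)
    if rng.random() < beta:
        j = rng.randrange(n)
        if loads[j] < loads[i]:
            i = j
    loads[i] += 1

def gap(loads):
    """Maximum load minus average load."""
    return max(loads) - sum(loads) / len(loads)

def cosh_potential(loads, alpha):
    """Sum over bins of exp(alpha*y) + exp(-alpha*y), y = load - average."""
    avg = sum(loads) / len(loads)
    return sum(2.0 * math.cosh(alpha * (x - avg)) for x in loads)

rng = random.Random(1)                 # fixed seed (assumption)
n, m, beta, alpha = 50, 5_000, 0.5, 0.1
loads = [0] * n
for _ in range(m):
    one_plus_beta_step(loads, beta, rng)
print("gap:", gap(loads), "potential:", cosh_potential(loads, alpha))
```

A perfectly balanced load vector attains the minimum potential 2n, so a bounded potential translates into a bounded gap.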
Threshold Load Balancing with Weighted Tasks
We study threshold-based load balancing protocols for weighted tasks. We are given an arbitrary graph G with n nodes (resources, bins) and m > n tasks (balls). Initially the tasks are distributed arbitrarily over the n nodes. The resources have a threshold and we are interested in the balancing time, i.e., the time it takes until the load of all resources is below the threshold. We distinguish between resource-based and user-based protocols. In the case of resource-based protocols, resources with a load larger than the threshold are allowed to send tasks to neighbouring resources. In the case of user-based protocols, tasks allocated to resources with a load above the threshold decide on their own whether to migrate to a neighbouring resource or not. For resource-controlled protocols we present results for arbitrary graphs: we relate the balancing time to the mixing time of the graph for above-average thresholds and to its hitting time for tight thresholds. Our bounds are tight and, surprisingly, they are independent of the weights of the tasks. For the user-controlled migration we consider complete graphs and derive bounds for both above-average and tight thresholds.
Self-stabilizing Balls & Bins in Batches: The Power of Leaky Bins
A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modelled as static balls into bins processes, where m balls (tasks) are to be distributed to n bins (servers). In a seminal work, [Azar et al.; JoC'99] proposed the sequential strategy Greedy[d] for n = m. When thrown, a ball queries the load of d random bins and is allocated to a least loaded of these. [Azar et al.; JoC'99] showed that d=2 yields an exponential improvement compared to d=1. [Berenbrink et al.; JoC'06] extended this to m ≫ n, showing that the maximal load difference is independent of m for d=2 (in contrast to d=1). We propose a new variant of an infinite balls into bins process. In each round an expected number of λn new balls arrive and are distributed (in parallel) to the bins and each non-empty bin deletes one of its balls. This setting models a set of servers processing incoming requests, where clients can query a server's current load but receive no information about parallel requests. We study the Greedy[d] distribution scheme in this setting and show a strong self-stabilizing property: For any arrival rate λ=λ(n) < 1, the system load is time-invariant. Moreover, for any (even super-exponential) round t, the maximum system load is (w.h.p.) O(1/(1-λ) · log(n/(1-λ))) for d=1 and O(log(n/(1-λ))) for d=2. In particular, Greedy[2] has an exponentially smaller system load for high arrival rates.
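The batched process with leaky bins described above can be sketched as a short simulation. This is a minimal sketch under stated assumptions: the number of arrivals per round is drawn as a Binomial(n, λ) variable (one possible way to obtain λn arrivals in expectation; the abstract does not fix the distribution), balls in the same round see only the loads at the start of the round, and deletion happens after allocation. The names `leaky_round` and `simulate` are hypothetical.

```python
import random

def leaky_round(loads, lam, d, rng):
    """One round: batched Greedy[d] arrivals, then each non-empty bin deletes a ball."""
    n = len(loads)
    snapshot = list(loads)  # balls see no information about parallel requests
    arrivals = sum(1 for _ in range(n) if rng.random() < lam)  # assumed Binomial(n, lam)
    for _ in range(arrivals):
        choices = [rng.randrange(n) for _ in range(d)]
        target = min(choices, key=lambda b: snapshot[b])
        loads[target] += 1
    for b in range(n):
        if loads[b] > 0:
            loads[b] -= 1

def simulate(n, lam, d, rounds, seed=0):
    """Run the process from empty bins and return the final load vector."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(rounds):
        leaky_round(loads, lam, d, rng)
    return loads

if __name__ == "__main__":
    for d in (1, 2):
        final = simulate(n=200, lam=0.9, d=d, rounds=2_000)
        print(f"d={d}: max load {max(final)}, total load {sum(final)}")
```

Comparing the printed maximum loads for d = 1 and d = 2 at arrival rates close to 1 gives a rough feel for the difference the abstract describes.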
Tight bounds for parallel randomized load balancing
Given a distributed system of n balls and n bins, how evenly can we distribute the balls to the bins, minimizing communication? The fastest non-adaptive and symmetric algorithm achieving a constant maximum bin load requires Θ(log log n) rounds, and any such algorithm running for r ∈ O(1) rounds incurs a bin load of Ω((log n / log log n)^(1/r)). In this work, we explore the fundamental limits of the general problem. We present a simple adaptive symmetric algorithm that achieves a bin load of 2 in log* n + O(1) communication rounds using O(n) messages in total. Our main result, however, is a matching lower bound of (1 − o(1)) log* n on the time complexity of symmetric algorithms that guarantee small bin loads. The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls need not be globally consistent. In order to show that our technique indeed yields tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time.
German Research Foundation (DFG, reference number Le 3107/1-1); Society of Swiss Friends of the Weizmann Institute of Science; Swiss National Fun
Parallel Load Balancing on constrained client-server topologies
We study parallel Load Balancing protocols for the client-server distributed model defined as follows. There is a set of n clients and a set of n servers where each client has (at most) a constant number of requests that must be assigned to some server. The client set and the server set are connected to each other via a fixed bipartite graph: the requests of client v can only be sent to the servers in its neighborhood. The goal is to assign every client request so as to minimize the maximum load of the servers.
In this setting, efficient parallel protocols are available only for dense topologies. In particular, a simple protocol, named raes, has been recently introduced by Becchetti et al. [1] for regular dense bipartite graphs. They show that this symmetric, non-adaptive protocol achieves constant maximum load with parallel completion time O(log n) and overall work O(n), w.h.p.
Motivated by proximity constraints arising in some client-server systems, we analyze raes over almost-regular bipartite graphs where nodes may have neighborhoods of small size. In detail, we prove that, w.h.p., the raes protocol retains the same performance as above (in terms of maximum load, completion time, and work complexity, respectively) on any almost-regular bipartite graph with degree.
Our analysis significantly departs from that in [1] since it requires coping with non-trivial stochastic-dependence issues in the random choices of the algorithmic process, which are due to the worst-case, sparse topology of the underlying graph.