16 research outputs found

    Parallel Balanced Allocations: The Heavily Loaded Case

    Get PDF
    We study parallel algorithms for the classical balls-into-bins problem, in which mm balls acting in parallel as separate agents are placed into nn bins. Algorithms operate in synchronous rounds, in each of which balls and bins exchange messages once. The goal is to minimize the maximal load over all bins using a small number of rounds and few messages. While the case of m=nm=n balls has been extensively studied, little is known about the heavily loaded case. In this work, we consider parallel algorithms for this somewhat neglected regime of m≫nm\gg n. The naive solution of allocating each ball to a bin chosen uniformly and independently at random results in maximal load m/n+Θ(m/n⋅log⁥n)m/n+\Theta(\sqrt{m/n\cdot \log n}) (for m≄nlog⁥nm\geq n \log n) w.h.p. In contrast, for the sequential setting Berenbrink et al (SIAM J. Comput 2006) showed that letting each ball join the least loaded bin of two randomly selected bins reduces the maximal load to m/n+O(log⁥log⁥m)m/n+O(\log\log m) w.h.p. To date, no parallel variant of such a result is known. We present a simple parallel threshold algorithm that obtains a maximal load of m/n+O(1)m/n+O(1) w.h.p. within O(log⁥log⁥(m/n)+log⁡∗n)O(\log\log (m/n)+\log^* n) rounds. The algorithm is symmetric (balls and bins all "look the same"), and balls send O(1)O(1) messages in expectation per round. The additive term of O(log⁡∗n)O(\log^* n) in the complexity is known to be tight for such algorithms (Lenzen and Wattenhofer Distributed Computing 2016). We also prove that our analysis is tight, i.e., algorithms of the type we provide must run for Ω(min⁥{log⁥log⁥(m/n),n})\Omega(\min\{\log\log (m/n),n\}) rounds w.h.p. Finally, we give a simple asymmetric algorithm (i.e., balls are aware of a common labeling of the bins) that achieves a maximal load of m/n+O(1)m/n + O(1) in a constant number of rounds w.h.p. Again, balls send only a single message per round, and bins receive (1+o(1))m/n+O(log⁥n)(1+o(1))m/n+O(\log n) messages w.h.p

    Parallel Load Balancing on Constrained Client-Server Topologies

    Get PDF
    We study parallel \emph{Load Balancing} protocols for a client-server distributed model defined as follows. There is a set \sC of nn clients and a set \sS of nn servers where each client has (at most) a constant number d≄1d \geq 1 of requests that must be assigned to some server. The client set and the server one are connected to each other via a fixed bipartite graph: the requests of client vv can only be sent to the servers in its neighborhood N(v)N(v). The goal is to assign every client request so as to minimize the maximum load of the servers. In this setting, efficient parallel protocols are available only for dense topolgies. In particular, a simple symmetric, non-adaptive protocol achieving constant maximum load has been recently introduced by Becchetti et al \cite{BCNPT18} for regular dense bipartite graphs. The parallel completion time is \bigO(\log n) and the overall work is \bigO(n), w.h.p. Motivated by proximity constraints arising in some client-server systems, we devise a simple variant of Becchetti et al's protocol \cite{BCNPT18} and we analyse it over almost-regular bipartite graphs where nodes may have neighborhoods of small size. In detail, we prove that, w.h.p., this new version has a cost equivalent to that of Becchetti et al's protocol (in terms of maximum load, completion time, and work complexity, respectively) on every almost-regular bipartite graph with degree Ω(log⁥2n)\Omega(\log^2n). Our analysis significantly departs from that in \cite{BCNPT18} for the original protocol and requires to cope with non-trivial stochastic-dependence issues on the random choices of the algorithmic process which are due to the worst-case, sparse topology of the underlying graph

    An Improved Drift Theorem for Balanced Allocations

    Full text link
    In the balanced allocations framework, there are mm jobs (balls) to be allocated to nn servers (bins). The goal is to minimize the gap, the difference between the maximum and the average load. Peres, Talwar and Wieder (RSA 2015) used the hyperbolic cosine potential function to analyze a large family of allocation processes including the (1+ÎČ)(1+\beta)-process and graphical balanced allocations. The key ingredient was to prove that the potential drops in every step, i.e., a drift inequality. In this work we improve the drift inequality so that (i) it is asymptotically tighter, (ii) it assumes weaker preconditions, (iii) it applies not only to processes allocating to more than one bin in a single step and (iv) to processes allocating a varying number of balls depending on the sampled bin. Our applications include the processes of (RSA 2015), but also several new processes, and we believe that our techniques may lead to further results in future work.Comment: This paper refines and extends the content on the drift theorem and applications in arXiv:2203.13902. It consists of 38 pages, 7 figures, 1 tabl

    Threshold Load Balancing with Weighted Tasks

    Get PDF
    We study threshold-based load balancing protocols for weighted tasks. We are given an arbitrary graph G with n nodes (resources, bins) and m > n tasks (balls). Initially the tasks are distributed arbitrarily over the n nodes. The resources have a threshold and we are interested in the balancing time, i.e., the time it takes until the load of all resources is below the threshold. We distinguish between resource-based and user based protocols. In the case of resource-based protocols resources with a load larger than the threshold are allowed to send tasks to neighbouring resources. In the case of user-based protocols tasks allocated to resources with a load above the threshold decide on their own whether to migrate to a neighbouring resource or not. For resource-controlled protocols we present results for arbitrary graphs. Our bounds are in terms of the mixing time (for above-average thresholds) and the hitting time (for tight thresholds) of the graph. We relate the balancing time of resource-controlled protocols for above-average thresholds in arbitrary graphs to the mixing time of the graph and to the hitting time for tight thresholds. Our bounds are tight and, surprisingly, they are independent of the weights of the tasks. For the user-controlled migration we consider complete graphs and derive bounds for both above-average and tight thresholds

    Self-stabilizing Balls & Bins in Batches: The Power of Leaky Bins

    Get PDF
    A fundamental problem in distributed computing is the distribution of requests to a set of uniform servers without a centralized controller. Classically, such problems are modelled as static balls into bins processes, where m balls (tasks) are to be distributed to n bins (servers). In a seminal work, [Azar et al.; JoC'99] proposed the sequential strategy Greedy[d] for n = m. When thrown, a ball queries the load of d random bins and is allocated to a least loaded of these. [Azar et al.; JoC'99] showed that d=2 yields an exponential improvement compared to d=1. [Berenbrink et al.; JoC'06] extended this to m ⇒ n, showing that the maximal load difference is independent of m for d=2 (in contrast to d=1). We propose a new variant of an infinite balls into bins process. In each round an expected number of λ n new balls arrive and are distributed (in parallel) to the bins and each non-empty bin deletes one of its balls. This setting models a set of servers processing incoming requests, where clients can query a server's current load but receive no information about parallel requests. We study the Greedy[d] distribution scheme in this setting and show a strong self-stabilizing property: For any arrival rate λ=λ(n) < 1, the system load is time-invariant. Moreover, for any (even super-exponential) round t, the maximum system load is (w.h.p.) O(1 over 1-λ‹logn over 1-λ) for d=1 and O(log n over 1-λ) for d=2. In particular, Greedy[2] has an exponentially smaller system load for high arrival rates

    Tight bounds for parallel randomized load balancing

    Get PDF
    Given a distributed system of n balls and n bins, how evenly can we distribute the balls to the bins, minimizing communication? The fastest non-adaptive and symmetric algorithm achieving a constant maximum bin load requires Θ(loglogn) rounds, and any such algorithm running for r∈O(1) rounds incurs a bin load of Ω((logn/loglogn)1/r). In this work, we explore the fundamental limits of the general problem. We present a simple adaptive symmetric algorithm that achieves a bin load of 2 in log∗n+O(1) communication rounds using O(n) messages in total. Our main result, however, is a matching lower bound of (1−o(1))log∗n on the time complexity of symmetric algorithms that guarantee small bin loads. The essential preconditions of the proof are (i) a limit of O(n) on the total number of messages sent by the algorithm and (ii) anonymity of bins, i.e., the port numberings of balls need not be globally consistent. In order to show that our technique yields indeed tight bounds, we provide for each assumption an algorithm violating it, in turn achieving a constant maximum bin load in constant time.German Research Foundation (DFG, reference number Le 3107/1-1)Society of Swiss Friends of the Weizmann Institute of ScienceSwiss National Fun

    Parallel Load Balancing on constrained client-server topologies

    Get PDF
    We study parallel Load Balancing protocols for the client-server distributed model defined as follows. There is a set of n clients and a set of n servers where each client has (at most) a constant number of requests that must be assigned to some server. The client set and the server one are connected to each other via a fixed bipartite graph: the requests of client v can only be sent to the servers in its neighborhood. The goal is to assign every client request so as to minimize the maximum load of the servers. In this setting, efficient parallel protocols are available only for dense topologies. In particular, a simple protocol, named raes, has been recently introduced by Becchetti et al. [1] for regular dense bipartite graphs. They show that this symmetric, non-adaptive protocol achieves constant maximum load with parallel completion time and overall work, w.h.p. Motivated by proximity constraints arising in some client-server systems, we analyze raes over almost-regular bipartite graphs where nodes may have neighborhoods of small size. In detail, we prove that, w.h.p., the raes protocol keeps the same performances as above (in terms of maximum load, completion time, and work complexity, respectively) on any almost-regular bipartite graph with degree. Our analysis significantly departs from that in [1] since it requires to cope with non-trivial stochastic-dependence issues on the random choices of the algorithmic process which are due to the worst-case, sparse topology of the underlying graph
    corecore