2,703 research outputs found

    Universality of Load Balancing Schemes on Diffusion Scale

    Full text link
    We consider a system of NN parallel queues with identical exponential service rates and a single dispatcher where tasks arrive as a Poisson process. When a task arrives, the dispatcher always assigns it to an idle server, if there is any, and to a server with the shortest queue among dd randomly selected servers otherwise (1dN)(1 \leq d \leq N). This load balancing scheme subsumes the so-called Join-the-Idle Queue (JIQ) policy (d=1)(d = 1) and the celebrated Join-the-Shortest Queue (JSQ) policy (d=N)(d = N) as two crucial special cases. We develop a stochastic coupling construction to obtain the diffusion limit of the queue process in the Halfin-Whitt heavy-traffic regime, and establish that it does not depend on the value of dd, implying that assigning tasks to idle servers is sufficient for diffusion level optimality

    Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System

    Get PDF
    We consider the model of a token-based joint auto-scaling and load balancing strategy, proposed in a recent paper by Mukherjee, Dhara, Borst, and van Leeuwaarden (SIGMETRICS '17, arXiv:1703.08373), which offers an efficient scalable implementation and yet achieves asymptotically optimal steady-state delay performance and energy consumption as the number of servers NN\to\infty. In the above work, the asymptotic results are obtained under the assumption that the queues have fixed-size finite buffers, and therefore the fundamental question of stability of the proposed scheme with infinite buffers was left open. In this paper, we address this fundamental stability question. The system stability under the usual subcritical load assumption is not automatic. Moreover, the stability may not even hold for all NN. The key challenge stems from the fact that the process lacks monotonicity, which has been the powerful primary tool for establishing stability in load balancing models. We develop a novel method to prove that the subcritically loaded system is stable for large enough NN, and establish convergence of steady-state distributions to the optimal one, as NN \to \infty. The method goes beyond the state of the art techniques -- it uses an induction-based idea and a "weak monotonicity" property of the model; this technique is of independent interest and may have broader applicability.Comment: 30 page

    Asymptotically Optimal Load Balancing Topologies

    Full text link
    We consider a system of NN servers inter-connected by some underlying graph topology GNG_N. Tasks arrive at the various servers as independent Poisson processes of rate λ\lambda. Each incoming task is irrevocably assigned to whichever server has the smallest number of tasks among the one where it appears and its neighbors in GNG_N. Tasks have unit-mean exponential service times and leave the system upon service completion. The above model has been extensively investigated in the case GNG_N is a clique. Since the servers are exchangeable in that case, the queue length process is quite tractable, and it has been proved that for any λ<1\lambda < 1, the fraction of servers with two or more tasks vanishes in the limit as NN \to \infty. For an arbitrary graph GNG_N, the lack of exchangeability severely complicates the analysis, and the queue length process tends to be worse than for a clique. Accordingly, a graph GNG_N is said to be NN-optimal or N\sqrt{N}-optimal when the occupancy process on GNG_N is equivalent to that on a clique on an NN-scale or N\sqrt{N}-scale, respectively. We prove that if GNG_N is an Erd\H{o}s-R\'enyi random graph with average degree d(N)d(N), then it is with high probability NN-optimal and N\sqrt{N}-optimal if d(N)d(N) \to \infty and d(N)/(Nlog(N))d(N) / (\sqrt{N} \log(N)) \to \infty as NN \to \infty, respectively. This demonstrates that optimality can be maintained at NN-scale and N\sqrt{N}-scale while reducing the number of connections by nearly a factor NN and N/log(N)\sqrt{N} / \log(N) compared to a clique, provided the topology is suitably random. It is further shown that if GNG_N contains Θ(N)\Theta(N) bounded-degree nodes, then it cannot be NN-optimal. In addition, we establish that an arbitrary graph GNG_N is NN-optimal when its minimum degree is No(N)N - o(N), and may not be NN-optimal even when its minimum degree is cN+o(N)c N + o(N) for any 0<c<1/20 < c < 1/2.Comment: A few relevant results from arXiv:1612.00723 are included for convenienc

    Asymptotic optimality of power-of-dd load balancing in large-scale systems

    Get PDF
    We consider a system of NN identical server pools and a single dispatcher where tasks arrive as a Poisson process of rate λ(N)\lambda(N). Arriving tasks cannot be queued, and must immediately be assigned to one of the server pools to start execution, or discarded. The execution times are assumed to be exponentially distributed with unit mean, and do not depend on the number of other tasks receiving service. However, the experienced performance (e.g. in terms of received throughput) does degrade with an increasing number of concurrent tasks at the same server pool. The dispatcher therefore aims to evenly distribute the tasks across the various server pools. Specifically, when a task arrives, the dispatcher assigns it to the server pool with the minimum number of tasks among d(N)d(N) randomly selected server pools. This assignment strategy is called the JSQ(d(N))(d(N)) scheme, as it resembles the power-of-dd version of the Join-the-Shortest-Queue (JSQ) policy, and will also be referred to as such in the special case d(N)=Nd(N) = N. We construct a stochastic coupling to bound the difference in the system occupancy processes between the JSQ policy and a scheme with an arbitrary value of d(N)d(N). We use the coupling to derive the fluid limit in case d(N)d(N) \to \infty and λ(N)/Nλ\lambda(N)/N \to \lambda as NN \to \infty, along with the associated fixed point. The fluid limit turns out to be insensitive to the exact growth rate of d(N)d(N), and coincides with that for the JSQ policy. We further leverage the coupling to establish that the diffusion limit corresponds to that for the JSQ policy as well, as long as d(N)/Nlog(N)d(N)/\sqrt{N} \log(N) \to \infty, and characterize the common limiting diffusion process. These results indicate that the JSQ optimality can be preserved at the fluid-level and diffusion-level while reducing the overhead by nearly a factor O(NN) and O(N/log(N)\sqrt{N}/\log(N)), respectively

    Asymptotically optimal load balancing in large-scale heterogeneous systems with multiple dispatchers

    Get PDF
    We consider the load balancing problem in large-scale heterogeneous systems with multiple dispatchers. We introduce a general framework called Local-Estimation-Driven (LED). Under this framework, each dispatcher keeps local (possibly outdated) estimates of the queue lengths for all the servers, and the dispatching decision is made purely based on these local estimates. The local estimates are updated via infrequent communications between dispatchers and servers. We derive sufficient conditions for LED policies to achieve throughput optimality and delay optimality in heavy-traffic, respectively. These conditions directly imply delay optimality for many previous local-memory based policies in heavy traffic. Moreover, the results enable us to design new delay optimal policies for heterogeneous systems with multiple dispatchers. Finally, the heavy-traffic delay optimality of the LED framework also sheds light on a recent open question on how to design optimal load balancing schemes using delayed information

    Large-scale Join-Idle-Queue system with general service times

    Get PDF
    A parallel server system with nn identical servers is considered. The service time distribution has a finite mean 1/μ1/\mu, but otherwise is arbitrary. Arriving customers are be routed to one of the servers immediately upon arrival. Join-Idle-Queue routing algorithm is studied, under which an arriving customer is sent to an idle server, if such is available, and to a randomly uniformly chosen server, otherwise. We consider the asymptotic regime where nn\to\infty and the customer input flow rate is λn\lambda n. Under the condition λ/μ<1/2\lambda/\mu<1/2, we prove that, as nn\to\infty, the sequence of (appropriately scaled) stationary distributions concentrates at the natural equilibrium point, with the fraction of occupied servers being constant equal λ/μ\lambda/\mu. In particular, this implies that the steady-state probability of an arriving customer waiting for service vanishes.Comment: Revision. 11 page

    Hyper-Scalable JSQ with Sparse Feedback

    Full text link
    Load balancing algorithms play a vital role in enhancing performance in data centers and cloud networks. Due to the massive size of these systems, scalability challenges, and especially the communication overhead associated with load balancing mechanisms, have emerged as major concerns. Motivated by these issues, we introduce and analyze a novel class of load balancing schemes where the various servers provide occasional queue updates to guide the load assignment. We show that the proposed schemes strongly outperform JSQ(dd) strategies with comparable communication overhead per job, and can achieve a vanishing waiting time in the many-server limit with just one message per job, just like the popular JIQ scheme. The proposed schemes are particularly geared however towards the sparse feedback regime with less than one message per job, where they outperform corresponding sparsified JIQ versions. We investigate fluid limits for synchronous updates as well as asynchronous exponential update intervals. The fixed point of the fluid limit is identified in the latter case, and used to derive the queue length distribution. We also demonstrate that in the ultra-low feedback regime the mean stationary waiting time tends to a constant in the synchronous case, but grows without bound in the asynchronous case
    corecore