Randomized load balancing in finite regimes
Randomized load balancing is a cost-efficient policy for job scheduling in parallel server queueing systems whereby, with every incoming job, a central dispatcher randomly polls some servers and selects the one with the smallest queue. By deriving the jobs' delay distribution in such systems exactly, in explicit and closed form, Mitzenmacher~\cite{Mi03} proved the so-called `power-of-two' result, which states that randomly polling only two servers yields an exponential improvement in delay over randomly selecting a single server. This fundamental result, however, was obtained in a regime that is asymptotic in the total number of servers, and does not necessarily provide accurate estimates for practical finite regimes with a small or moderate number of servers. In this paper we obtain stochastic lower and upper bounds on the jobs' average delay in non-asymptotic/finite regimes, by borrowing ideas from the analysis of the Join-the-Shortest-Queue (JSQ) policy. Numerical illustrations indicate not only that the obtained (lower) bounds are remarkably accurate, but also that the existing exact but asymptotic results can be largely misleading in finite regimes.
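As a concrete illustration of the polling step described above, here is a minimal Python sketch of power-of-d dispatching; function and variable names are illustrative, not taken from the paper:

```python
import random

def power_of_d_dispatch(queues, d, rng=random):
    """Poll d servers chosen uniformly at random and join the one with
    the shortest queue (d=1 is purely random routing; d=2 gives the
    'power-of-two' improvement in the large-system limit)."""
    polled = rng.sample(range(len(queues)), d)
    target = min(polled, key=lambda i: queues[i])
    queues[target] += 1
    return target

queues = [0] * 10
for _ in range(100):
    power_of_d_dispatch(queues, d=2)
assert sum(queues) == 100   # every job was placed somewhere
```

Setting `d` equal to the number of servers recovers full queue inspection, which is the JSQ special case the paper borrows its analysis from.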
Steady-State Analysis of Load Balancing with Coxian-2 Distributed Service Times
This paper studies load balancing for many-server systems. Each server has a finite buffer and can have at most one job in service, with the remaining jobs waiting in the buffer. The service time of a job follows the Coxian-2 distribution. We focus on the steady-state performance of load balancing policies in a heavy-traffic regime in which the normalized load of the system approaches one as the number of servers grows. We identify a set of policies that achieve asymptotically zero waiting. The set includes several classical policies such as join-the-shortest-queue (JSQ), join-the-idle-queue (JIQ), idle-one-first (I1F), and power-of-d-choices (Po-d) for suitable d. The proof of the main result is based on Stein's method and state space collapse. A key technical contribution of this paper is the iterative state space collapse approach, which leads to a simple generator approximation when applying Stein's method.
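For readers unfamiliar with the service-time model, a Coxian-2 random variable is an exponential first phase followed, with some probability, by an independent exponential second phase. A minimal sampling sketch, with illustrative parameter names:

```python
import random

def sample_coxian2(mu1, mu2, p, rng=random):
    """Sample a Coxian-2 service time: an Exp(mu1) first phase,
    followed with probability p by an independent Exp(mu2) second
    phase.  The mean service time is 1/mu1 + p/mu2."""
    t = rng.expovariate(mu1)
    if rng.random() < p:
        t += rng.expovariate(mu2)
    return t
```

With p = 0 this reduces to the exponential distribution, so the class strictly generalizes the memoryless service times assumed in much of the load-balancing literature.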
Load Balancing in the Non-Degenerate Slowdown Regime
We analyse Join-the-Shortest-Queue in a contemporary scaling regime known as
the Non-Degenerate Slowdown regime. Join-the-Shortest-Queue (JSQ) is a
classical load balancing policy for queueing systems with multiple parallel
servers. Parallel server queueing systems are regularly analysed and
dimensioned by diffusion approximations achieved in the Halfin-Whitt scaling
regime. However, when jobs must be dispatched to a server upon arrival, we
advocate the Non-Degenerate Slowdown regime (NDS) to compare different
load-balancing rules.
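The JSQ rule itself is simple to state; a minimal sketch, assuming full queue-length information at the dispatcher:

```python
def jsq_dispatch(queues):
    """Join-the-Shortest-Queue: inspect every server and send the
    arriving job to the queue with the fewest jobs.  Requires full
    state information, unlike its low-information proxies."""
    target = min(range(len(queues)), key=lambda i: queues[i])
    queues[target] += 1
    return target

queues = [3, 1, 2]
assert jsq_dispatch(queues) == 1   # server 1 had the shortest queue
assert queues == [3, 2, 2]
```

The policies discussed below (Idle-Queue-First, power-of-d-choices) approximate this rule while inspecting far less state.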
In this paper we identify a novel diffusion approximation and a timescale separation that provide insights into the performance of JSQ. We calculate the price of irrevocably dispatching jobs to servers and prove it to be within 15% (in the NDS regime) of rules that may manoeuvre jobs between servers. We also compare our results for the JSQ policy with the NDS approximations of many modern load balancing policies, such as Idle-Queue-First and Power-of-d-choices, which act as low-information proxies for the JSQ policy. Our analysis leads us to construct new rules that have identical performance to JSQ but require less communication overhead than power-of-2-choices.
Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System
We consider the model of a token-based joint auto-scaling and load balancing
strategy, proposed in a recent paper by Mukherjee, Dhara, Borst, and van
Leeuwaarden (SIGMETRICS '17, arXiv:1703.08373), which offers an efficient
scalable implementation and yet achieves asymptotically optimal steady-state
delay performance and energy consumption as the number of servers grows large.
In the above work, the asymptotic results are obtained under the assumption
that the queues have fixed-size finite buffers, and therefore the fundamental
question of stability of the proposed scheme with infinite buffers was left
open. In this paper, we address this fundamental stability question. System stability under the usual subcritical load assumption is not automatic; moreover, stability may not even hold for all parameter values. The key challenge stems
from the fact that the process lacks monotonicity, which has been the powerful
primary tool for establishing stability in load balancing models. We develop a
novel method to prove that the subcritically loaded system is stable for a large enough number of servers, and establish convergence of the steady-state distributions to the optimal one as the number of servers grows. The method goes beyond state-of-the-art
techniques -- it uses an induction-based idea and a "weak monotonicity"
property of the model; this technique is of independent interest and may have
broader applicability.
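The token mechanism underlying such schemes can be sketched as follows; this is a plain Join-Idle-Queue illustration under simplifying assumptions, not the paper's joint auto-scaling scheme:

```python
import random
from collections import deque

class JoinIdleQueue:
    """Token-based Join-Idle-Queue sketch (illustrative).  Idle servers
    deposit a token at the dispatcher; an arrival goes to a tokened
    (idle) server when one exists, and to a uniformly random server
    otherwise."""

    def __init__(self, n, rng):
        self.queues = [0] * n
        self.idle_tokens = deque(range(n))  # every server starts idle
        self.rng = rng

    def dispatch(self):
        # Prefer a known-idle server; fall back to random routing.
        if self.idle_tokens:
            i = self.idle_tokens.popleft()
        else:
            i = self.rng.randrange(len(self.queues))
        self.queues[i] += 1
        return i

    def depart(self, i):
        self.queues[i] -= 1
        if self.queues[i] == 0:
            self.idle_tokens.append(i)  # server i announces it is idle
```

The lack of monotonicity mentioned above is visible even in this toy version: adding a job can change which servers hold tokens, so sample-path comparisons between coupled systems break down.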
Delay, memory, and messaging tradeoffs in distributed service systems
We consider the following distributed service model: jobs with unit-mean, exponentially distributed, and independent processing times arrive as a Poisson process whose rate grows linearly with the number of servers, under subcritical load, and are immediately dispatched by a centralized dispatcher to one of the First-In-First-Out queues associated with the identical servers. The dispatcher is endowed with a finite memory and with the ability to exchange messages with the servers.
We propose and study a resource-constrained "pull-based" dispatching policy
that involves two parameters: (i) the number of memory bits available at the
dispatcher, and (ii) the average rate at which servers communicate with the
dispatcher. We establish (using a fluid limit approach) that the asymptotic expected queueing delay, as the number of servers grows, is zero when either (i) the number of memory bits grows logarithmically with the number of servers and the message rate grows superlinearly with it, or (ii) the number of memory bits grows superlogarithmically and the message rate grows at least linearly with the number of servers. Furthermore, when the number of memory bits grows only logarithmically and the message rate is proportional to the number of servers, we obtain a closed-form expression for the (now positive) asymptotic delay.
Finally, we demonstrate an interesting phase transition in the resource-constrained regime where the asymptotic delay is non-zero. In particular, we show that for any given linear coefficient (no matter how small), if our policy only uses a message rate linear in the number of servers, the resulting asymptotic delay is upper bounded, uniformly over all loads; this is in sharp contrast to the delay obtained when no messages are used, which grows without bound as the load approaches its critical value, or when the popular power-of-d-choices policy is used, in which case the delay grows as …
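The interplay between the two resources, memory bits and messages, can be illustrated with a toy dispatcher that keeps a bounded list of idle-server ids reported by the servers; names and structure here are illustrative assumptions, not the paper's policy:

```python
import random

class BoundedMemoryDispatcher:
    """Toy pull-based dispatcher with bounded memory.  Servers send a
    message when they become idle; the dispatcher keeps at most
    `memory_slots` idle-server ids and otherwise routes uniformly at
    random."""

    def __init__(self, n, memory_slots, rng):
        self.queues = [0] * n
        self.memory = []               # remembered idle-server ids
        self.memory_slots = memory_slots
        self.rng = rng

    def report_idle(self, i):
        # A server-to-dispatcher message; kept only if a slot is free.
        if len(self.memory) < self.memory_slots and i not in self.memory:
            self.memory.append(i)

    def dispatch(self):
        # Use a remembered idle server if any, else route at random.
        if self.memory:
            i = self.memory.pop()
        else:
            i = self.rng.randrange(len(self.queues))
        self.queues[i] += 1
        return i
```

Shrinking `memory_slots` or throttling `report_idle` calls mimics the memory and message-rate constraints whose tradeoff the paper quantifies.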