Analysis of randomized join-the-shortest-queue (JSQ) schemes in large heterogeneous processor-sharing systems
In this paper, we investigate the stability and performance
of randomized dynamic routing schemes for jobs based on
the Join-the-Shortest Queue (JSQ) criterion in a heterogeneous
system of many parallel servers. In particular, we consider servers
that use processor sharing but with different server rates, and
jobs are routed to the server with the smallest occupancy among
a finite number of randomly sampled servers. We focus on the
case of two servers that is often referred to as a Power-of-Two
scheme. We first show that in the heterogeneous setting, uniform
sampling of servers can cause a loss in the stability region and thus
such randomized dynamic schemes need not outperform static
randomized schemes in terms of mean delay, in contrast to
the homogeneous case of equal server speeds where the stability
region is maximal and coincides with that of the static randomized
routing. We explicitly characterize the stationary distributions
of the server occupancies and show that the tail distribution
of the server occupancy has a super-exponential behavior as in
the homogeneous case as the number of servers goes to infinity.
To overcome the stability issue, we show that it is possible to
combine the static state-independent scheme with a randomized
JSQ scheme that allows us to recover the maximal stability region
combined with the benefits of JSQ, and such a scheme is preferable
in terms of average delay. The techniques are based on a mean field
analysis where we show that the stationary distributions coincide
with those obtained under asymptotic independence of the servers
and, moreover, the stationary distributions are insensitive to the
job-size distribution.
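
The stability loss under uniform sampling can be illustrated with a back-of-the-envelope bound. The sketch below assumes a hypothetical two-type system (a fraction `frac_fast` of servers with rate `mu_fast`, the rest with rate `mu_slow`), a per-server arrival rate, and two servers sampled uniformly with replacement; none of these names or numbers come from the paper. Whenever both sampled servers are slow, the job must go to a slow server, which yields a necessary condition on the load that can be stricter than the total-capacity condition. It is only a necessary condition, not the paper's exact stability region.

```python
def max_load_total_capacity(frac_fast, mu_fast, mu_slow):
    """Per-server arrival rate at which total service capacity is reached."""
    return frac_fast * mu_fast + (1.0 - frac_fast) * mu_slow


def max_load_power_of_two_bound(frac_fast, mu_fast, mu_slow):
    """Upper bound on the stable per-server arrival rate under uniform
    power-of-two sampling (necessary condition only, assumed setup).

    With probability (1 - frac_fast)**2 both samples are slow, so the slow
    class receives at least lam * (1 - frac_fast)**2 per server in the
    system, against a capacity of (1 - frac_fast) * mu_slow. The same
    argument applies to the fast class.
    """
    frac_slow = 1.0 - frac_fast
    bounds = [max_load_total_capacity(frac_fast, mu_fast, mu_slow)]
    if frac_slow > 0:
        bounds.append(mu_slow / frac_slow)
    if frac_fast > 0:
        bounds.append(mu_fast / frac_fast)
    return min(bounds)
```

For instance, with 20% fast servers of rate 10 and 80% slow servers of rate 1, the total-capacity limit is 2.8 per server while the sampling bound is 1.25, so uniform sampling shrinks the stability region. With equal rates the two values coincide, in line with the homogeneous case described above.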
Hyper-Scalable JSQ with Sparse Feedback
Load balancing algorithms play a vital role in enhancing performance in data
centers and cloud networks. Due to the massive size of these systems,
scalability challenges, and especially the communication overhead associated
with load balancing mechanisms, have emerged as major concerns. Motivated by
these issues, we introduce and analyze a novel class of load balancing schemes
where the various servers provide occasional queue updates to guide the load
assignment.
We show that the proposed schemes strongly outperform JSQ(d) strategies
with comparable communication overhead per job, and can achieve a vanishing
waiting time in the many-server limit with just one message per job, just like
the popular JIQ scheme. The proposed schemes are particularly geared, however,
towards the sparse feedback regime with less than one message per job, where
they outperform corresponding sparsified JIQ versions.
We investigate fluid limits for synchronous updates as well as asynchronous
exponential update intervals. The fixed point of the fluid limit is identified
in the latter case, and used to derive the queue length distribution. We also
demonstrate that in the ultra-low feedback regime the mean stationary waiting
time tends to a constant in the synchronous case, but grows without bound in
the asynchronous case.
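
One plausible reading of "occasional queue updates" is a dispatcher that keeps its own queue-length estimates, sends each job to the server with the lowest estimate, and overwrites an estimate only when that server reports. The sketch below follows that reading; the class name `SparseFeedbackDispatcher` and its methods are hypothetical, and the abstract does not specify the dispatcher's bookkeeping at this level of detail.

```python
class SparseFeedbackDispatcher:
    """Dispatcher that routes on estimated, not actual, queue lengths."""

    def __init__(self, num_servers):
        # Estimated queue length per server; all servers start empty.
        self.estimates = [0] * num_servers

    def assign(self):
        """Send a job to the server with the lowest estimate.

        Ties go to the lowest index. The estimate is incremented because
        the dispatcher knows it has just added a job there.
        """
        server = min(range(len(self.estimates)),
                     key=self.estimates.__getitem__)
        self.estimates[server] += 1
        return server

    def update(self, server, queue_length):
        """Occasional feedback: replace the estimate with the true value."""
        self.estimates[server] = queue_length
```

Between updates the estimates only grow, since departures are invisible to the dispatcher; the update frequency therefore controls how stale the routing information becomes, which is the trade-off the sparse feedback regime is concerned with.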
Load Balancing in the Non-Degenerate Slowdown Regime
We analyse Join-the-Shortest-Queue in a contemporary scaling regime known as
the Non-Degenerate Slowdown regime. Join-the-Shortest-Queue (JSQ) is a
classical load balancing policy for queueing systems with multiple parallel
servers. Parallel server queueing systems are regularly analysed and
dimensioned by diffusion approximations achieved in the Halfin-Whitt scaling
regime. However, when jobs must be dispatched to a server upon arrival, we
advocate the Non-Degenerate Slowdown regime (NDS) to compare different
load-balancing rules.
In this paper we identify a novel diffusion approximation and timescale
separation that provide insights into the performance of JSQ. We calculate the
price of irrevocably dispatching jobs to servers and prove this to be within 15%
(in the NDS regime) of the rules that may manoeuvre jobs between servers. We
also compare our results for the JSQ policy with the NDS approximations of
many modern load balancing policies such as Idle-Queue-First and
Power-of-d-choices policies, which act as low-information proxies for the JSQ
policy. Our analysis leads us to construct new rules that have identical
performance to JSQ but require less communication overhead than
power-of-2-choices. Comment: Revised journal submission version
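
The dispatch rules compared in this abstract differ mainly in how much queue information they consult. A minimal sketch under common textbook definitions is given below; the function names are hypothetical, and the Idle-Queue-First fallback (a uniformly random server when none is idle) is an assumption, not taken from the paper.

```python
import random


def jsq(queues):
    """Join-the-Shortest-Queue: inspect every queue, pick the shortest."""
    return min(range(len(queues)), key=queues.__getitem__)


def power_of_d(queues, d, rng):
    """Sample d distinct servers uniformly and pick the shortest of them."""
    sampled = rng.sample(range(len(queues)), d)
    return min(sampled, key=queues.__getitem__)


def idle_queue_first(queues, rng):
    """Pick an idle server if one exists, else a uniformly random server."""
    idle = [i for i, q in enumerate(queues) if q == 0]
    if idle:
        return rng.choice(idle)
    return rng.randrange(len(queues))
```

Each function returns the index of the chosen server. JSQ reads all queue lengths per job, power-of-d reads only d of them, and Idle-Queue-First needs to know only which servers are idle.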
Randomized Assignment of Jobs to Servers in Heterogeneous Clusters of Shared Servers for Low Delay
We consider the job assignment problem in a multi-server system consisting of
parallel processor sharing servers, categorized into
different types according to their processing capacity or speed. Jobs of random
sizes arrive at the system according to a Poisson process. Upon each arrival, a small number of servers from each type is
sampled uniformly at random. The job is then assigned to one of the sampled
servers based on a selection rule. We propose two schemes, each corresponding
to a specific selection rule that aims at reducing the mean sojourn time of
jobs in the system.
We first show that both methods achieve the maximal stability region. We then
analyze the system operating under the proposed schemes as the number of
servers tends to infinity, which corresponds to the mean field. Our results show
that asymptotic independence among servers holds even when the number of server
types is finite and exchangeability holds only
within servers of the same type. We further establish the existence and
uniqueness of the stationary solution of the mean field and show that the tail
distribution of server occupancy decays doubly exponentially for each server
type. When the estimates of arrival rates are not available, the proposed
schemes offer simpler alternatives to achieving lower mean sojourn time of
jobs, as shown by our numerical studies.
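
The sampling step described in this abstract (a few servers drawn from each type, then a selection rule applied to the sampled set) can be sketched as follows. The selection rule used here, the smallest occupancy after the arrival per unit of capacity, is a hypothetical stand-in and is not necessarily either of the two schemes the paper proposes; the function and parameter names are likewise assumptions.

```python
import random


def assign_job(servers_by_type, capacity, occupancy, d, rng):
    """Sample up to d servers of each type and pick one by a selection rule.

    servers_by_type: dict mapping type name -> list of server ids
    capacity:        dict mapping type name -> service rate of that type
    occupancy:       dict mapping server id -> current number of jobs
    """
    candidates = []
    for server_type, servers in servers_by_type.items():
        # Uniform sampling without replacement within each type.
        for server in rng.sample(servers, min(d, len(servers))):
            candidates.append((server_type, server))

    # Hypothetical rule: minimize (occupancy + 1) / capacity, i.e. the
    # per-job share of the server's rate after the new job joins.
    _, chosen = min(
        candidates,
        key=lambda c: (occupancy[c[1]] + 1) / capacity[c[0]],
    )
    return chosen
```

Sampling from every type guarantees that fast servers are always among the candidates, and the rule above uses only server capacities and sampled occupancies, not the arrival rate.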
Scalable Load Balancing Algorithms in Networked Systems
A fundamental challenge in large-scale networked systems, viz., data centers
and cloud networks, is to distribute tasks to a pool of servers, using minimal
instantaneous state information, while providing excellent delay performance.
In this thesis we design and analyze load balancing algorithms that aim to
achieve a highly efficient distribution of tasks, optimize server utilization,
and minimize communication overhead. Comment: Ph.D. thesis