Delay Performance and Mixing Times in Random-Access Networks
We explore the achievable delay performance in wireless random-access
networks. While relatively simple and inherently distributed in nature,
suitably designed queue-based random-access schemes provide the striking
capability to match the optimal throughput performance of centralized
scheduling mechanisms in a wide range of scenarios. The specific type of
activation rules for which throughput optimality has been established may,
however, yield excessive queues and delays.
Motivated by that issue, we examine whether the poor delay performance is
inherent to the basic operation of these schemes, or caused by the specific
kind of activation rules. We derive delay lower bounds for queue-based
activation rules, which offer fundamental insight into the cause of the excessive
delays. For fixed activation rates we obtain lower bounds indicating that
delays and mixing times can grow dramatically with the load in certain
topologies as well
Scaling limits for infinite-server systems in a random environment
This paper studies the effect of an overdispersed arrival process on the
performance of an infinite-server system. In our setup, a random environment is
modeled by drawing an arrival rate $\Lambda$ from a given distribution every
$\Delta$ time units, yielding an i.i.d. sequence of arrival rates
$\Lambda_1, \Lambda_2, \ldots$. Applying a martingale central limit theorem, we
obtain a functional central limit theorem for the scaled queue length process.
We proceed to large deviations and derive the logarithmic asymptotics of the
queue length's tail probabilities. As it turns out, in a rapidly changing
environment (i.e., $\Delta$ is small relative to $\Lambda$) the overdispersion
of the arrival process hardly affects system behavior, whereas in a slowly
changing random environment it is fundamentally different; this general finding
applies to both the central limit and the large deviations regime. We extend
our results to the setting where each arrival creates a job in multiple
infinite-server queues
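To make the notion of overdispersion concrete, the following is a minimal sketch, not taken from the paper: it counts arrivals per resampling interval when the rate is redrawn for every interval, and compares the sample variance with the sample mean. For a fixed rate the two coincide (Poisson); for a random rate the law of total variance adds a term proportional to the variance of the rate. The function names, the parameter `delta`, and the two-point rate distribution in the usage note are illustrative assumptions.

```python
import random


def poisson(rng, mean):
    """Sample a Poisson variate by counting unit-rate exponential gaps in [0, mean]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def arrivals_per_interval(sample_rate, delta, intervals, seed=0):
    """Draw a fresh arrival rate for each interval of length delta (assumed
    setup) and return the number of arrivals in each interval."""
    rng = random.Random(seed)
    return [poisson(rng, sample_rate(rng) * delta) for _ in range(intervals)]


def mean_and_variance(xs):
    """Return the sample mean and the (population) sample variance of xs."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)


def fixed_rate(rng):
    """Deterministic environment: the rate is always 10."""
    return 10.0


def two_point_rate(rng):
    """Random environment (hypothetical): rate 2 or 18 with equal probability."""
    return rng.choice([2.0, 18.0])
```

Calling `mean_and_variance(arrivals_per_interval(two_point_rate, 1.0, 5000))` should give a variance far above the mean, whereas the same call with `fixed_rate` should give a variance close to the mean.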
Spectral gap of the Erlang A model in the Halfin-Whitt regime
We consider a hybrid diffusion process that is a combination of two
Ornstein-Uhlenbeck processes with different restraining forces. This process
serves as the heavy-traffic approximation to the Markovian many-server queue
with abandonments in the critical Halfin-Whitt regime. We obtain an expression
for the Laplace transform of the time-dependent probability distribution, from
which the spectral gap is explicitly characterized. The spectral gap gives the
exponential rate of convergence to equilibrium. We further give various
asymptotic results for the spectral gap, in the limits of small and large
abandonment effects. It turns out that convergence to equilibrium becomes
extremely slow for overloaded systems with small abandonment effects. Comment: 48 page
Load Balancing in Large-Scale Systems with Multiple Dispatchers
Load balancing algorithms play a crucial role in delivering robust
application performance in data centers and cloud networks. Recently, strong
interest has emerged in Join-the-Idle-Queue (JIQ) algorithms, which rely on
tokens issued by idle servers in dispatching tasks and outperform power-of-$d$
policies. Specifically, JIQ strategies involve minimal information exchange,
and yet achieve zero blocking and wait in the many-server limit. The latter
property prevails in a multiple-dispatcher scenario when the loads are strictly
equal among dispatchers. For various reasons, however, it is not uncommon for
skewed load patterns to occur. We leverage product-form representations and
fluid limits to establish that the blocking and wait then no longer vanish,
even for arbitrarily low overall load. Remarkably, it is the least-loaded
dispatcher that throttles tokens and leaves idle servers stranded, thus acting
as a bottleneck.
Motivated by the above issues, we introduce two enhancements of the ordinary
JIQ scheme where tokens are either distributed non-uniformly or occasionally
exchanged among the various dispatchers. We prove that these extensions can
achieve zero blocking and wait in the many-server limit, for any subcritical
overall load and arbitrarily skewed load profiles. Extensive simulation
experiments demonstrate that the asymptotic results are highly accurate, even
for moderately sized systems
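The token mechanism behind JIQ can be sketched for a single dispatcher as follows. This is a toy illustration under stated assumptions, not the model analyzed in the paper: the class name `JIQDispatcher`, the uniformly random fallback when no token is available, and the absence of blocking are all choices made here for brevity.

```python
import random
from collections import deque


class JIQDispatcher:
    """Toy single-dispatcher JIQ: idle servers leave tokens, jobs follow tokens."""

    def __init__(self, num_servers, rng=None):
        self.queue_lengths = [0] * num_servers
        # Every server starts idle, so each one holds a token at the dispatcher.
        self.tokens = deque(range(num_servers))
        self.rng = rng or random.Random(0)

    def dispatch(self):
        """Assign one job and return the index of the chosen server."""
        if self.tokens:
            # A token identifies a server known to be idle: use it.
            server = self.tokens.popleft()
        else:
            # No token available: fall back to a random server (assumption).
            server = self.rng.randrange(len(self.queue_lengths))
        self.queue_lengths[server] += 1
        return server

    def complete(self, server):
        """Finish one job at `server`; a server that empties issues a token."""
        if self.queue_lengths[server] == 0:
            raise ValueError("server has no job to complete")
        self.queue_lengths[server] -= 1
        if self.queue_lengths[server] == 0:
            self.tokens.append(server)
```

With several dispatchers, each would keep its own token queue, and an idle server would have to choose which dispatcher receives its token. That choice is where the skewed-load issue described above arises.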
Scaling limits for critical inhomogeneous random graphs with finite third moments
We identify the scaling limits for the sizes of the largest components at
criticality for inhomogeneous random graphs when the degree exponent $\tau$
satisfies $\tau > 4$. We see that the sizes of the (rescaled) components converge
to the excursion lengths of an inhomogeneous Brownian motion, extending results
of \cite{Aldo97}. We rely heavily on martingale convergence techniques, and
concentration properties of (super)martingales. This paper is part of a
programme to study the critical behavior in inhomogeneous random graphs of
so-called rank-1 initiated in \cite{Hofs09a}. Comment: Final versio
Hyper-Scalable JSQ with Sparse Feedback
Load balancing algorithms play a vital role in enhancing performance in data
centers and cloud networks. Due to the massive size of these systems,
scalability challenges, and especially the communication overhead associated
with load balancing mechanisms, have emerged as major concerns. Motivated by
these issues, we introduce and analyze a novel class of load balancing schemes
where the various servers provide occasional queue updates to guide the load
assignment.
We show that the proposed schemes strongly outperform JSQ($d$) strategies
with comparable communication overhead per job, and can achieve a vanishing
waiting time in the many-server limit with just one message per job, just like
the popular JIQ scheme. The proposed schemes are, however, particularly geared
towards the sparse feedback regime with less than one message per job, where
they outperform corresponding sparsified JIQ versions.
We investigate fluid limits for synchronous updates as well as asynchronous
exponential update intervals. The fixed point of the fluid limit is identified
in the latter case, and used to derive the queue length distribution. We also
demonstrate that in the ultra-low feedback regime the mean stationary waiting
time tends to a constant in the synchronous case, but grows without bound in
the asynchronous case
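One way to read "occasional queue updates to guide the load assignment" is a dispatcher that keeps a queue-length estimate per server, sends each job to the server with the smallest estimate, and overwrites an estimate whenever that server reports. The sketch below illustrates this reading only; the class `EstimateDispatcher`, its tie-breaking rule, and the update interface are assumptions and may differ from the schemes defined in the paper.

```python
class EstimateDispatcher:
    """Toy dispatcher that balances load using possibly stale queue estimates."""

    def __init__(self, num_servers):
        # Assumed starting point: all servers are believed to be empty.
        self.estimates = [0] * num_servers

    def dispatch(self):
        """Send a job to the server with the smallest estimate (lowest index
        wins ties) and increment that estimate."""
        server = min(range(len(self.estimates)), key=self.estimates.__getitem__)
        self.estimates[server] += 1
        return server

    def update(self, server, queue_length):
        """Process a feedback message: replace the estimate by the true length."""
        self.estimates[server] = queue_length
```

Between updates the estimates only grow, since departures are not observed, so an estimate never understates the true queue length; how often `update` is called corresponds to the feedback rate per job.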
Power series approximations for two-class generalized processor sharing systems
We develop power series approximations for a discrete-time queueing system with two parallel queues and one processor. If both queues are nonempty, a customer of queue 1 is served with probability beta, and a customer of queue 2 is served with probability 1-beta. If one of the queues is empty, a customer of the other queue is served with probability 1. We first describe the generating function U(z_1, z_2) of the stationary queue lengths in terms of a functional equation, and show how to solve this using the theory of boundary value problems. Then, we propose to use the same functional equation to obtain a power series for U(z_1, z_2) in beta. The first coefficient of this power series corresponds to the priority case beta=0, which allows for an explicit solution. All higher coefficients are expressed in terms of the priority case. Accurate approximations for the mean stationary queue lengths are obtained from combining truncated power series and Padé approximation
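The service rule described in this abstract can be written down directly. The sketch below is a simulation aid, not the paper's analytical method: `gps_serve` implements the stated rule, while `simulate` additionally assumes independent Bernoulli arrivals with probabilities `p1` and `p2` per slot and serves before admitting arrivals. Both of those modelling choices, and all names, are assumptions made here.

```python
import random


def gps_serve(q1, q2, beta, rng):
    """Return the queue served in this slot (1 or 2), or None if both are empty."""
    if q1 > 0 and q2 > 0:
        # Both nonempty: queue 1 with probability beta, queue 2 otherwise.
        return 1 if rng.random() < beta else 2
    if q1 > 0:
        return 1
    if q2 > 0:
        return 2
    return None


def simulate(beta, p1, p2, slots, seed=0):
    """Estimate the mean queue lengths under assumed Bernoulli arrivals."""
    rng = random.Random(seed)
    q1 = q2 = 0
    total1 = total2 = 0
    for _ in range(slots):
        served = gps_serve(q1, q2, beta, rng)
        if served == 1:
            q1 -= 1
        elif served == 2:
            q2 -= 1
        # At most one arrival per queue per slot (assumption).
        q1 += int(rng.random() < p1)
        q2 += int(rng.random() < p2)
        total1 += q1
        total2 += q2
    return total1 / slots, total2 / slots
```

Setting `beta=0` gives queue 2 strict priority, which is the case the abstract singles out as explicitly solvable; simulated means for small `beta` could serve as a rough check on a truncated series.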
The snowball effect of customer slowdown in critical many-server systems
Customer slowdown describes the phenomenon that a customer's service
requirement increases with experienced delay. In healthcare settings, there is
substantial empirical evidence for slowdown, particularly when a patient's
delay exceeds a certain threshold. For such threshold slowdown situations, we
design and analyze a many-server system that leads to a two-dimensional Markov
process. Analysis of this system leads to insights into the potentially
detrimental effects of slowdown, especially in heavy-traffic conditions. We
quantify the consequences of underprovisioning due to neglecting slowdown,
demonstrate the presence of a subtle bistable system behavior, and discuss in
detail the snowball effect: A delayed customer has an increased service
requirement, causing longer delays for other customers, who in turn due to
slowdown might require longer service times. Comment: 23 pages, 8 figures -- version 3 fixes a typo in an equation. in
Stochastic Models, 201
Redundancy Scheduling with Locally Stable Compatibility Graphs
Redundancy scheduling is a popular concept to improve performance in
parallel-server systems. In the baseline scenario any job can be handled
equally well by any server, and is replicated to a fixed number of servers
selected uniformly at random. Quite often, however, there may be heterogeneity
in job characteristics or server capabilities, and jobs can only be replicated
to specific servers because of affinity relations or compatibility constraints.
In order to capture such situations, we consider a scenario where jobs of
various types are replicated to different subsets of servers as prescribed by a
general compatibility graph. We exploit a product-form stationary distribution
and weak local stability conditions to establish a state space collapse in
heavy traffic. In this limiting regime, the parallel-server system with
graph-based redundancy scheduling operates as a multi-class single-server
system, achieving full resource pooling and exhibiting strong insensitivity to
the underlying compatibility constraints. Comment: 28 pages, 4 figure