Many-server diffusion limits for queues
This paper studies many-server limits for multi-server queues that have a
phase-type service time distribution and allow for customer abandonment. The
first set of limit theorems is for critically loaded queues, where
the patience times are independent and identically distributed following a
general distribution. The next limit theorem is for overloaded
queues, where the patience time distribution is restricted to be exponential.
We prove that a pair of diffusion-scaled total-customer-count and
server-allocation processes, properly centered, converges in distribution to a
continuous Markov process as the number of servers goes to infinity. In the
overloaded case, the limit is a multi-dimensional diffusion process, and in the
critically loaded case, the limit is a simple transformation of a diffusion
process. When the queues are critically loaded, our diffusion limit generalizes
the result by Puhalskii and Reiman (2000) for queues without customer
abandonment. When the queues are overloaded, the diffusion limit provides a
refinement to a fluid limit and it generalizes a result by Whitt (2004) for
queues with an exponential service time distribution. The proof
techniques employed in this paper are innovative. First, a perturbed system is
shown to be equivalent to the original system. Next, two maps are employed in
both fluid and diffusion scalings. These maps allow one to prove the limit
theorems by applying the standard continuous-mapping theorem and the standard
random-time-change theorem.Comment: Published in at http://dx.doi.org/10.1214/09-AAP674 the Annals of
Applied Probability (http://www.imstat.org/aap/) by the Institute of
Mathematical Statistics (http://www.imstat.org
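The critically loaded regime above can be illustrated with a toy simulation. The sketch below uses an M/M/n+M queue (exponential service and patience, a special case of the paper's phase-type-service/general-patience setting) and tracks the diffusion-scaled total customer count; all function names and parameter values are illustrative, not the authors' construction:

```python
# Toy sketch: diffusion scaling of a critically loaded M/M/n+M queue.
# The paper treats phase-type service and general patience distributions;
# this exponential special case only illustrates the scaling.
import random

def simulate_scaled_count(n, t_end, mu=1.0, theta=0.5, seed=0):
    """Gillespie simulation of the total customer count X(t); returns the
    diffusion-scaled path (X(t) - n) / sqrt(n) sampled at event times."""
    rng = random.Random(seed)
    lam = n * mu                      # critical load: lambda = n * mu
    x, t = n, 0.0                     # start at the balance point X = n
    path = []
    while t < t_end:
        busy = min(x, n)              # servers in use
        queued = max(x - n, 0)        # waiting customers (may abandon)
        rate = lam + mu * busy + theta * queued
        t += rng.expovariate(rate)
        if rng.random() < lam / rate:
            x += 1                    # arrival
        else:
            x -= 1                    # service completion or abandonment
        path.append((x - n) / n ** 0.5)
    return path
```

As n grows, the scaled path fluctuates on an O(1) scale around zero, which is the regime in which the diffusion limit applies.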
Separation of timescales in a two-layered network
We investigate a computer network consisting of two layers occurring in, for
example, application servers. The first layer incorporates the arrival of jobs
at a network of multi-server nodes, which we model as a many-server Jackson
network. At the second layer, active servers at these nodes now act as
customers who are served by a common CPU. Our main result shows a separation of
time scales in heavy traffic: the main source of randomness occurs at the
(aggregate) CPU layer; the interactions between different types of nodes at the
other layer are shown to converge to a fixed point on a faster time scale; this
also yields a state-space collapse property. Apart from these fundamental
insights, we also obtain an explicit approximation for the joint law of the
number of jobs in the system, which is provably accurate for heavily loaded
systems and performs numerically well for moderately loaded systems. The
obtained results for the model under consideration can be applied to
thread-pool dimensioning in application servers, while the technique seems
applicable to other layered systems too.

Comment: 8 pages, 2 figures, 1 table, ITC 24 (2012).
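The separation-of-timescales phenomenon can be sketched with a generic slow/fast system (this is not the paper's layered queueing model, and all rates below are made up): the fast layer relaxes to its fixed point on a short timescale, so the slow layer effectively evolves against the collapsed fast state, the analogue of the state-space collapse described above.

```python
# Illustrative slow/fast ODE sketch of timescale separation.
# The fast variable y relaxes to its fixed point y* = x on timescale eps,
# so the slow variable x effectively sees only the averaged fast state.
def simulate_two_layer(eps=0.01, dt=1e-4, t_end=2.0):
    x, y = 1.0, 0.0                      # slow and fast variables
    for _ in range(int(t_end / dt)):
        x += dt * (-0.5 * x + 0.25 * y)  # slow layer, driven by y
        y += dt * (x - y) / eps          # fast layer, fixed point y* = x
    return x, y
```

After a brief transient, y tracks x closely, so the two-dimensional system behaves like the one-dimensional averaged dynamics of x alone.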
Redundancy Scheduling with Locally Stable Compatibility Graphs
Redundancy scheduling is a popular concept to improve performance in
parallel-server systems. In the baseline scenario any job can be handled
equally well by any server, and is replicated to a fixed number of servers
selected uniformly at random. Quite often, however, there may be heterogeneity
in job characteristics or server capabilities, and jobs can only be replicated
to specific servers because of affinity relations or compatibility constraints.
In order to capture such situations, we consider a scenario where jobs of
various types are replicated to different subsets of servers as prescribed by a
general compatibility graph. We exploit a product-form stationary distribution
and weak local stability conditions to establish a state space collapse in
heavy traffic. In this limiting regime, the parallel-server system with
graph-based redundancy scheduling operates as a multi-class single-server
system, achieving full resource pooling and exhibiting strong insensitivity to
the underlying compatibility constraints.

Comment: 28 pages, 4 figures.
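A minimal simulation can make the graph-based redundancy model concrete. The sketch below assumes cancel-on-completion with exponential services: each server works FCFS on the oldest job compatible with it, and a job departs as soon as any of its replicas in service completes. The compatibility graph, rates, and type names are illustrative, not taken from the paper:

```python
# Toy simulation of redundancy scheduling under a compatibility graph
# (cancel-on-completion, exponential services; parameters are illustrative).
import random

def simulate_redundancy(compat, mu=1.0, lam=0.5, t_end=100.0, seed=1):
    """compat: dict mapping job type -> list of compatible server ids.
    Each type arrives at rate lam; every server serves at rate mu."""
    rng = random.Random(seed)
    types = list(compat)
    servers = set(s for ss in compat.values() for s in ss)
    jobs = []                       # FIFO list of job types in the system
    t, departures = 0.0, 0
    while t < t_end:
        # each server serves the oldest job compatible with it
        rates = {}                  # job index -> aggregate service rate
        for s in servers:
            for i, jt in enumerate(jobs):
                if s in compat[jt]:
                    rates[i] = rates.get(i, 0.0) + mu
                    break
        arr_rate = lam * len(types)
        total = arr_rate + sum(rates.values())
        t += rng.expovariate(total)
        u = rng.random() * total
        if u < arr_rate:
            jobs.append(rng.choice(types))   # arrival of a random type
        else:
            u -= arr_rate
            for i, r in rates.items():
                if u < r:
                    jobs.pop(i)              # first replica done: job leaves
                    departures += 1
                    break
                u -= r
    return len(jobs), departures
```

In the heavy-traffic regime the paper studies, such a system behaves like a multi-class single-server queue with the pooled service capacity, which is what the state space collapse result formalizes.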