A Matrix-Analytic Solution for Randomized Load Balancing Models with Phase-Type Service Times
In this paper, we provide a matrix-analytic solution for randomized load
balancing models (also known as supermarket models) with phase-type (PH)
service times. Generalizing the service times to the phase-type distribution
makes the analysis of the supermarket models more difficult and challenging
than that of the exponential service time case which has been extensively
discussed in the literature. We first describe the supermarket model as a
system of differential vector equations, and provide a doubly exponential
solution to the fixed point of the system of differential vector equations.
Then we analyze the exponential convergence of the current location of the
supermarket model to its fixed point. Finally, we present numerical examples to
illustrate our approach and show its effectiveness in analyzing the randomized
load balancing schemes with non-exponential service requirements.
Comment: 24 pages
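For reference, the fixed point discussed in the abstract above can be illustrated in the classical exponential-service special case only; this is a minimal sketch, not the phase-type analysis of the paper. It assumes the standard mean-field equations ds_k/dt = lam*(s_{k-1}^d - s_k^d) - (s_k - s_{k+1}) with s_0 = 1, where s_k is the fraction of queues holding at least k jobs, lam < 1 is the arrival rate per server, and d >= 2 is the number of queues sampled per arrival. The function names, the truncation level kmax, and the parameter values are illustrative assumptions.

```python
def closed_form_fixed_point(lam, d, kmax):
    """Doubly exponential fixed point s_k = lam ** ((d**k - 1) / (d - 1)).

    Assumes exponential service with unit rate and an integer d >= 2.
    """
    # (d**k - 1) is always divisible by (d - 1), so integer division is exact.
    return [lam ** ((d ** k - 1) // (d - 1)) for k in range(kmax + 1)]


def integrate_mean_field(lam, d, kmax, dt=0.01, steps=20000):
    """Forward-Euler integration of the truncated mean-field ODE.

    Starts from an empty system and returns [s_0, ..., s_kmax].
    """
    # s[0] stays at 1; s[kmax + 1] is pinned to 0 as a truncation assumption.
    s = [1.0] + [0.0] * (kmax + 1)
    for _ in range(steps):
        nxt = s[:]
        for k in range(1, kmax + 1):
            arrivals = lam * (s[k - 1] ** d - s[k] ** d)
            departures = s[k] - s[k + 1]
            nxt[k] = s[k] + dt * (arrivals - departures)
        s = nxt
    return s[: kmax + 1]


if __name__ == "__main__":
    lam, d, kmax = 0.7, 2, 8
    exact = closed_form_fixed_point(lam, d, kmax)
    approx = integrate_mean_field(lam, d, kmax)
    for k, (e, a) in enumerate(zip(exact, approx)):
        print(f"k={k}  closed form={e:.6e}  ODE limit={a:.6e}")
```

With d = 2 the exponents are 0, 1, 3, 7, 15, ..., so the tail decays doubly exponentially in k, in contrast to the geometric decay lam**k obtained for d = 1.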
A Non-Blocking Priority Queue for the Pending Event Set
The large diffusion of shared-memory multi-core machines has impacted the way Parallel Discrete Event Simulation (PDES) engines are built. While they were originally conceived as data-partitioned platforms, where each thread is in charge of managing a subset of simulation objects, nowadays the trend is to shift towards share-everything settings. In this scenario, any thread can (in principle) take care of CPU-dispatching pending events bound to any simulation object, which helps to fully share the load across the available CPU-cores. Hence, a fundamental aspect to be tackled is to provide an efficient globally-shared pending events’ set from which multiple worker threads can concurrently extract events to be processed, and into which they can concurrently insert newly produced events to be processed in the future. To cope with this aspect, we present the design and implementation of a concurrent non-blocking pending events’ set data structure, which can be seen as a variant of a classical calendar queue. Early experimental data collected with a synthetic stress test are reported, showing excellent scalability of our proposal on a machine equipped with 32 CPU-cores.
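To make the "classical calendar queue" mentioned in the abstract above concrete, here is a minimal single-threaded sketch. It is not the non-blocking concurrent design the paper presents: there is no synchronization and no bucket resizing. The class name, the default bucket count and width, and the assumption that events are never enqueued with a timestamp earlier than the last dequeued one are all illustrative.

```python
import bisect
import itertools


class CalendarQueue:
    """Sequential calendar queue sketch: buckets are 'days' of a circular 'year'."""

    def __init__(self, num_buckets=8, bucket_width=1.0):
        self.num_buckets = num_buckets
        self.bucket_width = bucket_width
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0
        self.current_time = 0.0  # timestamp of the last dequeued event
        self._seq = itertools.count()  # tie-breaker so payloads are never compared

    def enqueue(self, timestamp, event):
        """Insert an event; assumes timestamp >= self.current_time."""
        index = int(timestamp // self.bucket_width) % self.num_buckets
        bisect.insort(self.buckets[index], (timestamp, next(self._seq), event))
        self.size += 1

    def dequeue(self):
        """Remove and return (timestamp, event) with the smallest timestamp."""
        if self.size == 0:
            raise IndexError("dequeue from empty calendar queue")
        start = int(self.current_time // self.bucket_width)
        # Scan one full year, starting at the day of the current time.
        for slot in range(start, start + self.num_buckets):
            bucket = self.buckets[slot % self.num_buckets]
            # Accept the head only if it belongs to this day, not a later year.
            if bucket and int(bucket[0][0] // self.bucket_width) <= slot:
                return self._pop(bucket)
        # Nothing due within a year: fall back to a direct search for the minimum.
        bucket = min((b for b in self.buckets if b), key=lambda b: b[0])
        return self._pop(bucket)

    def _pop(self, bucket):
        timestamp, _, event = bucket.pop(0)
        self.size -= 1
        self.current_time = timestamp
        return timestamp, event


if __name__ == "__main__":
    queue = CalendarQueue(num_buckets=4, bucket_width=1.0)
    for ts, name in [(5.5, "e"), (0.3, "a"), (12.1, "f"), (0.7, "b"), (3.0, "c")]:
        queue.enqueue(ts, name)
    while queue.size:
        print(queue.dequeue())
```

Each bucket is kept sorted, so a dequeue usually inspects only a few bucket heads. The concurrent variant described in the abstract has to preserve this behaviour while several threads insert and extract at once.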
Doubly Exponential Solution for Randomized Load Balancing Models with General Service Times
In this paper, we provide a novel and simple approach to study the
supermarket model with general service times. This approach is based on the
supplementary variable method, which is widely used in analyzing stochastic models.
We set up an infinite system of integro-differential equations by means
of the density-dependent jump Markov process, and obtain a closed-form solution
with a doubly exponential structure for the fixed point satisfying the system of
nonlinear equations, which is always a key step in the study of supermarket models.
The fixed point is decomposed into two groups of information in product
form: the arrival information and the service information. Based on this, we
make two important observations: the fixed point for the supermarket model
is different from the tail of stationary queue length distribution for the
ordinary M/G/1 queue, and the doubly exponential solution to the fixed point
can exist widely, even when the service time distribution is heavy-tailed.
Furthermore, we analyze the exponential convergence of the current location of
the supermarket model to its fixed point, and study the Lipschitz condition in
the Kurtz Theorem under general service times. Based on these analyses, one can
gain a new understanding of how workload probing can help in load balancing jobs
with general service times such as heavy-tailed service.
Comment: 40 pages, 4 figures
Super-Exponential Solution in Markovian Supermarket Models: Framework and Challenge
Marcel F. Neuts opened a key door in numerical computation of stochastic
models by means of phase-type (PH) distributions and Markovian arrival
processes (MAPs). To celebrate his 75th birthday, this paper reports a more
general framework of Markovian supermarket models, including a system of
differential equations for the fraction measure and a system of nonlinear
equations for the fixed point. To understand this framework heuristically, this
paper gives a detailed analysis for three important supermarket examples: M/G/1
type, GI/M/1 type and multiple choices, explains how to derive the system of
differential equations by means of density-dependent jump Markov processes, and
shows that the fixed point may be simply super-exponential through solving the
system of nonlinear equations. Supermarket models are a class of
complicated queueing systems whose analysis cannot rely on standard queueing
theory, so their study needs a more general framework of this kind, one that
lets us focus on the important research issues.
Along this line, this paper develops matrix-analytical methods for Markovian
supermarket models. We hope this will open a new avenue in
performance evaluation of supermarket models by means of matrix-analytical
methods.
Comment: Randomized load balancing, supermarket model, matrix-analytic method,
super-exponential solution, density-dependent jump Markov process, Batch
Markovian Arrival Process (BMAP), phase-type (PH) distribution, fixed point
Proactive Aging Mitigation in CGRAs through Utilization-Aware Allocation
Resource balancing has been effectively used to mitigate the long-term aging
effects of Negative Bias Temperature Instability (NBTI) in multi-core and
Graphics Processing Unit (GPU) architectures. In this work, we investigate this
strategy in Coarse-Grained Reconfigurable Arrays (CGRAs) with a novel
application-to-CGRA allocation approach. By introducing important extensions to
the reconfiguration logic and the datapath, we enable the dynamic movement of
configurations throughout the fabric and allow overutilized Functional Units
(FUs) to recover from stress-induced NBTI aging. Implementing the approach in a
resource-constrained state-of-the-art CGRA reveals lifetime
improvement with negligible performance overheads and less than increase
in area.
Comment: Please cite this as: M. Brandalero, B. N. Lignati, A. Carlos
Schneider Beck, M. Shafique and M. Hübner, "Proactive Aging Mitigation in
CGRAs through Utilization-Aware Allocation," 2020 57th ACM/IEEE Design
Automation Conference (DAC), San Francisco, CA, USA, 2020, pp. 1-6, doi:
10.1109/DAC18072.2020.921858
Towards Optimality in Parallel Scheduling
To keep pace with Moore's law, chip designers have focused on increasing the
number of cores per chip rather than single core performance. In turn, modern
jobs are often designed to run on any number of cores. However, to effectively
leverage these multi-core chips, one must address the question of how many
cores to assign to each job. Given that jobs receive sublinear speedups from
additional cores, there is an obvious tradeoff: allocating more cores to an
individual job reduces the job's runtime, but in turn decreases the efficiency
of the overall system. We ask how the system should schedule jobs across cores
so as to minimize the mean response time over a stream of incoming jobs.
To answer this question, we develop an analytical model of jobs running on a
multi-core machine. We prove that EQUI, a policy which continuously divides
cores evenly across jobs, is optimal when all jobs follow a single speedup
curve and have exponentially distributed sizes. EQUI requires jobs to change
their level of parallelization while they run. Since this is not possible for
all workloads, we consider a class of "fixed-width" policies, which choose a
single level of parallelization, k, to use for all jobs. We prove that,
surprisingly, it is possible to achieve EQUI's performance without requiring
jobs to change their levels of parallelization by using the optimal fixed level
of parallelization, k*. We also show how to analytically derive the optimal k*
as a function of the system load, the speedup curve, and the job size
distribution.
In the case where jobs may follow different speedup curves, finding a good
scheduling policy is even more challenging. We find that policies like EQUI
which performed well in the case of a single speedup function now perform
poorly. We propose a very simple policy, GREEDY*, which performs near-optimally
when compared to the numerically-derived optimal policy
- …