    A Matrix-Analytic Solution for Randomized Load Balancing Models with Phase-Type Service Times

    In this paper, we provide a matrix-analytic solution for randomized load balancing models (also known as supermarket models) with phase-type (PH) service times. Generalizing the service times to the phase-type distribution makes the analysis of the supermarket models more challenging than in the exponential service time case, which has been extensively discussed in the literature. We first describe the supermarket model as a system of differential vector equations, and provide a doubly exponential solution to the fixed point of this system. Then we analyze the exponential convergence of the current location of the supermarket model to its fixed point. Finally, we present numerical examples to illustrate our approach and show its effectiveness in analyzing randomized load balancing schemes with non-exponential service requirements. Comment: 24 pages
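As a rough illustration of what "doubly exponential" means here, the sketch below covers only the exponential-service special case that the abstract describes as extensively discussed in the literature, not the phase-type solution of the paper. It assumes unit service rate, arrival rate `lam` per server, and `d` sampled queues per arrival; the function names are hypothetical. In that setting the fraction of queues with at least k jobs is known to satisfy pi_k = lam^((d^k - 1)/(d - 1)).

```python
def exponential_fixed_point(lam, d, levels):
    """Tail fractions pi_0..pi_levels for exponential service (assumed rate 1).

    pi_k = lam ** ((d**k - 1) / (d - 1)); the exponent is the integer
    1 + d + ... + d**(k-1), so integer division is exact.
    """
    if not 0 < lam < 1 or d < 2:
        raise ValueError("need 0 < lam < 1 and integer d >= 2")
    return [lam ** ((d ** k - 1) // (d - 1)) for k in range(levels + 1)]


def residuals(pi, lam, d):
    """Right-hand side of the mean-field equations evaluated at pi.

    d pi_k / dt = lam * (pi_{k-1}^d - pi_k^d) - (pi_k - pi_{k+1}),
    which should vanish at the fixed point.
    """
    return [
        lam * (pi[k - 1] ** d - pi[k] ** d) - (pi[k] - pi[k + 1])
        for k in range(1, len(pi) - 1)
    ]


if __name__ == "__main__":
    pi = exponential_fixed_point(lam=0.5, d=2, levels=4)
    print(pi)
    print(residuals(pi, lam=0.5, d=2))
```

With d = 2 the exponents grow as 0, 1, 3, 7, 15, so the tail decays far faster than the geometric tail of a single M/M/1 queue.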

    A Non-Blocking Priority Queue for the Pending Event Set

    The wide diffusion of shared-memory multi-core machines has impacted the way Parallel Discrete Event Simulation (PDES) engines are built. While they were originally conceived as data-partitioned platforms, where each thread is in charge of managing a subset of simulation objects, nowadays the trend is to shift towards share-everything settings. In this scenario, any thread can (in principle) take care of CPU-dispatching pending events bound to any simulation object, which helps to fully share the load across the available CPU-cores. Hence, a fundamental aspect to be tackled is providing an efficient globally-shared pending events’ set from which multiple worker threads can concurrently extract events to be processed, and into which they can concurrently insert newly produced events to be processed in the future. To cope with this aspect, we present the design and implementation of a concurrent non-blocking pending events’ set data structure, which can be seen as a variant of a classical calendar queue. Early experimental data collected with a synthetic stress test are reported, showing excellent scalability of our proposal on a machine equipped with 32 CPU-cores.
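For readers unfamiliar with the classical calendar queue that the abstract builds on, here is a minimal single-threaded sketch. It is not the paper's non-blocking design and involves no concurrency control; the class name, the fixed `num_buckets` and `bucket_width`, and the absence of bucket resizing are simplifying assumptions made for illustration.

```python
import heapq
import itertools


class CalendarQueue:
    """Sequential calendar queue: events hashed into time-slot buckets."""

    def __init__(self, num_buckets=4, bucket_width=1.0):
        self.num_buckets = num_buckets
        self.bucket_width = bucket_width
        self.buckets = [[] for _ in range(num_buckets)]
        self.current = 0          # unwrapped index of the slot being scanned
        self.size = 0
        self._tie = itertools.count()  # tie-breaker for equal timestamps

    def _slot(self, timestamp):
        return int(timestamp // self.bucket_width)

    def insert(self, timestamp, event):
        slot = self._slot(timestamp)
        entry = (timestamp, next(self._tie), event)
        heapq.heappush(self.buckets[slot % self.num_buckets], entry)
        self.current = min(self.current, slot)
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("dequeue from empty calendar queue")
        # Scan one full "year": accept a bucket head only if it belongs
        # to the slot currently being visited.
        for slot in range(self.current, self.current + self.num_buckets):
            bucket = self.buckets[slot % self.num_buckets]
            if bucket and self._slot(bucket[0][0]) == slot:
                self.current = slot
                return self._pop(bucket)
        # Nothing due this year: jump directly to the earliest event.
        bucket = min((b for b in self.buckets if b), key=lambda b: b[0])
        self.current = self._slot(bucket[0][0])
        return self._pop(bucket)

    def _pop(self, bucket):
        timestamp, _, event = heapq.heappop(bucket)
        self.size -= 1
        return timestamp, event
```

Events are returned in timestamp order, and when bucket width matches the typical gap between events, both operations touch only a few buckets on average.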

    Doubly Exponential Solution for Randomized Load Balancing Models with General Service Times

    In this paper, we provide a novel and simple approach to study the supermarket model with general service times. This approach is based on the supplementary variable method, which is used extensively in analyzing stochastic models. We set up an infinite-size system of integral-differential equations by means of the density-dependent jump Markov process, and obtain a closed-form solution with a doubly exponential structure for the fixed point satisfying the system of nonlinear equations, which is always key in the study of supermarket models. The fixed point is decomposed into two groups of information under a product form: the arrival information and the service information. Based on this, we make two important observations: the fixed point for the supermarket model is different from the tail of the stationary queue length distribution for the ordinary M/G/1 queue, and the doubly exponential solution to the fixed point can exist widely, even if the service time distribution is heavy-tailed. Furthermore, we analyze the exponential convergence of the current location of the supermarket model to its fixed point, and study the Lipschitz condition in the Kurtz Theorem under general service times. Based on these analyses, one can gain a new understanding of how workload probing can help in load balancing jobs with general service times such as heavy-tailed service. Comment: 40 pages, 4 figures

    Super-Exponential Solution in Markovian Supermarket Models: Framework and Challenge

    Marcel F. Neuts opened a key door in numerical computation of stochastic models by means of phase-type (PH) distributions and Markovian arrival processes (MAPs). To celebrate his 75th birthday, this paper reports a more general framework of Markovian supermarket models, including a system of differential equations for the fraction measure and a system of nonlinear equations for the fixed point. To understand this framework heuristically, this paper gives a detailed analysis of three important supermarket examples (M/G/1 type, GI/M/1 type, and multiple choices), explains how to derive the system of differential equations by means of density-dependent jump Markov processes, and shows that the fixed point may be simply super-exponential through solving the system of nonlinear equations. Because supermarket models are a class of complicated queueing systems whose analysis cannot rely on standard queueing theory, it is necessary to summarize such a more general framework, which enables us to focus on important research issues. Along this line, this paper develops matrix-analytical methods for Markovian supermarket models. We hope this will open a new avenue in performance evaluation of supermarket models by means of matrix-analytical methods. Comment: Randomized load balancing, supermarket model, matrix-analytic method, super-exponential solution, density-dependent jump Markov process, Batch Markovian Arrival Process (BMAP), phase-type (PH) distribution, fixed point

    Proactive Aging Mitigation in CGRAs through Utilization-Aware Allocation

    Resource balancing has been effectively used to mitigate the long-term aging effects of Negative Bias Temperature Instability (NBTI) in multi-core and Graphics Processing Unit (GPU) architectures. In this work, we investigate this strategy in Coarse-Grained Reconfigurable Arrays (CGRAs) with a novel application-to-CGRA allocation approach. By introducing important extensions to the reconfiguration logic and the datapath, we enable the dynamic movement of configurations throughout the fabric and allow overutilized Functional Units (FUs) to recover from stress-induced NBTI aging. Implementing the approach in a resource-constrained state-of-the-art CGRA reveals a 2.2× lifetime improvement with negligible performance overheads and less than a 10% increase in area. Comment: Please cite this as: M. Brandalero, B. N. Lignati, A. Carlos Schneider Beck, M. Shafique and M. Hübner, "Proactive Aging Mitigation in CGRAs through Utilization-Aware Allocation," 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 2020, pp. 1-6, doi: 10.1109/DAC18072.2020.921858

    Towards Optimality in Parallel Scheduling

    To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip rather than single core performance. In turn, modern jobs are often designed to run on any number of cores. However, to effectively leverage these multi-core chips, one must address the question of how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is an obvious tradeoff: allocating more cores to an individual job reduces the job's runtime, but in turn decreases the efficiency of the overall system. We ask how the system should schedule jobs across cores so as to minimize the mean response time over a stream of incoming jobs. To answer this question, we develop an analytical model of jobs running on a multi-core machine. We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. EQUI requires jobs to change their level of parallelization while they run. Since this is not possible for all workloads, we consider a class of "fixed-width" policies, which choose a single level of parallelization, k, to use for all jobs. We prove that, surprisingly, it is possible to achieve EQUI's performance without requiring jobs to change their levels of parallelization by using the optimal fixed level of parallelization, k*. We also show how to analytically derive the optimal k* as a function of the system load, the speedup curve, and the job size distribution. In the case where jobs may follow different speedup curves, finding a good scheduling policy is even more challenging. We find that policies like EQUI, which performed well in the case of a single speedup function, now perform poorly. We propose a very simple policy, GREEDY*, which performs near-optimally when compared to the numerically-derived optimal policy.
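The difference between EQUI and a fixed-width policy can be made concrete with a small sketch of the per-job service rates each one yields at a single instant. Everything below is an illustrative assumption rather than the paper's model: the speedup curve `sqrt_speedup`, the function names, and the choice that a fixed-width policy simply runs the first `cores // k` jobs in the list while the rest wait.

```python
def sqrt_speedup(cores):
    """Hypothetical sublinear speedup curve s(k) = k ** 0.5."""
    return cores ** 0.5


def equi_rates(num_jobs, cores, speedup=sqrt_speedup):
    """EQUI: divide the cores evenly, so every job runs on cores / num_jobs."""
    if num_jobs == 0:
        return []
    share = cores / num_jobs
    return [speedup(share)] * num_jobs


def fixed_width_rates(num_jobs, cores, k, speedup=sqrt_speedup):
    """Fixed-width: each running job gets exactly k cores; others wait.

    Assumes the first cores // k jobs in the list are the ones running.
    """
    running = min(num_jobs, cores // k)
    return [speedup(k)] * running + [0.0] * (num_jobs - running)


if __name__ == "__main__":
    # 5 jobs on 16 cores: EQUI gives each job 3.2 cores, while width 4
    # runs four jobs at full width and leaves one queued.
    print(equi_rates(5, 16))
    print(fixed_width_rates(5, 16, k=4))
```

Summing the returned rates gives the total instantaneous throughput under each policy, which is one way to see the tradeoff between per-job runtime and overall efficiency that the abstract describes.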