
    Sensitivity analysis and simulation of a multiserver queueing system with mixed service time distribution

    The motivation for mixing distributions in communication/queueing systems modeling is that some input data (e.g., service times in queueing models) may follow several distinct distributions within a single input flow. In this paper, we study the sensitivity of performance measures to the proximity of the service time distributions in a multiserver system model with a two-component Pareto mixture distribution of service times. The theoretical results are illustrated by numerical simulation of M/G/c systems using the perfect sampling approach.
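    As an illustration of the mixture idea in this abstract, the following minimal Python sketch draws service times from a two-component Pareto mixture via inverse-transform sampling. The mixture weight and the shape/scale parameters are arbitrary placeholders, not values from the paper, and the paper's perfect-sampling simulation is not reproduced here.

```python
import numpy as np

def sample_mixture_pareto(n, p=0.3, alpha1=2.5, alpha2=1.5, xm1=1.0, xm2=2.0, rng=None):
    """Draw n service times from a two-component Pareto mixture:
    with probability p from Pareto(alpha1, xm1), otherwise Pareto(alpha2, xm2).
    All parameter values are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    # Pick the mixture component for each sample.
    component = rng.random(n) < p
    # Inverse-transform sampling for a Pareto distribution: X = xm * U^(-1/alpha).
    u = rng.random(n)
    x1 = xm1 * u ** (-1.0 / alpha1)
    x2 = xm2 * u ** (-1.0 / alpha2)
    return np.where(component, x1, x2)

# Example: empirical mean service time under the mixture (finite when both shapes > 1).
print(sample_mixture_pareto(100_000).mean())
```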

    Simple and explicit bounds for multi-server queues with $1/(1-\rho)$ (and sometimes better) scaling

    We consider the FCFS $GI/GI/n$ queue, and prove the first simple and explicit bounds that scale as $\frac{1}{1-\rho}$ (and sometimes better). Here $\rho$ denotes the corresponding traffic intensity. Conceptually, our results can be viewed as a multi-server analogue of Kingman's bound. Our main results are bounds for the tail of the steady-state queue length and the steady-state probability of delay. The strength of our bounds (e.g. in the form of tail decay rate) is a function of how many moments of the inter-arrival and service distributions are assumed finite. More formally, suppose that the inter-arrival and service times (distributed as random variables $A$ and $S$ respectively) have finite $r$th moment for some $r > 2$. Let $\mu_A$ (respectively $\mu_S$) denote $\frac{1}{\mathbb{E}[A]}$ (respectively $\frac{1}{\mathbb{E}[S]}$). Then our bounds (also for higher moments) are simple and explicit functions of $\mathbb{E}\big[(A \mu_A)^r\big]$, $\mathbb{E}\big[(S \mu_S)^r\big]$, $r$, and $\frac{1}{1-\rho}$ only. Our bounds scale gracefully even when the number of servers grows large and the traffic intensity converges to unity simultaneously, as in the Halfin-Whitt scaling regime. Some of our bounds scale better than $\frac{1}{1-\rho}$ in certain asymptotic regimes. More precisely, they scale as $\frac{1}{1-\rho}$ multiplied by an inverse polynomial in $n(1-\rho)^2$. These results formalize the intuition that bounds should be tighter in light traffic as well as certain heavy-traffic regimes (e.g. with $\rho$ fixed and $n$ large). In these same asymptotic regimes we also prove bounds for the tail of the steady-state number in service. Our main proofs proceed by explicitly analyzing the bounding process which arises in the stochastic comparison bounds of Gamarnik and Goldberg for multi-server queues. Along the way we derive several novel results for suprema of random walks and pooled renewal processes which may be of independent interest. We also prove several additional bounds using drift arguments (which have much smaller pre-factors), and make several conjectures which would imply further related bounds and generalizations.
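    For reference, the classical single-server baseline invoked above (Kingman's bound) already exhibits the $\frac{1}{1-\rho}$ scaling; the block below states it in the abstract's notation. This is the standard textbook bound used only as an analogy, not the paper's multi-server result.

```latex
% Kingman's bound for the FCFS GI/GI/1 queue: the mean steady-state waiting time satisfies
\[
  \mathbb{E}[W] \;\le\; \frac{\lambda\left(\sigma_A^{2} + \sigma_S^{2}\right)}{2\,(1-\rho)},
  \qquad \lambda = \frac{1}{\mathbb{E}[A]}, \quad \rho = \lambda\,\mathbb{E}[S],
\]
% where \sigma_A^2 and \sigma_S^2 are the variances of the inter-arrival and service times.
% The multi-server bounds described above share this 1/(1-\rho) scaling.
```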

    EUROPEAN CONFERENCE ON QUEUEING THEORY 2016

    This booklet contains the proceedings of the second European Conference on Queueing Theory (ECQT), held from the 18th to the 20th of July 2016 at the engineering school ENSEEIHT, Toulouse, France. ECQT is a biennial event where scientists and practitioners in queueing theory and related areas get together to promote research, encourage interaction and exchange ideas. The spirit of the conference is to be a queueing event organized from within Europe, but open to participants from all over the world. The technical program of the 2016 edition consisted of 112 presentations organized in 29 sessions covering all trends in queueing theory, including the development of the theory, methodology advances, computational aspects and applications. Another exciting feature of ECQT 2016 was the institution of the Takács Award for outstanding PhD thesis on "Queueing Theory and its Applications".

    Topics in Modeling and Analysis of Low-Latency Systems

    Cloud-based architectures have become integral elements of modern networking infrastructure and are characterized by a large number of servers operating in parallel. Optimizing performance in these systems, with a focus on metrics such as system response time and the probability of loss, is critical to ensure user satisfaction. To address this challenge, this thesis analyzes load balancing policies that are designed to efficiently assign incoming user requests to the servers such that system performance is optimized. In particular, the thesis focuses on a specialized category known as "randomized dynamic load balancing policies". These policies optimize system performance by dynamically adapting assignment decisions based on the current state of the system while interacting with a randomly selected subset of servers. Given the complex interdependencies among servers and the large size of these systems, an exact analysis is intractable. Consequently, the thesis studies these systems in the large-system limit, employing relevant limit theorems, including mean-field techniques and Stein's approach, as crucial mathematical tools. Furthermore, the thesis evaluates the accuracy of these limits when applied to systems of finite size, providing valuable insights into the practical applicability of the proposed load balancing policies. Motivated by different types of user requests or jobs, the thesis focuses on two main job categories: single-server jobs, which can only run on a single server and represent non-parallelizable requests, and multiserver jobs, which can run on multiple servers simultaneously and model parallelizable requests.
    The first part of the thesis studies single-server jobs in a system comprising a large number of processor-sharing servers operating in parallel, where servers have different processing speeds and unlimited queueing buffers. The objective is to design randomized load balancing policies that minimize the average response time of jobs. A novel policy is introduced that allocates incoming jobs to servers based on predefined thresholds, state information from a randomly sampled subset of servers, and their processing speeds. By adjusting the threshold levels, the policy subsumes a broad class of other load balancing policies, offering a unified framework for the concurrent analysis of multiple load balancing policies. It is shown that under this policy the system achieves the maximal stability region. Moreover, it is shown that as the system size approaches infinity, the transient and stationary stochastic occupancy measures of the system converge to a deterministic mean-field limit and the unique fixed point of this mean-field limit, respectively. As a result, the asymptotic average response time of jobs can be studied through the fixed point of the mean-field limit. The analysis continues with error estimation for these asymptotic values in finite-sized systems: when the mean delay of the finite-size system is approximated by its asymptotic value, the error is proportional to the inverse square root of the system size.
    Subsequently, the thesis analyzes adaptive multiserver jobs in loss systems, where jobs can be parallelized across a variable number of servers, up to a maximum degree of parallelization. In loss systems, each server can process only a finite number of jobs simultaneously and blocks any additional jobs beyond this capacity. The goal is therefore to devise randomized job assignment schemes that optimize the average response time of accepted jobs and the blocking probability while interacting with a sampled subset of servers. A load balancing policy is proposed in which the number of servers allocated to each job depends on the state information of a randomly sampled subset of servers and the maximum degree of parallelization. Employing Stein's method, it is shown that, provided the sampling size grows at an appropriate rate, the difference between the steady-state system and a suitable deterministic system that exhibits optimality decreases to zero as the system size increases. Thus, as the system size approaches infinity, the steady-state system achieves a zero blocking probability and an optimal average response time for accepted jobs. Additionally, the thesis analyzes error estimation for these asymptotic values in finite-sized systems and establishes the error bounds as a function of the number of servers in the system.
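    A minimal Python sketch of the general kind of threshold-based, sample-d routing described in this abstract is given below. The data layout, threshold rule, and tie-breaking are assumptions made purely for illustration and do not reproduce the thesis's policy or its analysis.

```python
import random

def dispatch(job, servers, d=2):
    """Illustrative threshold-based, sample-d routing: probe d randomly chosen
    servers; among the sampled servers whose queue length is below their own
    threshold, send the job to the fastest one, otherwise to the shortest
    sampled queue. Each server is a dict with 'speed', 'threshold', 'queue'."""
    sampled = random.sample(servers, d)
    below = [s for s in sampled if len(s["queue"]) < s["threshold"]]
    target = max(below, key=lambda s: s["speed"]) if below \
        else min(sampled, key=lambda s: len(s["queue"]))
    target["queue"].append(job)
    return target

# Example: two fast and two slow servers, each with a small illustrative threshold.
servers = [{"speed": sp, "threshold": 2, "queue": []} for sp in (2.0, 2.0, 1.0, 1.0)]
for i in range(6):
    dispatch(f"job-{i}", servers)
print([len(s["queue"]) for s in servers])
```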

    Managing customer relationships through price and service quality

    This paper examines the ways in which a service provider's policies on pricing and service level affect the size of its customer base and profitability. The analysis begins with the development of a customer behavior model that uses customer satisfaction and depth of relationship as mediators of the impact of price and service level on profitability. Based on this model of customer behavior, the system is analyzed as a queuing network from which the properties of the aggregate population's behavior are derived. The analysis reveals the counterintuitive result that a policy that involves a decrease in prices or an increase in service level may lead to a smaller customer base. However, this policy may also lead to higher profits. The novelty of this result lies in the explanation of the phenomenon: that when the customer base decreases due to a change in prices or service quality, companies may experience gains in profit that result not from a decrease in costs associated with serving fewer customers but from an increase in revenues resulting from the indirect effects of the lower prices or higher level of service on customer behavior. The application of optimization techniques to the model developed in this paper yields optimality conditions through which managers can assess the long-term profitability of their pricing and service-level policies.
    Keywords: customer relationship management; operations/marketing interface; two-part tariffs; service operations management; service quality.

    Scheduling for today’s computer systems: bridging theory and practice

    Scheduling is a fundamental technique for improving performance in computer systems. From web servers to routers to operating systems, how the bottleneck device is scheduled has an enormous impact on the performance of the system as a whole. Given the immense literature studying scheduling, it is easy to think that we already understand enough about scheduling. But modern computer system designs have highlighted a number of disconnects between traditional analytic results and the needs of system designers. In particular, the idealized policies, metrics, and models used by analytic researchers do not match the policies, metrics, and scenarios that appear in real systems. The goal of this thesis is to take a step towards modernizing the theory of scheduling in order to provide results that apply to today's computer systems, and thus ease the burden on system designers. To accomplish this goal, we provide new results that help to bridge each of the disconnects mentioned above. We move beyond the study of idealized policies by introducing a new analytic framework where the focus is on scheduling heuristics and techniques rather than individual policies. By moving beyond the study of individual policies, our results apply to the complex hybrid policies that are often used in practice. For example, our results enable designers to understand how policies that favor small job sizes are affected by the fact that real systems only have estimates of job sizes. In addition, we move beyond the study of mean response time and provide results characterizing the distribution of response time and the fairness of scheduling policies. These results allow us to understand how scheduling affects QoS guarantees and whether favoring small job sizes results in large job sizes being treated unfairly. Finally, we move beyond the simplified models traditionally used in scheduling research and provide results characterizing the effectiveness of scheduling in multiserver systems and when users are interactive. These results allow us to answer questions about how to design multiserver systems and how to choose a workload generator when evaluating new scheduling designs.
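    To make the idea of "favoring small job sizes using only size estimates" concrete, here is a minimal Python sketch of a scheduler that orders jobs by an estimated size rather than the true size. The class and field names are assumptions for illustration and are not taken from the thesis.

```python
import heapq

class EstimatedSizeScheduler:
    """Illustrative sketch: serve the job with the smallest *estimated* size,
    since real systems typically do not know true job sizes."""

    def __init__(self):
        self._heap = []      # entries: (estimated_size, arrival_order, job)
        self._counter = 0    # tie-breaker so jobs are never compared directly

    def add(self, job, estimated_size):
        # Queue the job keyed by its size estimate.
        heapq.heappush(self._heap, (estimated_size, self._counter, job))
        self._counter += 1

    def next_job(self):
        # Pop the job with the smallest estimate; ties broken by arrival order.
        return heapq.heappop(self._heap)[2] if self._heap else None

# Example: estimates need not match true sizes, so ordering can be imperfect.
sched = EstimatedSizeScheduler()
sched.add("big-but-underestimated", 1.0)
sched.add("small", 2.0)
print(sched.next_job())  # serves the job with the smaller estimate first
```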