923 research outputs found

    Staffing and Scheduling to Differentiate Service in Many-Server Service Systems

    Get PDF
    This dissertation contributes to the study of a queueing system with a single pool of multiple homogeneous servers to which multiple classes of customers arrive in independent streams. The objective is to devise appropriate staffing and scheduling policies to achieve specified class-dependent service levels expressed in terms of tail probability of delays. Here staffing and scheduling are concerned with specifying a time-varying number of servers and assigning newly idle servers to a waiting customer from one of K classes, respectively. For this purpose, we propose new staffing-and-scheduling solutions under the critically-loaded and overloaded regimes. In both cases, the proposed solutions are both time dependent (coping with the time variability in the arrival pattern) and state dependent (capturing the stochastic variability in service and arrival times). We prove heavy-traffic limit theorems to substantiate the effectiveness of our proposed staffing and scheduling policies. We also conduct computer simulation experiments to provide engineering confirmation and practical insight

    Capacity Allocation for Clouds with Parallel Processing, Batch Arrivals, and Heterogeneous Service Requirements

    Full text link
    Problem Definition: Allocating sufficient capacity to cloud services is a challenging task, especially when demand is time-varying, heterogeneous, contains batches, and requires multiple types of resources for processing. In this setting, providers decide whether to reserve portions of their capacity to individual job classes or to offer it in a flexible manner. Methodology/results: In collaboration with Huawei Cloud, a worldwide provider of cloud services, we propose a heuristic policy that allocates multiple types of resources to jobs and also satisfies their pre-specified service level agreements (SLAs). We model the system as a multi-class queueing network with parallel processing and multiple types of resources, where arrivals (i.e., virtual machines and containers) follow time-varying patterns and require at least one unit of each resource for processing. While virtual machines leave if they are not served immediately, containers can join a queue. We introduce a diffusion approximation of the offered load of such system and investigate its fidelity as compared to the observed data. Then, we develop a heuristic approach that leverages this approximation to determine capacity levels that satisfy probabilistic SLAs in the system with fully flexible servers. Managerial Implications: Using a data set of cloud computing requests over a representative 8-day period from Huawei Cloud, we show that our heuristic policy results in a 20% capacity reduction and better service quality as compared to a benchmark that reserves resources. In addition, we show that the system utilization induced by our policy is superior to the benchmark, i.e., it implies less idling of resources in most instances. Thus, our approach enables cloud operators to both reduce costs and achieve better performance

    Overdispersion in service systems

    Get PDF

    Efficient Simulation and Performance Stabilization for Time-Varying Single-Server Queues

    Get PDF
    This thesis develops techniques to evaluate and to improve the performance of single-server service systems with time-varying arrivals. The performance measures considered are the time-varying expected length of the queue and the expected customer waiting time. Time varying arrival rates are considered because they often occur in service systems. For example, arrival rates often vary significantly over the hours of each day and over the days of each week. Stochastic textbook methods do not apply to models with time-varying arrival rates. Hence new techniques are needed to provide high quality of service when stationary steady-state analysis is not appropriate. In contrast to the extensive recent literature on many-server queues with time-varying arrival rates, we focus on single-server queues with time-varying arrival rates. Single-server queues arise in real applications where there is no flexibility in the number of service facilities (servers). Different analysis techniques are required for single-server queues, because the two kinds of models exhibit very different performance. Many-server models are more tractable because methods for highly tractable infinite-server models can be applied. In contrast, single-server models are more complicated because it takes a long time to respond to a build up of workload when there is only one server. The thesis is divided into two parts: simulation algorithms for performance evaluation and service-rate controls for performance stabilization. The first part of the thesis develops algorithms to efficiently simulate the single-server time-varying queue. For the generality considered, no explicit mathematical formulas are available for calculating performance measures, so simulation experiments are needed to calculate and evaluate system performance. Efficient algorithms for both standard simulation and rare-event simulation are developed. The second part of the thesis develops service-rate controls to stabilize performance in the time-varying single-server queue. The performance stabilization problem aims to minimize fluctuations in mean waiting times for customers coming at different times even though the arrival rate is time-varying. A new service rate control is developed, where the service rate at each time is a function of the arrival rate function. We show that a specific service rate control can be found to stabilize performance. In turn, that service rate control can be used to provide guidance for real applications on optimal changes in staffing, processing speed or machine power status over time. Both the simulation experiments to evaluate performance of alternative service-rate controls and the simulation search algorithm to find the best parameters for a damped time-lag service-rate control are based on efficient performance evaluation algorithms in the first part of the thesis. In Chapter Two, we present an efficient algorithm to simulate a general non-Poisson non-stationary point process. The general point process can be represented as a time transformation of a rate-one base process and by exploiting a table of the inverse cumulative arrival rate function outside of simulation, we can efficiently convert the simulated rate-one process into the simulated general point process. The simulation experiments can be conducted in linear time subject to small error bounds. Then we can apply this efficient algorithm to generate the arrival process, the service process and thus to calculate performance measures for the G_t/G_t/1 queues, which are single-server queues with time-varying arrival rates and service rates. Service models are constructed for this purpose where time-varying service rates are specified separately from the rate-one service requirement process, and service times are determined by equating service requirements with integrals of service rates over a time period equal to the service time. In Chapter Three, we develop rare-event simulation algorithms in periodic GI_t/GI/1 queues and further in GI_t/GI_t/1 queues to estimate probabilities of rare but important events as a sanity check of the system, for example, estimating the probability that the waiting time is very long. Importance sampling, specifically exponential tilting, is required to estimate rare-event probabilities because in standard simulation, the number of experiments may blow up to achieve a targeted relative error and for each experiment, it may take a very long time to determine that the rare event does not happen. To extend the rare-event simulation algorithm to periodic queues, we derive a convenient expression for the periodic steady-state virtual waiting time. We apply this expression to establish bounds between the periodic workload and the steady-state workload in stationary queues, so that we can prove that the exponential tilting algorithm with the same parameter efficient in stationary queues is efficient in the periodic setting as well, which has a bounded relative error. We apply this algorithm to compute the periodic steady-state distribution of reflected periodic Brownian motion with support of a heavy-traffic limit theorem and to calculate the periodic steady-state distribution and moments of the virtual waiting time. This algorithm's advantage in calculating these distributions and moments is that it can directly estimate them at a specific position of the cycle without simulating the whole queueing process until steady state is reached for the whole cycle. In Chapter Four, we conduct simulation experiments to validate performance of four service-rate controls: the rate-matching control, which is directly proportional to the arrival rate, two square-root controls related to the square root staffing formula and the square-root control based on the mean stationary waiting time. Simulations show that the rate-matching control stabilizes the queue length distribution but not the virtual waiting time. This is consistent with established theoretical results, which follow from the observation that with rate-matching control, the queueing process becomes a time transformation of the stationary queueing process with constant arrival rates and service rates. Simulation results also show that the two square-root controls analogous to the server staffing formula are not effective in stabilizing performance. On the other hand, the alternative square-root service rate control based on the mean stationary waiting time approximately stabilizes the virtual waiting time when the cycle is long so that the arrival rate changes slowly enough. In Chapter Five, since we are mostly interested in stabilizing waiting times in more common scenarios when the traffic intensity is not close to one or when the arrival rate does not change slowly, we develop a damped time-lag service-rate control that performs fairly well for this purpose. This control is a modification of the rate-matching control involving a time lag and a damping factor. To find the best parameters for this control, we search over reasonable intervals for the most time-stable performance measures, which are computed by the extended rare-event simulation algorithm in GI_t/GI_t/1 queue. We conduct simulation experiments to validate that this control is effective for stabilizing the expected steady-state virtual waiting time (and its distribution to a large extent). We also establish a heavy-traffic limit with periodicity in the fluid scale to provide theoretical support for this control. We also show that there is a time-varying Little's law in heavy-traffic, which implies that this control cannot stabilize the queue length and the waiting time at the same time

    A Simulation-Based Analysis of the Impact of In-Sourcing a Major Process Element on the Coast Guard HH-601 Depot Maintenance Process

    Get PDF
    Leaders at the United States Coast Guard\u27s Aircraft Repair and Service Center (ARSC) in Elizabeth City, North Carolina recently formalized their planning and analysis functions by adding a dedicated branch to their command structure. The Planning and Analysis Branch intends to apply computer modeling and simulation to study the impact of process changes to the various Programmed Depot Maintenance (PDM) lines. This research considers the applicability of this type of modeling and simulation, using ARENA to study the current HH-6OJ PDM process. The contribution of this research is a methodology specific to ARSC needs, an analysis of methodology based on a discrete event simulation model of PDM lines, and a specific case study demonstrating the methodologies. The response variable of interest is average PDM process time as a function of either in-sourcing or out-sourcing labor for a major process step. The research includes development and evaluation of a macro-level process model using ARENA 5.0

    Simulation Modeling and Analysis of Adjustable Service-Rate Queueing Models that Incorporate Feedback Control

    Get PDF
    Research shows that in a system model, when the production rate is adjusted based on the number of items in queue, the nature of the model changes from an open-loop queueing system to a closed-loop feedback control system. Service-rate adjustment can be implemented in a discrete event simulation model, but the effect of this adjustment has not been thoroughly analyzed in the literature. This research considers the design of feedback signals to generate realistic simulation models of production system behavior. A series of simulation experiments is conducted to provide practical guidance for simulation modelers on how adding a service-rate adjustment feedback loop to a queueing system affects system performance
    • …
    corecore