101 research outputs found

    Resource pooling games

    Job-Replication Trade-Offs: Performance Analysis of Redundancy Systems

    Dynamical Modeling of Cloud Applications for Runtime Performance Management

    Cloud computing has quickly grown to become an essential component in many modern-day software applications. It allows consumers, such as a provider of some web service, to quickly and on demand obtain the necessary computational resources to run their applications. It is desirable for these service providers to keep the running cost of their cloud application low while adhering to various performance constraints. This is made difficult by the dynamics imposed by, e.g., resource contention or changing user arrival rates, and by the fact that there exist multiple ways of influencing the performance of a running cloud application. To facilitate decision making in this environment, performance models can be introduced that relate the workload and different actions to important performance metrics. In this thesis, such performance models of cloud applications are studied. In particular, we focus on modeling using queueing theory and on the fluid model for approximating the often intractable dynamics of the queue lengths. First, existing results on how the fluid model can be obtained from the mean-field approximation of a closed queueing network are simplified and extended to allow for mixed networks. The queues are allowed to follow the processor sharing or delay disciplines, and can have multiple classes with phase-type service times. An improvement to this fluid model is then presented to increase accuracy when the system size, i.e., number of servers, initial population, and arrival rate, is small. Furthermore, a closed-form approximation of the response time CDF is presented. The methods are tested in a series of simulation experiments and shown to be accurate. This mean-field fluid model is then used to derive a general fluid model for microservices with interservice delays. The model is shown to be completely extractable at runtime in a distributed fashion.
It is further evaluated on a simple microservice application and found to accurately predict important performance metrics in most cases. Furthermore, a method is devised to reduce the cost of a running application by tuning load balancing parameters between replicas. The method is built on gradient stepping by applying automatic differentiation to the fluid model. This allows for arbitrarily defined cost functions and constraints, most notably including different response time percentiles. The method is tested on a simple application distributed over multiple computing clusters and is shown to reduce costs while adhering to percentile constraints. Finally, modeling of request cloning is studied using the novel concept of synchronized service. This allows certain forms of cloning over servers, each modeled with a single queue, to be equivalently expressed as one single queue. The concept is very general regarding the involved queueing discipline and distributions, but instead introduces new, less realistic assumptions. How the equivalent queue model is affected by relaxing these assumptions is studied considering the processor sharing discipline, and an extension to enable modeling of speculative execution is made. In a simulation campaign, it is shown that these relaxations have only a minor effect in certain cases.
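
To give a feel for what a fluid approximation of queue-length dynamics looks like, here is a minimal sketch for a single multi-server station. It is not the thesis's mean-field model (which covers mixed networks, multiple classes, and phase-type service); it is the textbook one-station ODE dx/dt = lambda - mu * min(x, k), integrated with forward Euler. The function name and all parameter names are illustrative assumptions.

```python
def fluid_queue_length(arrival_rate, service_rate, servers,
                       x0=0.0, horizon=20.0, dt=0.01):
    """Forward-Euler integration of dx/dt = arrival_rate - service_rate * min(x, servers).

    x is the (real-valued) fluid number of jobs at the station.
    Illustrative single-station model only; names are hypothetical.
    """
    steps = int(round(horizon / dt))
    x = x0
    trajectory = [x]
    for _ in range(steps):
        # At most `servers` units of fluid can be in service at once.
        drift = arrival_rate - service_rate * min(x, servers)
        # Fluid level cannot go negative.
        x = max(0.0, x + dt * drift)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = fluid_queue_length(arrival_rate=3.0, service_rate=1.0, servers=5)
    print(f"fluid queue length at end of horizon: {traj[-1]:.4f}")
```

When the arrival rate is below the total capacity (servers times service rate), the trajectory settles at arrival_rate / service_rate, which is the stable fixed point of the ODE.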

    Empirical Studies in Hospital Emergency Departments

    This dissertation focuses on the operational impacts of crowding in hospital emergency departments. The body of this work comprises three essays. In the first essay, "Waiting Patiently: An Empirical Study of Queue Abandonment in an Emergency Department," we study queue abandonment, also known as left without being seen. We show that abandonment is influenced not only by wait time, but also by the queue length and the observable queue flows during the waiting exposure. We show that patients are sensitive to being jumped in the line and that they respond differently to sicker and less sick patients moving through the system. This study shows that managers have an opportunity to impact abandonment behavior by altering what information is available to waiting customers. In the second essay, "Doctors Under Load: An Empirical Study of State-Dependent Service Times in Emergency Care," we show that when crowded, multiple mechanisms in the emergency department act to retard patient treatment, but care providers adjust their clinical behavior to accelerate the service. We identify two mechanisms that providers use to accelerate the system: early task initiation and task reduction. In contrast to other recent works, we find the net effect of these countervailing forces to be an increase in service time when the system is crowded. Further, we use simulation to show that ignoring state-dependent service times leads to modeling errors that could cause hospitals to overinvest in human and physical resources. In the final essay, "The Financial Consequences of Lost Demand and Reducing Boarding in Hospital Emergency Departments," we use discrete event simulation to estimate the number of patients lost to Left Without Being Seen and ambulance diversion as a result of patients waiting in the emergency department for an inpatient bed (known as boarding).
These lost patients represent both a failure of the emergency department to meet the needs of those seeking care and lost revenue for the hospital. We show that dynamic bed management policies that proactively cancel some non-emergency patients when the hospital is near capacity can lead to reduced boarding, an increased number of patients served, and increased hospital revenue.

    Scheduling for today’s computer systems: bridging theory and practice

    Scheduling is a fundamental technique for improving performance in computer systems. From web servers to routers to operating systems, how the bottleneck device is scheduled has an enormous impact on the performance of the system as a whole. Given the immense literature studying scheduling, it is easy to think that we already understand enough about scheduling. But modern computer system designs have highlighted a number of disconnects between traditional analytic results and the needs of system designers. In particular, the idealized policies, metrics, and models used by analytic researchers do not match the policies, metrics, and scenarios that appear in real systems. The goal of this thesis is to take a step towards modernizing the theory of scheduling in order to provide results that apply to today’s computer systems, and thus ease the burden on system designers. To accomplish this goal, we provide new results that help to bridge each of the disconnects mentioned above. We will move beyond the study of idealized policies by introducing a new analytic framework where the focus is on scheduling heuristics and techniques rather than individual policies. By moving beyond the study of individual policies, our results apply to the complex hybrid policies that are often used in practice. For example, our results enable designers to understand how the policies that favor small job sizes are affected by the fact that real systems only have estimates of job sizes. In addition, we move beyond the study of mean response time and provide results characterizing the distribution of response time and the fairness of scheduling policies. These results allow us to understand how scheduling affects QoS guarantees and whether favoring small job sizes results in large job sizes being treated unfairly.
Finally, we move beyond the simplified models traditionally used in scheduling research and provide results characterizing the effectiveness of scheduling in multiserver systems and when users are interactive. These results allow us to answer questions about how to design multiserver systems and how to choose a workload generator when evaluating new scheduling designs.
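
The point about size estimates can be illustrated with a toy single-server example. The sketch below is not the thesis's analytic framework; it is a small deterministic, non-preemptive simulator (function and field names are hypothetical) that compares first-come-first-served, shortest-job-first with exact sizes, and shortest-job-first driven by inaccurate estimates.

```python
def mean_response_time(jobs, priority):
    """Non-preemptive single-server schedule; returns the mean response time.

    jobs: list of (arrival_time, true_size, estimated_size) tuples.
    priority: function job -> sort key; the waiting job with the
              smallest key is served next. Names are illustrative.
    """
    pending = sorted(jobs, key=lambda j: j[0])  # stable sort by arrival
    waiting, clock, total, i = [], 0.0, 0.0, 0
    completed = 0
    while completed < len(pending):
        # Admit every job that has arrived by the current time.
        while i < len(pending) and pending[i][0] <= clock:
            waiting.append(pending[i])
            i += 1
        if not waiting:
            clock = pending[i][0]  # server idles until the next arrival
            continue
        job = min(waiting, key=priority)
        waiting.remove(job)
        clock += job[1]            # service takes the TRUE size
        total += clock - job[0]    # response time = departure - arrival
        completed += 1
    return total / len(pending)


# Three jobs arriving together; the estimates swap the largest and smallest.
jobs = [(0.0, 3.0, 1.0), (0.0, 1.0, 3.0), (0.0, 2.0, 2.0)]
fcfs = mean_response_time(jobs, priority=lambda j: j[0])
sjf_exact = mean_response_time(jobs, priority=lambda j: j[1])
sjf_estimated = mean_response_time(jobs, priority=lambda j: j[2])
```

In this hand-picked instance, exact shortest-job-first gives the lowest mean response time, while the same policy fed with misleading estimates does worse than first-come-first-served. That outcome depends on the chosen job set and should not be read as a general result.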