
266 research outputs found

Performance modeling and control of web servers

Author: Andersson Mikael
Publication venue: Lund Institute of Technology
Publication date: 01/01/2004

This thesis deals with the task of modeling a web server and designing a mechanism that can prevent the web server from being overloaded. Four papers are presented. The first paper gives an M/G/1/K processor sharing model of a single web server. The model is validated against measurements and simulations on the commonly used web server Apache. A description is given of how to calculate the necessary parameters in the model. The second paper introduces an admission control mechanism for the Apache web server based on a combination of queuing theory and control theory. The admission control mechanism is tested in the laboratory, implemented as a stand-alone application in front of the web server. The third paper continues the work from the second paper by discussing stability. This time, the admission control mechanism is implemented as a module within the Apache source code. Experiments show the stability and settling time of the controller. Finally, the fourth paper investigates the concept of service level agreements for a web site. The agreements allow a maximum response time and a minimal throughput to be set. The requests are sorted into classes, where each class is assigned a weight (representing the income for the web site owner). Then an optimization algorithm is applied so that the total profit for the web site during overload is maximized.
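As a rough illustration of the M/G/1/K processor sharing model named in this abstract, the sketch below computes the blocking probability and the mean response time from the standard closed-form stationary distribution of that queue, which depends on the service time only through its mean. The function name and the example values are assumptions made for illustration and are not taken from the thesis.

```python
def mg1k_ps_metrics(arrival_rate, mean_service_time, capacity):
    """Blocking probability and mean response time of an M/G/1/K-PS queue.

    arrival_rate: Poisson arrival rate (requests per second)
    mean_service_time: mean service requirement of a request (seconds)
    capacity: maximum number of requests allowed in the system (K)
    """
    rho = arrival_rate * mean_service_time  # offered load
    k = capacity
    if abs(rho - 1.0) < 1e-12:
        # Degenerate case: all K+1 states are equally likely.
        probs = [1.0 / (k + 1)] * (k + 1)
    else:
        # Truncated geometric distribution over 0..K requests in the system.
        norm = (1.0 - rho ** (k + 1)) / (1.0 - rho)
        probs = [rho ** n / norm for n in range(k + 1)]
    p_block = probs[k]  # an arrival is rejected when the system is full
    mean_jobs = sum(n * p for n, p in enumerate(probs))
    throughput = arrival_rate * (1.0 - p_block)  # rate of admitted requests
    mean_response = mean_jobs / throughput  # Little's law on admitted requests
    return p_block, mean_response


if __name__ == "__main__":
    # Hypothetical parameters: 80 requests/s, 10 ms mean service time, K = 50.
    print(mg1k_ps_metrics(80.0, 0.010, 50))
```

With K = 1 the mean response time reduces to the mean service time, which is a quick sanity check on the formula.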

Lund University Publications

Control-theoretic Analysis of Admission Control Mechanisms for Web Server Systems

Author: Andersson Mikael
Kihl Maria
Robertsson Anders
Wittenmark Björn
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

Web sites are exposed to high rates of incoming requests. The servers may become overloaded during temporary traffic peaks when more requests arrive than the server is designed for. An admission control mechanism rejects some requests whenever the arriving traffic is too high and thereby maintains an acceptable load in the system. This paper presents how admission control mechanisms can be designed with a combination of queueing theory and control theory. In this paper we model an Apache web server as a GI/G/1-system and then design a PI-controller, commonly used in automatic control, for the server. The controller has been implemented as a module inside the Apache source code. Measurements from the laboratory setup show how robust the implemented controller is, and how it corresponds to the results from the theoretical analysis.
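To make the idea of a PI-controller for admission control concrete, here is a minimal sketch under stated assumptions. It is not the controller from the paper: the controlled variable (server utilization), the gains, the target value, and the static toy plant used in `simulate` are all hypothetical choices made for illustration.

```python
class PIAdmissionController:
    """Discrete-time PI controller that outputs an admission probability."""

    def __init__(self, kp, ki, target_utilization):
        self.kp = kp                      # proportional gain (assumed value)
        self.ki = ki                      # integral gain (assumed value)
        self.target = target_utilization  # reference utilization in [0, 1]
        self.integral = 0.0               # accumulated control error

    def update(self, measured_utilization):
        error = self.target - measured_utilization
        candidate = self.integral + error
        output = self.kp * error + self.ki * candidate
        if 0.0 <= output <= 1.0:
            # Anti-windup: only accumulate the error while not saturated.
            self.integral = candidate
        return min(1.0, max(0.0, output))


def simulate(steps=100, offered_load=2.0):
    """Toy plant: utilization = min(1, offered_load * admission probability)."""
    controller = PIAdmissionController(kp=0.1, ki=0.2, target_utilization=0.8)
    admit_prob = 1.0  # start by admitting every request
    utilization = 0.0
    for _ in range(steps):
        utilization = min(1.0, offered_load * admit_prob)
        admit_prob = controller.update(utilization)
    return admit_prob, utilization


if __name__ == "__main__":
    print(simulate())
```

In this toy setting the offered load is twice the server capacity, so the controller settles at admitting roughly 40% of requests to hold utilization near the 0.8 target.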

Lund University Publications

Towards Autonomic Service Provisioning Systems

Author: Mazzucco Michele
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

This paper discusses our experience in building SPIRE, an autonomic system for service provision. The architecture consists of a set of hosted Web Services subject to QoS constraints, and a certain number of servers used to run session-based traffic. Customers pay for having their jobs run, but require in turn certain quality guarantees: there are different SLAs specifying charges for running jobs and penalties for failing to meet promised performance metrics. The system is driven by a utility function, aiming at optimizing the average earned revenue per unit time. Demand and performance statistics are collected, while traffic parameters are estimated in order to make dynamic decisions concerning server allocation and admission control. Different utility functions are introduced and a number of experiments aiming at testing their performance are discussed. Results show that revenues can be dramatically improved by imposing suitable conditions for accepting incoming traffic; the proposed system performs well under different traffic settings, and it successfully adapts to changes in the operating environment.
Comment: 11 pages, 9 figures, http://www.wipo.int/pctdb/en/wo.jsp?WO=201002636

arXiv.org e-Print Archive

Crossref

AWAIT: Efficient Overload Management for Busy Multi-tier Web Services under Bursty Workloads

Author: Cherkasova Ludmila
Lu Lei
Mi Ningfang
Persone Vittoria de Nitto
Smirni Evgenia
Publication venue: W&M ScholarWorks
Publication date: 01/01/2010

The problem of service differentiation and admission control in web services that utilize a multi-tier architecture is more challenging than in a single-tiered one, especially in the presence of bursty conditions, i.e., when arrivals of user web sessions to the system are characterized by temporal surges in their arrival intensities and demands. We demonstrate that classic techniques for session-based admission control that are triggered by threshold violations are ineffective under bursty workload conditions, as user-perceived performance metrics rapidly and dramatically deteriorate, inadvertently leading the system to reject requests from already accepted user sessions, resulting in business loss. Here, as a solution for service differentiation of accepted user sessions, we promote a methodology that is based on blocking, i.e., when the system operates in overload, requests from accepted sessions are not rejected but are instead stored in a blocking queue that effectively acts as a waiting room. The requests in the blocking queue implicitly become of higher priority and are served immediately after load subsides. Residence in the blocking queue comes with a performance cost, as blocking time adds to the perceived end-to-end user response time. We present a novel autonomic session-based admission control policy, called AWAIT, that adaptively adjusts the capacity of the blocking queue as a function of workload burstiness in order to meet predefined user service level objectives while keeping the portion of aborted accepted sessions to a minimum. Detailed simulations illustrate the effectiveness of AWAIT under different workload burstiness profiles.
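The blocking idea described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the AWAIT policy itself: the class and method names are hypothetical, the overload signal is passed in by the caller, aborting a session when the queue is full is an assumption, and the rule that adapts the queue capacity to workload burstiness is not reproduced because the abstract does not specify it.

```python
from collections import deque


class BlockingAdmission:
    """Toy session-based admission control with a blocking queue."""

    def __init__(self, queue_capacity):
        self.capacity = queue_capacity  # would be adapted to burstiness in AWAIT
        self.accepted_sessions = set()
        self.blocked = deque()          # waiting room for held requests

    def on_request(self, session_id, overloaded):
        if not overloaded:
            # Normal operation: admit the session and serve the request.
            self.accepted_sessions.add(session_id)
            return "serve"
        if session_id not in self.accepted_sessions:
            # New sessions are refused while the system is overloaded.
            return "reject"
        if len(self.blocked) < self.capacity:
            # Requests of accepted sessions are held instead of rejected.
            self.blocked.append(session_id)
            return "blocked"
        # Assumption: when the waiting room is full, the session is aborted.
        self.accepted_sessions.discard(session_id)
        return "abort"

    def drain(self):
        """Release held requests first once the load subsides."""
        released = list(self.blocked)
        self.blocked.clear()
        return released
```

A caller would invoke `on_request` for each arriving request with its own overload indicator, and call `drain` when the overload ends so that held requests are served before new ones.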

Crossref

ART

College of William & Mary: W&M Publish

Effective Resource and Workload Management in Data Centers

Author: Lu Lei
Publication venue: W&M ScholarWorks
Publication date: 01/01/2014

The increasing demand for storage, computation, and business continuity has driven the growth of data centers. Managing data centers efficiently is a difficult task because of the wide variety of data center applications, their ever-changing intensities, and the fact that application performance targets may differ widely. Server virtualization has been a game-changing technology for IT, providing the possibility to support multiple virtual machines (VMs) simultaneously. This dissertation focuses on how virtualization technologies can be utilized to develop new tools for maintaining high resource utilization, for achieving high application performance, and for reducing the cost of data center management. For multi-tiered applications, bursty workload traffic can significantly deteriorate performance. This dissertation proposes an admission control algorithm, AWAIT, for handling overload conditions in multi-tier web services. AWAIT places requests of accepted sessions on hold and refuses to admit new sessions when the system is in a sudden workload surge. To meet the service-level objective, AWAIT serves the requests in the blocking queue with high priority. The size of the queue is dynamically determined according to the workload burstiness. Many admission control policies are triggered by instantaneous measurements of system resource usage, e.g., CPU utilization. This dissertation first demonstrates that directly measuring virtual machine resource utilizations with standard tools cannot always lead to accurate estimates. A directed factor graph (DFG) model is defined to model the dependencies among multiple types of resources across physical and virtual layers. Virtualized data centers always enable sharing of resources among hosted applications for achieving high resource utilization. However, it is difficult to satisfy application SLOs on a shared infrastructure, as application workload patterns change over time. AppRM, an automated management system, not only allocates the right amount of resources to applications to meet their performance targets but also adjusts to dynamic workloads using an adaptive model. Server consolidation is one of the key applications of server virtualization. This dissertation proposes a VM consolidation mechanism, first by extending the fair load balancing scheme for multi-dimensional vector scheduling, and then by using a queueing network model to capture the service contentions for a particular virtual machine placement.

College of William & Mary: W&M Publish

Resource allocation and disturbance rejection in web servers using SLAs and virtualized servers

Author: Kihl Maria
Kjaer Martin Ansbjerg
Robertsson Anders
Publication venue: IEEE - Institute of Electrical and Electronics Engineers Inc.
Publication date: 01/01/2009

Resource management in IT enterprises is gaining more and more attention due to high operation costs. For instance, web sites are subject to highly variable traffic loads over the year, over the day, or even over the minute. Online adaptation to the changing environment is one way to reduce losses in the operation. Control systems based on feedback provide methods for such adaptation, but are by nature slow, since changes in the environment have to propagate through the system before being compensated. Therefore, feed-forward systems can be introduced, which have been shown to improve the transient performance. However, earlier proposed feed-forward systems have been based on offline estimation. In this article we show that offline estimations can be problematic in online applications. Therefore, we propose a method where parameters are estimated online, and thus also adapt to the changing environment. We compare our solution to two other control strategies proposed in the literature, which are based on offline estimation of certain parameters. We evaluate the controllers with both discrete-event simulations and experiments in our testbed. The investigations show the strength of our proposed control system.

Lund University Publications

Revenue maximization problems in commercial data centers

Author: Mazzucco Michele
Publication venue: Newcastle University
Publication date: 01/01/2009

PhD Thesis. As IT systems are becoming more important every day, one of the main concerns is that users may face major problems and eventually incur major costs if computing systems do not meet the expected performance requirements: customers expect reliability and performance guarantees, while underperforming systems lose revenue. Even with the adoption of data centers as the hub of IT organizations and provider of business efficiencies, the problems are not over, because it is extremely difficult for service providers to meet the promised performance guarantees in the face of unpredictable demand. One possible approach is the adoption of Service Level Agreements (SLAs), contracts that specify a level of performance that must be met and compensations in case of failure. In this thesis I will address some of the performance problems arising when IT companies sell the service of running 'jobs' subject to Quality of Service (QoS) constraints. In particular, the aim is to improve the efficiency of service provisioning systems by allowing them to adapt to changing demand conditions. First, I will define the problem in terms of a utility function to maximize. Two different models are analyzed, one for single jobs and the other useful to deal with session-based traffic. Then, I will introduce an autonomic model for service provision. The architecture consists of a set of hosted applications that share a certain number of servers. The system collects demand and performance statistics and estimates traffic parameters. These estimates are used by management policies which implement dynamic resource allocation and admission algorithms. Results from a number of experiments show that the performance of these heuristics is close to optimal. QoSP (Quality of Service Provisioning), British Teleco

Newcastle University eTheses

ASIdE: Using Autocorrelation-Based Size Estimation for Scheduling Bursty Workloads.

Author: Casale G
Mi N
Smirni E
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 08/05/2012

Temporal dependence in workloads creates peak congestion that can make service unavailable and reduce system performance. To improve system performability under conditions of temporal dependence, a server should quickly process bursts of requests that may need large service demands. In this paper, we propose and evaluate ASIdE, an Autocorrelation-based SIze Estimation, that selectively delays requests which contribute to the workload temporal dependence. ASIdE implicitly approximates the shortest job first (SJF) scheduling policy but without any prior knowledge of job service times. Extensive experiments show that (1) ASIdE achieves good service time estimates from the temporal dependence structure of the workload to implicitly approximate the behavior of SJF; and (2) ASIdE successfully counteracts peak congestion in the workload and improves system performability under a wide variety of settings. Specifically, we show that system capacity under ASIdE is largely increased compared to the first-come first-served (FCFS) scheduling policy and is highly competitive with SJF. © 2012 IEEE

Spiral - Imperial College Digital Repository

Disturbance Rejection and Control in Web Servers

Author: Kjaer Martin Ansbjerg
Publication venue: Department of Automatic Control, Lund Institute of Technology, Lund University
Publication date: 01/01/2009

An important factor for a user of web sites on the Internet is the time between requesting a web page and receiving an answer. If this response time is too long, the user is likely to abandon the web site and search for other providers of the service. To avoid this loss of users, it is important for the web site operator to ensure that users are served sufficiently fast. On the other hand, it is also important to minimize the effort in order to optimize profit. As these objectives are often contradictory, an acceptable target response time can be formulated. The resources are allocated in a manner that ensures that long response times do not occur while, at the same time, using as few resources as possible so as not to overprovision. The work presented in this doctoral thesis takes a control-theoretic perspective to solve this problem. The resources are considered as the control input, and the response time as the main output. Several disturbances affect the system, such as the arrival rate of requests to the web site. A testbed was designed to allow repeatable experiments with different controller implementations. A server was instrumented with sensors and actuators to handle requests from 12 client computers capable of generating changing workloads. On the theoretical side, a model of a web server is presented in this thesis. It explicitly models a specific sensor implementation where buffering occurs in the computer prior to the sensor. As a result, the measurement of the arrival rate becomes state dependent under high load. This property turns out to have some undesirable effects on the controlled system. The model was capable of predicting the behavior of the testbed quite well. Based on the presented model, analysis shows that feed-forward controllers suggested in the literature can lead to instability under certain circumstances at high load. This has not been reported earlier, but is demonstrated in this doctoral thesis by both simulations and experiments. The analysis explains why and when the instability arises. In the attempt to predict future response times, this thesis also presents a feedback-based prediction scheme. Comparisons between earlier predictions and the real response times are used to correct a model-based response time prediction. The prediction scheme is applied to a controller to compensate for disturbances before the effect propagates to the response time. The method improves the transient response in the case of sudden changes in the arrival rate of requests. This doctoral thesis also presents work on a control solution for reserving CPU capacity for a given process or a given group of processes on a computer system. The method uses only existing operating-system infrastructure, and achieves the desired CPU capacity in a soft real-time manner.

Lund University Publications


The Design and Implementation of Low-Latency Prediction Serving Systems

Author: Crankshaw Daniel
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Machine learning is being deployed in a growing number of applications which demand real-time, accurate, and cost-efficient predictions under heavy query load. These applications employ a variety of machine learning frameworks and models, often composing several models within the same application. However, most machine learning frameworks and systems are optimized for model training and not deployment. In this thesis, I discuss three prediction serving systems designed to meet the needs of modern interactive machine learning applications. The key idea in this work is to utilize a decoupled, layered design that interposes systems on top of training frameworks to build low-latency, scalable serving systems. Velox introduced this decoupled architecture to enable fast online learning and model personalization in response to feedback. Clipper generalized this system architecture to be framework-agnostic and introduced a set of optimizations to reduce and bound prediction latency and improve prediction throughput, accuracy, and robustness without modifying the underlying machine learning frameworks. And InferLine provisions and manages the individual stages of prediction pipelines to minimize cost while meeting end-to-end tail latency constraints.

eScholarship - University of California