6 research outputs found
Selective early request termination for busy internet services
Internet traffic is bursty and network servers are often overloaded by surprising events or abnormal client request patterns. This paper studies a load shedding mechanism called selective early request termination (SERT) for network services that use threads to handle multiple incoming requests continuously and concurrently. Our investigation with applications from Ask.com shows that during overloaded situations, a relatively small percentage of long requests that require excessive computing resources can dramatically affect other short requests and reduce overall system throughput. By actively detecting and aborting overdue long requests, services can meet QoS objectives significantly better than with a purely admission-based approach. We propose a termination scheme that monitors the running time of requests, accounts for their resource usage, adaptively adjusts the selection threshold, and performs a safe termination for a class of requests. This paper presents the design and implementation of this scheme and describes experimental results that validate the proposed approach.
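The core SERT idea can be sketched in a few lines. This is a minimal illustrative model, not Ask.com's implementation: the class name, the adaptation rule, and the thresholds are all assumptions for the sketch.

```python
class RequestMonitor:
    """Minimal sketch of selective early request termination (SERT):
    track running requests and, under load, select for termination those
    whose running time exceeds an adaptively tightened threshold."""

    def __init__(self, base_threshold=1.0):
        self.threshold = base_threshold   # seconds before a request is "overdue"
        self.running = {}                 # request id -> start timestamp

    def start(self, req_id, now):
        self.running[req_id] = now

    def finish(self, req_id):
        self.running.pop(req_id, None)

    def adapt(self, load):
        # Illustrative rule: tighten the threshold under heavy load,
        # relax it when the system is idle.
        self.threshold = max(0.1, 1.0 / max(load, 1e-6))

    def overdue(self, now):
        return [r for r, t0 in self.running.items() if now - t0 > self.threshold]

monitor = RequestMonitor()
monitor.start("short", now=0.0)
monitor.start("long", now=0.0)
monitor.finish("short")          # the short request completes quickly
monitor.adapt(load=4.0)          # overload: threshold drops to 0.25 s
to_abort = monitor.overdue(now=0.5)
print(to_abort)                  # -> ['long']
```

A real service would additionally account for per-request resource usage and restrict termination to requests that are safe to abort, as the paper describes.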
An adaptive admission control and load balancing algorithm for a QoS-aware Web system
The main objective of this thesis is the design of an adaptive algorithm for admission control and content-aware load balancing for Web traffic. To set the context of this work, several reviews are included to introduce the reader to the background concepts of Web load balancing, admission control, and the Internet traffic characteristics that may affect the performance of a Web site. The admission control and load balancing algorithm described in this thesis manages the distribution of traffic to a Web cluster based on QoS requirements. The goal of the proposed scheduling algorithm is to avoid situations in which the system provides lower performance than desired due to server congestion. This is achieved through forecasting calculations. Naturally, the increased computational cost of the algorithm introduces some overhead. For this reason we design an adaptive time-slot scheduling that sets the execution times of the algorithm depending on the burstiness of the traffic arriving at the system. The proposed predictive scheduling algorithm therefore includes an adaptive overhead control. Once the scheduling of the algorithm is defined, we design the admission control module based on throughput predictions. The results obtained by several throughput predictors are compared and one of them is selected for inclusion in our algorithm. The utilisation level that the Web servers will have in the near future is also forecast and reserved for each service depending on the Service Level Agreement (SLA). Our load balancing strategy is based on a classical policy; hence, a comparison of several classical load balancing policies is also included to determine which of them best fits our algorithm. A simulation model has been designed to obtain the results presented in this thesis.
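Prediction-based admission control of this kind can be sketched as follows. This is an illustrative simplification, not the thesis's exact algorithm: the predictor (an exponential moving average), the SLA utilisation ceiling, and the per-session cost are all assumed values.

```python
class AdmissionController:
    """Sketch of prediction-based admission control: admit a new session
    only if the predicted server utilisation, plus the session's cost,
    stays within the level reserved for the service's SLA."""

    def __init__(self, sla_utilisation=0.8, alpha=0.3):
        self.sla = sla_utilisation   # utilisation ceiling reserved per the SLA
        self.alpha = alpha           # smoothing factor for the predictor
        self.predicted = 0.0         # predicted utilisation for the next slot

    def observe(self, utilisation):
        # Exponential moving average over recent utilisation samples.
        self.predicted = self.alpha * utilisation + (1 - self.alpha) * self.predicted

    def admit(self, session_cost):
        return self.predicted + session_cost <= self.sla

ac = AdmissionController()
for u in [0.5, 0.6, 0.7]:        # warm up the predictor with observed load
    ac.observe(u)
print(ac.admit(0.05))            # cheap session: admitted
print(ac.admit(0.5))             # expensive session: rejected
```

An adaptive time-slot scheme, as described above, would additionally vary how often `observe` runs depending on how bursty the arriving traffic is.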
Predictive dynamic resource allocation for web hosting environments
E-Business applications are subject to significant variations in workload and this can
cause exceptionally long response times for users, the timing out of client requests
and/or the dropping of connections. One solution is to host these applications in virtualised
server pools, and to dynamically reassign compute servers between pools to
meet the demands on the hosted applications. Switching servers between pools is not
without cost, and this must therefore be weighed against possible system gain.
This work is concerned with dynamic resource allocation for multi-tiered, cluster-based
web hosting environments. Dynamic resource allocation is reactive, that is, when
overloading occurs in one resource pool, servers are moved from another (quieter) pool
to meet this demand. Switching servers comes with some overhead, so it is important
to weigh up the costs of the switch against possible system gains. In this thesis we
combine the reactive behaviour of two server switching policies – the Proportional
Switching Policy (PSP) and the Bottleneck Aware Switching Policy (BSP) – with the
proactive properties of several workload forecasting models.
We evaluate the behaviour of the two switching policies and compare them against
static resource allocation under a range of reallocation intervals (the time it takes to
switch a server from one resource pool to another) and observe that larger reallocation
intervals have a negative impact on revenue. We also construct model- and simulation-based environments in which the combination of workload prediction and dynamic
server switching can be explored. Several different (but common) predictors – Last
Observation (LO), Simple Average (SA), Sample Moving Average (SMA), Exponential
Moving Average (EMA), Low Pass Filter (LPF), and AutoRegressive Integrated
Moving Average (ARIMA) – have been applied alongside the switching policies.
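The simpler of these one-step forecasters can be written down directly. The window size and smoothing factor below are illustrative choices, not values taken from the thesis:

```python
def last_observation(history):
    # LO: the next value is predicted to equal the most recent one.
    return history[-1]

def simple_average(history):
    # SA: average over the entire observed history.
    return sum(history) / len(history)

def sample_moving_average(history, window=3):
    # SMA: average over only the most recent `window` observations.
    recent = history[-window:]
    return sum(recent) / len(recent)

def exponential_moving_average(history, alpha=0.5):
    # EMA: recent observations weigh exponentially more than old ones.
    forecast = history[0]
    for x in history[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast

load = [10, 12, 11, 15]                   # observed load per interval
print(last_observation(load))             # -> 15
print(simple_average(load))               # -> 12.0
print(exponential_moving_average(load))   # -> 13.0
```

Each scheme trades responsiveness against noise sensitivity, which is the bias the meta-forecasting algorithms below try to compensate for.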
As each of the forecasting schemes has its own bias, we also develop a number of
meta-forecasting algorithms – the Active Window Model (AWM), the Voting Model
(VM), the Selective Model (SM), the Dynamic Active Window Model (DAWM), and
a method based on Workload Pattern Analysis (WPA). The schemes are tested with
real-world workload traces from several sources to ensure consistent and improved results.
We also investigate the effectiveness of these schemes on workloads containing
extreme events (e.g. flash crowds). The results show that workload forecasting can be
very effective when applied alongside dynamic resource allocation strategies.
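One plausible reading of a "selective" meta-forecaster is to score each base predictor by its error on the most recent observation and delegate to the best. This is an illustrative sketch under that assumption, not the thesis's exact Selective Model:

```python
def selective_forecast(history, predictors):
    """Score each base predictor by its absolute error when predicting the
    most recent observation from the values before it, then use the
    best-scoring predictor to forecast the next value."""
    past, actual = history[:-1], history[-1]
    errors = {name: abs(p(past) - actual) for name, p in predictors.items()}
    best = min(errors, key=errors.get)
    return best, predictors[best](history)

# Two simple base predictors (Last Observation and Simple Average).
predictors = {
    "LO": lambda h: h[-1],
    "SA": lambda h: sum(h) / len(h),
}
load = [10, 12, 14, 16]          # a steadily rising load
name, forecast = selective_forecast(load, predictors)
print(name, forecast)            # LO tracks the trend better here
```

On a rising trend Last Observation outperforms the Simple Average, so the meta-forecaster delegates to it; on noisy stationary load the choice would flip.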
Revenue maximization problems in commercial data centers
As IT systems become more important every day, one of the main concerns is that users may
face major problems, and eventually incur major costs, if computing systems do not meet the expected
performance requirements: customers expect reliability and performance guarantees, while
underperforming systems lose revenue. Even with the adoption of data centers as the hub of
IT organizations and providers of business efficiencies, the problems are not over, because it is extremely
difficult for service providers to meet the promised performance guarantees in the face of
unpredictable demand. One possible approach is the adoption of Service Level Agreements (SLAs),
contracts that specify a level of performance that must be met and compensations in case of failure.
In this thesis I will address some of the performance problems arising when IT companies sell
the service of running ‘jobs’ subject to Quality of Service (QoS) constraints. In particular, the aim
is to improve the efficiency of service provisioning systems by allowing them to adapt to changing
demand conditions.
First, I will define the problem in terms of a utility function to maximize. Two different models
are analyzed: one for single jobs and another suited to session-based traffic. Then,
I will introduce an autonomic model for service provision. The architecture consists of a set of
hosted applications that share a certain number of servers. The system collects demand and performance
statistics and estimates traffic parameters. These estimates are used by management policies
which implement dynamic resource allocation and admission algorithms. Results from a number of
experiments show that the performance of these heuristics is close to optimal.
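The utility-maximization idea can be illustrated with a toy model, which is my own simplification rather than the thesis's formulation: each admitted job earns fixed revenue, an SLA penalty applies when the mean response time exceeds the agreed bound (approximated here by the M/M/1 formula W = 1/(μ − λ)), and the provider picks the admitted arrival rate that maximizes net revenue.

```python
def net_revenue(lam, mu, revenue=1.0, penalty=5.0, sla_w=0.5):
    """Toy utility: revenue per admitted job, minus an SLA penalty when
    the M/M/1 mean response time W = 1/(mu - lam) exceeds the bound."""
    if lam >= mu:
        return float("-inf")      # unstable queue: never worth it
    w = 1.0 / (mu - lam)
    utility = lam * revenue
    if w > sla_w:
        utility -= penalty        # SLA violated: pay the compensation
    return utility

mu = 10.0                                  # service rate (jobs/s), assumed
rates = [i * 0.5 for i in range(1, 20)]    # candidate admitted arrival rates
best = max(rates, key=lambda lam: net_revenue(lam, mu))
print(best)                                # -> 8.0
```

The maximizer sits exactly where admitting more traffic would push W past the SLA bound, which is the trade-off the admission policies in the thesis navigate under changing demand.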