6 research outputs found

    Selective early request termination for busy internet services

    No full text
    Internet traffic is bursty, and network servers are often overloaded by unexpected events or abnormal client request patterns. This paper studies a load shedding mechanism called selective early request termination (SERT) for network services that use threads to handle multiple incoming requests continuously and concurrently. Our investigation with applications from Ask.com shows that, during overload, a relatively small percentage of long requests that require excessive computing resources can dramatically affect other short requests and reduce overall system throughput. By actively detecting and aborting overdue long requests, services can perform significantly better in meeting QoS objectives than with a purely admission-based approach. We propose a termination scheme that monitors the running time of requests, accounts for their resource usage, adaptively adjusts the selection threshold, and performs a safe termination for a class of requests. This paper presents the design and implementation of this scheme and describes experimental results that validate the proposed approach.
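The abstract above describes monitoring request running time, adaptively adjusting a selection threshold, and terminating only requests that can be aborted safely. A minimal sketch of that idea follows. It is not the paper's implementation: the `Request` fields, the multiplicative adjustment rule, and every parameter value (`step`, `floor`, `ceiling`, `target_load`) are assumptions chosen for illustration, and resource-usage accounting is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical view of an in-flight request (not from the paper)."""
    request_id: int
    running_time: float   # seconds the request has been executing so far
    safe_to_abort: bool   # True if the request belongs to a safely terminable class


def adjust_threshold(threshold: float, load: float, target_load: float,
                     step: float = 0.1, floor: float = 0.5,
                     ceiling: float = 30.0) -> float:
    """Assumed adaptation rule: tighten the threshold under overload,
    relax it when there is headroom, and clamp it to [floor, ceiling]."""
    if load > target_load:
        threshold *= 1.0 - step   # overloaded: terminate long requests sooner
    else:
        threshold *= 1.0 + step   # headroom: let long requests run longer
    return max(floor, min(ceiling, threshold))


def select_for_termination(requests: List[Request], threshold: float) -> List[int]:
    """Return ids of overdue requests that may be aborted safely."""
    return [r.request_id for r in requests
            if r.safe_to_abort and r.running_time > threshold]


if __name__ == "__main__":
    in_flight = [Request(1, 0.2, True), Request(2, 5.0, True), Request(3, 9.0, False)]
    threshold = adjust_threshold(4.0, load=0.95, target_load=0.8)
    print(select_for_termination(in_flight, threshold))
```

In this sketch the long request that is not marked safe to abort is never selected, which mirrors the abstract's restriction of termination to "a class of requests".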

    An adaptive admission control and load balancing algorithm for a QoS-aware Web system

    The main objective of this thesis is the design of an adaptive algorithm for admission control and content-aware load balancing of Web traffic. To set the context of this work, several reviews introduce the reader to the background concepts of Web load balancing, admission control, and the Internet traffic characteristics that may affect the performance of a Web site. The admission control and load balancing algorithm described in this thesis manages the distribution of traffic to a Web cluster based on QoS requirements. The goal of the proposed scheduling algorithm is to avoid situations in which the system performs worse than desired because of server congestion. This is achieved through forecasting calculations. The added computational cost of the algorithm results in some overhead, which is the reason for designing an adaptive time slot scheduling that sets the execution times of the algorithm according to the burstiness of the traffic arriving at the system. The proposed predictive scheduling algorithm therefore includes adaptive overhead control. Once the scheduling of the algorithm is defined, we design the admission control module based on throughput predictions. The results obtained by several throughput predictors are compared, and one of them is selected for inclusion in our algorithm. The utilisation level that the Web servers will have in the near future is also forecast and reserved for each service according to the Service Level Agreement (SLA). Our load balancing strategy is based on a classical policy, so a comparison of several classical load balancing policies is also included to determine which of them best fits our algorithm. A simulation model has been designed to obtain the results presented in this thesis.

    Predictive dynamic resource allocation for web hosting environments

    E-Business applications are subject to significant variations in workload, which can cause exceptionally long response times for users, the timing out of client requests and/or the dropping of connections. One solution is to host these applications in virtualised server pools, and to dynamically reassign compute servers between pools to meet the demands on the hosted applications. Switching servers between pools is not without cost, and this must therefore be weighed against possible system gain. This work is concerned with dynamic resource allocation for multi-tiered, cluster-based web hosting environments. Dynamic resource allocation is reactive, that is, when overloading occurs in one resource pool, servers are moved from another (quieter) pool to meet this demand. Switching servers comes with some overhead, so it is important to weigh up the costs of the switch against possible system gains. In this thesis we combine the reactive behaviour of two server switching policies, the Proportional Switching Policy (PSP) and the Bottleneck Aware Switching Policy (BSP), with the proactive properties of several workload forecasting models. We evaluate the behaviour of the two switching policies and compare them against static resource allocation under a range of reallocation intervals (the time it takes to switch a server from one resource pool to another) and observe that larger reallocation intervals have a negative impact on revenue. We also construct model- and simulation-based environments in which the combination of workload prediction and dynamic server switching can be explored. Several different (but common) predictors have been applied alongside the switching policies: Last Observation (LO), Simple Average (SA), Sample Moving Average (SMA), Exponential Moving Average (EMA), Low Pass Filter (LPF), and AutoRegressive Integrated Moving Average (ARIMA). As each of the forecasting schemes has its own bias, we also develop a number of meta-forecasting algorithms: the Active Window Model (AWM), the Voting Model (VM), the Selective Model (SM), the Dynamic Active Window Model (DAWM), and a method based on Workload Pattern Analysis (WPA). The schemes are tested with real-world workload traces from several sources to ensure consistent and improved results. We also investigate the effectiveness of these schemes on workloads containing extreme events (e.g. flash crowds). The results show that workload forecasting can be very effective when applied alongside dynamic resource allocation strategies.
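Four of the simpler predictors named in the abstract above (LO, SA, SMA and EMA) can be sketched in a few lines. This is an illustration using the usual textbook definitions, which may differ from the thesis. In particular, treating the Sample Moving Average as the mean of the most recent `window` observations is an assumption, as are the default `window`, the smoothing factor `alpha`, and the example workload values.

```python
from typing import Sequence


def last_observation(history: Sequence[float]) -> float:
    """LO: predict that the next value equals the most recent one."""
    return history[-1]


def simple_average(history: Sequence[float]) -> float:
    """SA: predict the mean of all observations seen so far."""
    return sum(history) / len(history)


def sample_moving_average(history: Sequence[float], window: int = 3) -> float:
    """SMA (assumed definition): mean of the most recent `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exponential_moving_average(history: Sequence[float], alpha: float = 0.5) -> float:
    """EMA: recursively weight newer observations by `alpha`."""
    estimate = history[0]
    for value in history[1:]:
        estimate = alpha * value + (1.0 - alpha) * estimate
    return estimate


if __name__ == "__main__":
    # Hypothetical requests-per-interval trace, for illustration only.
    workload = [100.0, 120.0, 80.0, 140.0]
    for predictor in (last_observation, simple_average,
                      sample_moving_average, exponential_moving_average):
        print(predictor.__name__, predictor(workload))
```

Each predictor weights past observations differently, so each reacts differently to a burst. That difference in bias is what the abstract cites as the motivation for its meta-forecasting schemes.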

    Revenue maximization problems in commercial data centers

    PhD Thesis. As IT systems become more important every day, one of the main concerns is that users may face major problems and eventually incur major costs if computing systems do not meet the expected performance requirements: customers expect reliability and performance guarantees, while underperforming systems lose revenue. Even with the adoption of data centers as the hub of IT organizations and provider of business efficiencies, the problems are not over, because it is extremely difficult for service providers to meet the promised performance guarantees in the face of unpredictable demand. One possible approach is the adoption of Service Level Agreements (SLAs), contracts that specify a level of performance that must be met and compensation in case of failure. In this thesis I will address some of the performance problems arising when IT companies sell the service of running ‘jobs’ subject to Quality of Service (QoS) constraints. In particular, the aim is to improve the efficiency of service provisioning systems by allowing them to adapt to changing demand conditions. First, I will define the problem in terms of a utility function to maximize. Two different models are analyzed, one for single jobs and the other for session-based traffic. Then, I will introduce an autonomic model for service provision. The architecture consists of a set of hosted applications that share a certain number of servers. The system collects demand and performance statistics and estimates traffic parameters. These estimates are used by management policies which implement dynamic resource allocation and admission algorithms. Results from a number of experiments show that the performance of these heuristics is close to optimal. QoSP (Quality of Service Provisioning), British Telecom.

    Revenue maximization problems in commercial data centers

    As IT systems become more important every day, one of the main concerns is that users may face major problems and eventually incur major costs if computing systems do not meet the expected performance requirements: customers expect reliability and performance guarantees, while underperforming systems lose revenue. Even with the adoption of data centers as the hub of IT organizations and provider of business efficiencies, the problems are not over, because it is extremely difficult for service providers to meet the promised performance guarantees in the face of unpredictable demand. One possible approach is the adoption of Service Level Agreements (SLAs), contracts that specify a level of performance that must be met and compensation in case of failure. In this thesis I will address some of the performance problems arising when IT companies sell the service of running ‘jobs’ subject to Quality of Service (QoS) constraints. In particular, the aim is to improve the efficiency of service provisioning systems by allowing them to adapt to changing demand conditions. First, I will define the problem in terms of a utility function to maximize. Two different models are analyzed, one for single jobs and the other for session-based traffic. Then, I will introduce an autonomic model for service provision. The architecture consists of a set of hosted applications that share a certain number of servers. The system collects demand and performance statistics and estimates traffic parameters. These estimates are used by management policies which implement dynamic resource allocation and admission algorithms. Results from a number of experiments show that the performance of these heuristics is close to optimal. EThOS - Electronic Theses Online Service. QoSP (Quality of Service Provisioning): British Telecom. GB, United Kingdom.