Large deviations analysis of scheduling policies for a web server
With increasing demand and availability of bandwidth resources, there has been tremendous
growth in the scale and speed of web servers. In web servers, scheduling plays an important
role in resource allocation (for instance, bandwidth allocation, processor allocation,
etc). However, as the scale of a system increases so does the number of activities/events
in the system (e.g., job arrivals), as a consequence of which the analysis of scheduling
becomes increasingly harder. In particular, the possible ways in which scheduling failure
(e.g., queue overflow, excessively large delay, instability of a system) can occur becomes
increasingly greater, thus making it more difficult to understand the behavior and develop
design rules for scheduling algorithms. However, a well-known observation from large deviations theory, namely that large-scale systems fail in a "most likely way", can potentially be used
to simplify the design and analysis of scheduling. In this thesis, we study the implications
and applications of this effect on scheduling in a web server accessed by a large number of
sources.
We analyze the delay distribution of scheduling policies for web servers under a
many sources large deviation regime which models web servers in a large scale system
well. Due to the difficulties brought on by considering a large number of sources, only a small number of scheduling policies, such as First-Come-First-Serve (FCFS), Generalized-Processor-Sharing (GPS), and Priority Queueing, have been analyzed under the many sources regime. In particular, in a single-queue, single-server setup, the delay characteristics of only FCFS, Shortest-Job-First (SJF), and Longest-Job-First (LJF) have been analyzed.
In this thesis, we study the Two-Dimensional-Queueing (2DQ) framework, a unifying
queueing framework that allows the identification of the “most likely way” in which
delay occurs, to analyze the delay of various unexplored scheduling policies. In conjunction
with the 2DQ framework, we develop a new “cycle based” technique for understanding the
large deviations tail probability of more complex policies.
Using the combination of the 2DQ framework and the cycle based analysis, we
first analyze two interesting scheduling policies: the Shortest-Remaining-Processing-Time (SRPT) policy (which is mean-delay optimal) and the Processor-Sharing (PS) policy (which is a
“fair” policy). We derive the asymptotic delay distributions (rate functions) of both policies
and study their behavior across job sizes. Next, we address three problems in implementing
the aforementioned scheduling policies: (i) end receivers may have bandwidth constraints that are not taken into account in SRPT, (ii) the remaining processing time information might not be available to the web server, and (iii) most actual implementations are variants of SRPT that reflect other implementation constraints and/or jointly optimize other metrics in addition to delay, e.g., jitter and fairness. To address these, we first develop finite-SRPT
that takes into account the bandwidth constraint at the end receiver, and show that the policy
shifts between SRPT and a PS-like policy depending on the bandwidth constraint. Second,
we study the Least-Attained-Service (LAS) policy, which is viewed as a good substitute for SRPT when the remaining job size is not available, and analyze the penalty associated with not using the remaining-size information directly. Lastly, we analyze a class of
scheduling policies known as SMART that contains many variants of SRPT with different
fairness properties and show that all policies in the class have the same tail probability of
delay across job sizes under the many sources regime. The results of this thesis facilitate the understanding of various scheduling policies under the many sources regime and provide an analytical queueing framework that can be used to understand other scheduling policies.
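The SRPT rule discussed above, always serve the job with the smallest remaining processing time, can be made concrete with a small single-server simulation. The sketch below assumes unit-rate service and preemption only at arrival and departure instants; the function name and interface are illustrative, not taken from the thesis:

```python
import heapq

def srpt_schedule(jobs):
    """Simulate SRPT on a single unit-rate server.

    jobs: list of (arrival_time, size) tuples.
    Returns {job_index: completion_time}. At every arrival or
    departure instant, the job with the smallest remaining
    processing time is (re)selected for service.
    """
    events = sorted((a, i, s) for i, (a, s) in enumerate(jobs))
    heap = []          # (remaining_size, job_index), min-heap on remaining size
    done = {}
    t = 0.0
    k = 0              # index of the next arrival in `events`
    while k < len(events) or heap:
        if heap:
            rem, jid = heapq.heappop(heap)
            next_arrival = events[k][0] if k < len(events) else float("inf")
            if t + rem <= next_arrival:
                # job finishes before the next arrival can preempt it
                t += rem
                done[jid] = t
            else:
                # serve until the arrival, then re-enqueue the remainder
                rem -= next_arrival - t
                t = next_arrival
                heapq.heappush(heap, (rem, jid))
        else:
            # idle until the next arrival
            t = max(t, events[k][0])
        # admit every job that has arrived by the current time
        while k < len(events) and events[k][0] <= t:
            _, jid, size = events[k]
            heapq.heappush(heap, (size, jid))
            k += 1
    return done
```

For example, with a size-3 job arriving at time 0 and a size-1 job at time 1, the short job preempts the long one and finishes at time 2, while the long job finishes at time 4.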
Admission Control and Scheduling for High-Performance WWW Servers
In this paper we examine a number of admission control and scheduling protocols for high-performance web servers based on a 2-phase policy for serving HTTP requests. The first "registration" phase involves establishing the TCP connection for the HTTP request and parsing/interpreting its arguments, whereas the second "service" phase involves the service/transmission of data in response to the HTTP request. By introducing a delay between these two phases, we show that the performance of a web server can potentially be improved through the adoption of a number of scheduling policies that optimize the utilization of various system components (e.g., memory cache and I/O). In addition to its promise for improving the performance of a single web server, the delineation between the registration and service phases of an HTTP request may be useful for load-balancing purposes on clusters of web servers. We are investigating the use of such a mechanism as part of the Commonwealth testbed being developed at Boston University.
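The 2-phase idea can be sketched in a few lines. The batching and cache-friendly reordering below are a hypothetical stand-in for the paper's actual policies, and all names are illustrative:

```python
from collections import deque

def two_phase_schedule(requests, batch_size):
    """Sketch of a 2-phase policy: requests are first 'registered'
    (connection accepted, arguments parsed), then held briefly so the
    pending batch can be reordered before the 'service' phase.

    Here the reordering simply groups requests for the same file,
    a stand-in for cache-friendly scheduling.
    """
    registered = deque(requests)   # phase 1 output, in arrival (FIFO) order
    service_order = []
    while registered:
        # hold up to batch_size registered requests before serving them
        batch = [registered.popleft()
                 for _ in range(min(batch_size, len(registered)))]
        # phase 2: serve the batch sorted by target file, so repeated
        # requests for the same object hit a warm memory cache
        service_order.extend(sorted(batch, key=lambda r: r["file"]))
    return service_order
```

The deliberate delay between registration and service is exactly what creates the reordering opportunity: with `batch_size=1` the policy degenerates to FCFS.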
Towards Autonomic Service Provisioning Systems
This paper discusses our experience in building SPIRE, an autonomic system
for service provision. The architecture consists of a set of hosted Web
Services subject to QoS constraints, and a certain number of servers used to
run session-based traffic. Customers pay for having their jobs run, but require
in turn certain quality guarantees: there are different SLAs specifying charges
for running jobs and penalties for failing to meet promised performance
metrics. The system is driven by a utility function, aiming at optimizing the
average earned revenue per unit time. Demand and performance statistics are
collected, while traffic parameters are estimated in order to make dynamic
decisions concerning server allocation and admission control. Different utility
functions are introduced and a number of experiments aiming at testing their
performance are discussed. Results show that revenues can be dramatically
improved by imposing suitable conditions for accepting incoming traffic; the
proposed system performs well under different traffic settings, and it
successfully adapts to changes in the operating environment.
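A utility-driven admission rule of the kind described can be sketched as follows, assuming an SLA with a per-job charge and a penalty for missed performance targets, plus an estimated violation probability as a function of load. All names and the load model are illustrative assumptions, not SPIRE's actual mechanism:

```python
def expected_revenue(charge, penalty, p_violation):
    """Expected earnings from admitting one job: the SLA charge minus
    the penalty weighted by the probability of missing the target."""
    return charge - penalty * p_violation

def admit(current_load, capacity, charge, penalty, viol_prob):
    """Admit a new session only if its expected contribution to revenue
    is positive. viol_prob maps utilization in [0, 1] to an estimated
    SLA-violation probability (an assumed, illustrative model)."""
    utilization = min(current_load + 1, capacity) / capacity
    return expected_revenue(charge, penalty, viol_prob(utilization)) > 0
```

With a violation curve that rises steeply near saturation, this rule accepts traffic on a lightly loaded system but rejects it as the penalty-weighted risk outgrows the charge, which is the "suitable conditions for accepting incoming traffic" effect the abstract reports.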
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks, which connect geographically dispersed datacenters; these networks have been receiving increasing attention recently and pose interesting and novel research problems.
Comment: Accepted for publication in IEEE Communications Surveys and Tutorials.
A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing
The emergence of cloud computing based on virtualization technologies brings huge opportunities to host virtual resources at low cost without the need to own any infrastructure. Virtualization technologies enable users to acquire, configure, and be charged for resources on a pay-per-use basis. However, cloud data centers
mostly comprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potentially varying specifications and fluctuating resource usages, which may cause imbalanced resource utilization within servers and in turn lead to performance degradation and service level agreement (SLA) violations.
To achieve efficient scheduling, these challenges should be addressed and solved by using load balancing strategies; the underlying allocation problem has been proved to be NP-hard. From multiple perspectives, this work identifies the challenges and analyzes existing algorithms for allocating VMs to PMs in infrastructure Clouds, with a particular focus on load balancing. A detailed classification
targeting load balancing algorithms for VM placement in cloud data centers is
investigated, and the surveyed algorithms are classified accordingly. The goal of this paper is to provide a comprehensive and comparative understanding of the existing literature and to aid researchers by providing insight for potential future enhancements.
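As a concrete illustration of the kind of heuristic such a classification covers, a worst-fit placement rule, which places each VM on the host with the most remaining capacity and so tends to keep utilization balanced, can be sketched as follows. All structures and names here are illustrative assumptions, not an algorithm from the survey:

```python
def place_vm(vm, hosts):
    """Worst-fit VM placement sketch.

    vm: {"cpu": c, "mem": m} demand; hosts: list of {"cpu": .., "mem": ..}
    remaining capacities, mutated in place on success.
    Chooses the feasible host with the largest leftover capacity after
    placement (largest minimum slack across resources), which spreads
    load instead of packing it. Returns the chosen index, or None.
    """
    best, best_slack = None, -1.0
    for i, h in enumerate(hosts):
        if h["cpu"] >= vm["cpu"] and h["mem"] >= vm["mem"]:
            # slack: the tighter of the two leftover resources
            slack = min(h["cpu"] - vm["cpu"], h["mem"] - vm["mem"])
            if slack > best_slack:
                best, best_slack = i, slack
    if best is not None:
        hosts[best]["cpu"] -= vm["cpu"]
        hosts[best]["mem"] -= vm["mem"]
    return best
```

A bin-packing-style best-fit rule would instead minimize the slack; the choice between the two is exactly the consolidation-versus-balance trade-off that load-balancing placement surveys classify.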