Large deviations analysis of scheduling policies for a web server
With increasing demand and availability of bandwidth resources, there has been tremendous
growth in the scale and speed of web servers. In web servers, scheduling plays an important
role in resource allocation (for instance, bandwidth allocation, processor allocation,
etc). However, as the scale of a system increases so does the number of activities/events
in the system (e.g., job arrivals), as a consequence of which the analysis of scheduling
becomes increasingly harder. In particular, the possible ways in which scheduling failure
(e.g., queue overflow, excessively large delay, instability of a system) can occur becomes
increasingly greater, thus making it more difficult to understand the behavior and develop
design rules for scheduling algorithms. However, a well-known observation from large deviations theory, namely that large-scale systems fail in a "most likely way", can potentially be used
to simplify the design and analysis of scheduling. In this thesis, we study the implications
and applications of this effect on scheduling in a web server accessed by a large number of
sources.
We analyze the delay distribution of scheduling policies for web servers under a
many sources large deviation regime which models web servers in a large scale system
well. Due to the difficulties brought on by considering a large number of sources, only a small number of scheduling policies, such as First-Come-First-Serve (FCFS), Generalized-Processor-Sharing (GPS), and Priority Queueing, have been analyzed under the many sources regime. In particular, in a single-queue, single-server setup, the delay characteristics of only FCFS, Shortest-Job-First (SJF), and Longest-Job-First (LJF) have been analyzed.
In this thesis, we study the Two-Dimensional-Queueing (2DQ) framework, a unifying
queueing framework that allows the identification of the “most likely way” in which
delay occurs, to analyze the delay of various unexplored scheduling policies. In conjunction
with the 2DQ framework, we develop a new “cycle based” technique for understanding the
large deviations tail probability of more complex policies.
Using the combination of the 2DQ framework and the cycle based analysis, we
first analyze two interesting scheduling policies: the Shortest-Remaining-Processing-Time (SRPT) policy (which is mean-delay optimal) and the Processor-Sharing (PS) policy (which is a
“fair” policy). We derive the asymptotic delay distributions (rate functions) of both policies
and study their behavior across job sizes. Next, we address three problems in implementing
the aforementioned scheduling policies: (i) end receivers may have bandwidth constraints that are not taken into account in SRPT, (ii) the remaining processing time information might not be available to the web server, and (iii) most actual implementations are variants of SRPT that reflect other implementation constraints and/or jointly optimize other metrics in addition to delay, e.g., jitter and fairness. To address these, we first develop finite-SRPT
that takes into account the bandwidth constraint at the end receiver, and show that the policy
shifts between SRPT and a PS-like policy depending on the bandwidth constraint. Second,
we study the Least-Attained-Service (LAS) policy, which is viewed as a good substitute for SRPT when the remaining job size is not available, and analyze the penalty associated with not using the remaining-size information directly. Lastly, we analyze a class of
scheduling policies known as SMART that contains many variants of SRPT with different
fairness properties and show that all policies in the class have the same tail probability of
delay across job sizes under the many sources regime. The results of this thesis facilitate the understanding of various scheduling policies under the many sources regime and provide an analytical queueing framework that can be used to understand other scheduling policies.
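The SRPT rule discussed above, always serve the job with the smallest remaining processing time, can be made concrete with a small single-server simulation. The sketch below assumes unit-rate service and preemption only at arrival and departure instants; the function name and interface are illustrative, not taken from the thesis:

```python
import heapq

def srpt_schedule(jobs):
    """Simulate SRPT on a single unit-rate server.

    jobs: list of (arrival_time, size) tuples.
    Returns {job_index: completion_time}. At every arrival or
    departure instant, the job with the smallest remaining
    processing time is (re)selected for service.
    """
    events = sorted((a, i, s) for i, (a, s) in enumerate(jobs))
    heap = []          # (remaining_size, job_index), min-heap on remaining size
    done = {}
    t = 0.0
    k = 0              # index of the next arrival in `events`
    while k < len(events) or heap:
        if heap:
            rem, jid = heapq.heappop(heap)
            next_arrival = events[k][0] if k < len(events) else float("inf")
            if t + rem <= next_arrival:
                # job finishes before the next arrival can preempt it
                t += rem
                done[jid] = t
            else:
                # serve until the arrival, then re-enqueue the remainder
                rem -= next_arrival - t
                t = next_arrival
                heapq.heappush(heap, (rem, jid))
        else:
            # idle until the next arrival
            t = max(t, events[k][0])
        # admit every job that has arrived by the current time
        while k < len(events) and events[k][0] <= t:
            _, jid, size = events[k]
            heapq.heappush(heap, (size, jid))
            k += 1
    return done
```

For example, with a size-3 job arriving at time 0 and a size-1 job at time 1, the short job preempts the long one and finishes at time 2, while the long job finishes at time 4.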
Admission Control and Scheduling for High-Performance WWW Servers
In this paper we examine a number of admission control and scheduling protocols for high-performance web servers based on a 2-phase policy for serving HTTP requests. The first "registration" phase involves establishing the TCP connection for the HTTP request and parsing/interpreting its arguments, whereas the second "service" phase involves the service/transmission of data in response to the HTTP request. By introducing a delay between these two phases, we show that the performance of a web server can potentially be improved through the adoption of a number of scheduling policies that optimize the utilization of various system components (e.g., memory cache and I/O). In addition to its promise for improving the performance of a single web server, the delineation between the registration and service phases of an HTTP request may be useful for load-balancing purposes on clusters of web servers. We are investigating the use of such a mechanism as part of the Commonwealth testbed being developed at Boston University.
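The 2-phase idea can be sketched in a few lines. The batching and cache-friendly reordering below are a hypothetical stand-in for the paper's actual policies, and all names are illustrative:

```python
from collections import deque

def two_phase_schedule(requests, batch_size):
    """Sketch of a 2-phase policy: requests are first 'registered'
    (connection accepted, arguments parsed), then held briefly so the
    pending batch can be reordered before the 'service' phase.

    Here the reordering simply groups requests for the same file,
    a stand-in for cache-friendly scheduling.
    """
    registered = deque(requests)   # phase 1 output, in arrival (FIFO) order
    service_order = []
    while registered:
        # hold up to batch_size registered requests before serving them
        batch = [registered.popleft()
                 for _ in range(min(batch_size, len(registered)))]
        # phase 2: serve the batch sorted by target file, so repeated
        # requests for the same object hit a warm memory cache
        service_order.extend(sorted(batch, key=lambda r: r["file"]))
    return service_order
```

The deliberate delay between registration and service is exactly what creates the reordering opportunity: with `batch_size=1` the policy degenerates to FCFS.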
Towards Autonomic Service Provisioning Systems
This paper discusses our experience in building SPIRE, an autonomic system
for service provision. The architecture consists of a set of hosted Web
Services subject to QoS constraints, and a certain number of servers used to
run session-based traffic. Customers pay for having their jobs run, but require
in turn certain quality guarantees: there are different SLAs specifying charges
for running jobs and penalties for failing to meet promised performance
metrics. The system is driven by a utility function, aiming at optimizing the
average earned revenue per unit time. Demand and performance statistics are
collected, while traffic parameters are estimated in order to make dynamic
decisions concerning server allocation and admission control. Different utility
functions are introduced and a number of experiments aiming at testing their
performance are discussed. Results show that revenues can be dramatically
improved by imposing suitable conditions for accepting incoming traffic; the
proposed system performs well under different traffic settings, and it
successfully adapts to changes in the operating environment.
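A utility-driven admission rule of the kind described can be sketched as follows, assuming an SLA with a per-job charge and a penalty for missed performance targets, plus an estimated violation probability as a function of load. All names and the load model are illustrative assumptions, not SPIRE's actual mechanism:

```python
def expected_revenue(charge, penalty, p_violation):
    """Expected earnings from admitting one job: the SLA charge minus
    the penalty weighted by the probability of missing the target."""
    return charge - penalty * p_violation

def admit(current_load, capacity, charge, penalty, viol_prob):
    """Admit a new session only if its expected contribution to revenue
    is positive. viol_prob maps utilization in [0, 1] to an estimated
    SLA-violation probability (an assumed, illustrative model)."""
    utilization = min(current_load + 1, capacity) / capacity
    return expected_revenue(charge, penalty, viol_prob(utilization)) > 0
```

With a violation curve that rises steeply near saturation, this rule accepts traffic on a lightly loaded system but rejects it as the penalty-weighted risk outgrows the charge, which is the "suitable conditions for accepting incoming traffic" effect the abstract reports.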
Datacenter Traffic Control: Understanding Techniques and Trade-offs
Datacenters provide cost-effective and flexible access to scalable compute
and storage resources necessary for today's cloud computing needs. A typical
datacenter is made up of thousands of servers connected with a large network
and usually managed by one operator. To provide quality access to the variety
of applications and services hosted on datacenters and to maximize performance, it is necessary to use datacenter networks effectively and efficiently.
Datacenter traffic is often a mix of several classes with different priorities
and requirements. This includes user-generated interactive traffic, traffic
with deadlines, and long-running traffic. To this end, custom transport
protocols and traffic management techniques have been developed to improve
datacenter network performance.
In this tutorial paper, we review the general architecture of datacenter
networks, various topologies proposed for them, their traffic properties,
general traffic control challenges in datacenters and general traffic control
objectives. The purpose of this paper is to bring out the important
characteristics of traffic control in datacenters and not to survey all
existing solutions (as it is virtually impossible due to the massive body of
existing research). We hope to provide readers with a wide range of options and
factors while considering a variety of traffic control mechanisms. We discuss
various characteristics of datacenter traffic control including management
schemes, transmission control, traffic shaping, prioritization, load balancing,
multipathing, and traffic scheduling. Next, we point to several open challenges
as well as new and interesting networking paradigms. At the end of this paper,
we briefly review inter-datacenter networks, which connect geographically dispersed datacenters; these networks have been receiving increasing attention recently and pose interesting and novel research problems.
Comment: Accepted for publication in IEEE Communications Surveys and Tutorials.
A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing
The emergence of cloud computing based on virtualization technologies brings huge opportunities to host virtual resources at low cost without the need to own any infrastructure. Virtualization technologies enable users to acquire, configure, and be charged for resources on a pay-per-use basis. However, cloud data centers
mostly comprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potentially varying specifications and fluctuating resource usages, which may cause imbalanced resource utilization within servers and in turn lead to performance degradation and service level agreement (SLA) violations.
To achieve efficient scheduling, these challenges should be addressed and solved by using load balancing strategies; the underlying allocation problem has been proved to be NP-hard. From multiple perspectives, this work identifies the challenges and analyzes existing algorithms for allocating VMs to PMs in infrastructure Clouds, with a particular focus on load balancing. A detailed classification
targeting load balancing algorithms for VM placement in cloud data centers is
investigated, and the surveyed algorithms are classified accordingly. The goal of this paper is to provide a comprehensive and comparative understanding of the existing literature and to aid researchers by providing insight for potential future enhancements.
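As a concrete illustration of the kind of heuristic such a classification covers, a worst-fit placement rule, which places each VM on the host with the most remaining capacity and so tends to keep utilization balanced, can be sketched as follows. All structures and names here are illustrative assumptions, not an algorithm from the survey:

```python
def place_vm(vm, hosts):
    """Worst-fit VM placement sketch.

    vm: {"cpu": c, "mem": m} demand; hosts: list of {"cpu": .., "mem": ..}
    remaining capacities, mutated in place on success.
    Chooses the feasible host with the largest leftover capacity after
    placement (largest minimum slack across resources), which spreads
    load instead of packing it. Returns the chosen index, or None.
    """
    best, best_slack = None, -1.0
    for i, h in enumerate(hosts):
        if h["cpu"] >= vm["cpu"] and h["mem"] >= vm["mem"]:
            # slack: the tighter of the two leftover resources
            slack = min(h["cpu"] - vm["cpu"], h["mem"] - vm["mem"])
            if slack > best_slack:
                best, best_slack = i, slack
    if best is not None:
        hosts[best]["cpu"] -= vm["cpu"]
        hosts[best]["mem"] -= vm["mem"]
    return best
```

A bin-packing-style best-fit rule would instead minimize the slack; the choice between the two is exactly the consolidation-versus-balance trade-off that load-balancing placement surveys classify.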