554 research outputs found
Dependence-driven techniques in system design
Burstiness in workloads is often found in multi-tier architectures, storage systems, and communication networks. This feature is extremely important in system design because it can significantly degrade system performance and availability. This dissertation focuses on how to use knowledge of burstiness to develop new techniques and tools for performance prediction, scheduling, and resource allocation under bursty workload conditions.;For multi-tier enterprise systems, burstiness in the service times is catastrophic for performance. Via detailed experimentation, we identify the cause of performance degradation on the persistent bottleneck switch among various servers. This results in an unstable behavior that cannot be captured by existing capacity planning models. In this dissertation, beyond identifying the cause and effects of bottleneck switch in multi-tier systems, we also propose modifications to the classic TPC-W benchmark to emulate bursty arrivals in multi-tier systems.;This dissertation also demonstrates how burstiness can be used to improve system performance. Two dependence-driven scheduling policies, SWAP and ALoC, are developed. These general scheduling policies counteract burstiness in workloads and maintain high availability by delaying selected requests that contribute to burstiness. Extensive experiments show that both SWAP and ALoC achieve good estimates of service times based on the knowledge of burstiness in the service process. as a result, SWAP successfully approximates the shortest job first (SJF) scheduling without requiring a priori information of job service times. ALoC adaptively controls system load by infinitely delaying only a small fraction of the incoming requests.;The knowledge of burstiness can also be used to forecast the length of idle intervals in storage systems. In practice, background activities are scheduled during system idle times. The scheduling of background jobs is crucial in terms of the performance degradation of foreground jobs and the utilization of idle times. In this dissertation, new background scheduling schemes are designed to determine when and for how long idle times can be used for serving background jobs, without violating predefined performance targets of foreground jobs. Extensive trace-driven simulation results illustrate that the proposed schemes are effective and robust in a wide range of system conditions. Furthermore, if there is burstiness within idle times, then maintenance features like disk scrubbing and intra-disk data redundancy can be successfully scheduled as background activities during idle times
Revenue maximization problems in commercial data centers
PhD ThesisAs IT systems are becoming more important everyday, one of the main concerns is that users may
face major problems and eventually incur major costs if computing systems do not meet the expected
performance requirements: customers expect reliability and performance guarantees, while
underperforming systems loose revenues. Even with the adoption of data centers as the hub of
IT organizations and provider of business efficiencies the problems are not over because it is extremely
difficult for service providers to meet the promised performance guarantees in the face of
unpredictable demand. One possible approach is the adoption of Service Level Agreements (SLAs),
contracts that specify a level of performance that must be met and compensations in case of failure.
In this thesis I will address some of the performance problems arising when IT companies sell
the service of running ‘jobs’ subject to Quality of Service (QoS) constraints. In particular, the aim
is to improve the efficiency of service provisioning systems by allowing them to adapt to changing
demand conditions.
First, I will define the problem in terms of an utility function to maximize. Two different models
are analyzed, one for single jobs and the other useful to deal with session-based traffic. Then,
I will introduce an autonomic model for service provision. The architecture consists of a set of
hosted applications that share a certain number of servers. The system collects demand and performance
statistics and estimates traffic parameters. These estimates are used by management policies
which implement dynamic resource allocation and admission algorithms. Results from a number of
experiments show that the performance of these heuristics is close to optimal.QoSP (Quality of Service Provisioning)British Teleco
ASIdE: Using Autocorrelation-Based Size Estimation for Scheduling Bursty Workloads.
Temporal dependence in workloads creates peak congestion that can make service unavailable and reduce system performance. To improve system performability under conditions of temporal dependence, a server should quickly process bursts of requests that may need large service demands. In this paper, we propose and evaluateASIdE, an Autocorrelation-based SIze Estimation, that selectively delays requests which contribute to the workload temporal dependence. ASIdE implicitly approximates the shortest job first (SJF) scheduling policy but without any prior knowledge of job service times. Extensive experiments show that (1) ASIdE achieves good service time estimates from the temporal dependence structure of the workload to implicitly approximate the behavior of SJF; and (2) ASIdE successfully counteracts peak congestion in the workload and improves system performability under a wide variety of settings. Specifically, we show that system capacity under ASIdE is largely increased compared to the first-come first-served (FCFS) scheduling policy and is highly-competitive with SJF. © 2012 IEEE
Revisiting Size-Based Scheduling with Estimated Job Sizes
We study size-based schedulers, and focus on the impact of inaccurate job
size information on response time and fairness. Our intent is to revisit
previous results, which allude to performance degradation for even small errors
on job size estimates, thus limiting the applicability of size-based
schedulers.
We show that scheduling performance is tightly connected to workload
characteristics: in the absence of large skew in the job size distribution,
even extremely imprecise estimates suffice to outperform size-oblivious
disciplines. Instead, when job sizes are heavily skewed, known size-based
disciplines suffer.
In this context, we show -- for the first time -- the dichotomy of
over-estimation versus under-estimation. The former is, in general, less
problematic than the latter, as its effects are localized to individual jobs.
Instead, under-estimation leads to severe problems that may affect a large
number of jobs.
We present an approach to mitigate these problems: our technique requires no
complex modifications to original scheduling policies and performs very well.
To support our claim, we proceed with a simulation-based evaluation that covers
an unprecedented large parameter space, which takes into account a variety of
synthetic and real workloads.
As a consequence, we show that size-based scheduling is practical and
outperforms alternatives in a wide array of use-cases, even in presence of
inaccurate size information.Comment: To be published in the proceedings of IEEE MASCOTS 201
PSBS: Practical Size-Based Scheduling
Size-based schedulers have very desirable performance properties: optimal or
near-optimal response time can be coupled with strong fairness guarantees.
Despite this, such systems are very rarely implemented in practical settings,
because they require knowing a priori the amount of work needed to complete
jobs: this assumption is very difficult to satisfy in concrete systems. It is
definitely more likely to inform the system with an estimate of the job sizes,
but existing studies point to somewhat pessimistic results if existing
scheduler policies are used based on imprecise job size estimations. We take
the goal of designing scheduling policies that are explicitly designed to deal
with inexact job sizes: first, we show that existing size-based schedulers can
have bad performance with inexact job size information when job sizes are
heavily skewed; we show that this issue, and the pessimistic results shown in
the literature, are due to problematic behavior when large jobs are
underestimated. Once the problem is identified, it is possible to amend
existing size-based schedulers to solve the issue. We generalize FSP -- a fair
and efficient size-based scheduling policy -- in order to solve the problem
highlighted above; in addition, our solution deals with different job weights
(that can be assigned to a job independently from its size). We provide an
efficient implementation of the resulting protocol, which we call Practical
Size-Based Scheduler (PSBS). Through simulations evaluated on synthetic and
real workloads, we show that PSBS has near-optimal performance in a large
variety of cases with inaccurate size information, that it performs fairly and
it handles correctly job weights. We believe that this work shows that PSBS is
indeed pratical, and we maintain that it could inspire the design of schedulers
in a wide array of real-world use cases.Comment: arXiv admin note: substantial text overlap with arXiv:1403.599
Adaptive TTL-Based Caching for Content Delivery
Content Delivery Networks (CDNs) deliver a majority of the user-requested
content on the Internet, including web pages, videos, and software downloads. A
CDN server caches and serves the content requested by users. Designing caching
algorithms that automatically adapt to the heterogeneity, burstiness, and
non-stationary nature of real-world content requests is a major challenge and
is the focus of our work. While there is much work on caching algorithms for
stationary request traffic, the work on non-stationary request traffic is very
limited. Consequently, most prior models are inaccurate for production CDN
traffic that is non-stationary.
We propose two TTL-based caching algorithms and provide provable guarantees
for content request traffic that is bursty and non-stationary. The first
algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic
approximation approach. Given a feasible target hit rate, we show that the hit
rate of d-TTL converges to its target value for a general class of bursty
traffic that allows Markov dependence over time and non-stationary arrivals.
The second algorithm called f-TTL uses two caches, each with its own TTL. The
first-level cache adaptively filters out non-stationary traffic, while the
second-level cache stores frequently-accessed stationary traffic. Given
feasible targets for both the hit rate and the expected cache size, f-TTL
asymptotically achieves both targets. We implement d-TTL and f-TTL and evaluate
both algorithms using an extensive nine-day trace consisting of 500 million
requests from a production CDN server. We show that both d-TTL and f-TTL
converge to their hit rate targets with an error of about 1.3%. But, f-TTL
requires a significantly smaller cache size than d-TTL to achieve the same hit
rate, since it effectively filters out the non-stationary traffic for
rarely-accessed objects
AWAIT: Efficient Overload Management for Busy Multi-tier Web Services under Bursty Workloads
The problem of service differentiation and admission control in web services that utilize a multi-tier architecture is more challenging than in a single-tiered one, especially in the presence of bursty conditions, i.e., when arrivals of user web sessions to the system are characterized by temporal surges in their arrival intensities and demands. We demonstrate that classic techniques for a session based admission control that are triggered by threshold violations are ineffective under bursty workload conditions, as user-perceived performance metrics rapidly and dramatically deteriorate, inadvertently leading the system to reject requests from already accepted user sessions, resulting in business loss. Here, as a solution for service differentiation of accepted user sessions we promote a methodology that is based on blocking, i.e., when the system operates in overload, requests from accepted sessions are not rejected but are instead stored in a blocking queue that effectively acts as a waiting room. The requests in the blocking queue implicitly become of higher priority and are served immediately after load subsides. Residence in the blocking queue comes with a performance cost as blocking time adds to the perceived end-to-end user response time. We present a novel autonomic session based admission control policy, called AWAIT, that adaptively adjusts the capacity of the blocking queue as a function of workload burstiness in order to meet predefined user service level objectives while keeping the portion of aborted accepted sessions to a minimum. Detailed simulations illustrate the effectiveness of AWAIT under different workload burstiness profiles and therefore strongly argue for its effectiveness
Towards Autonomic Service Provisioning Systems
This paper discusses our experience in building SPIRE, an autonomic system
for service provision. The architecture consists of a set of hosted Web
Services subject to QoS constraints, and a certain number of servers used to
run session-based traffic. Customers pay for having their jobs run, but require
in turn certain quality guarantees: there are different SLAs specifying charges
for running jobs and penalties for failing to meet promised performance
metrics. The system is driven by an utility function, aiming at optimizing the
average earned revenue per unit time. Demand and performance statistics are
collected, while traffic parameters are estimated in order to make dynamic
decisions concerning server allocation and admission control. Different utility
functions are introduced and a number of experiments aiming at testing their
performance are discussed. Results show that revenues can be dramatically
improved by imposing suitable conditions for accepting incoming traffic; the
proposed system performs well under different traffic settings, and it
successfully adapts to changes in the operating environment.Comment: 11 pages, 9 Figures,
http://www.wipo.int/pctdb/en/wo.jsp?WO=201002636
- …