Exact asymptotics for fluid queues fed by multiple heavy-tailed on-off flows
We consider a fluid queue fed by multiple On-Off flows with heavy-tailed
(regularly varying) On periods. Under fairly mild assumptions, we prove that
the workload distribution is asymptotically equivalent to that in a reduced
system. The reduced system consists of a "dominant" subset of the flows, with
the original service rate subtracted by the mean rate of the other flows. We
describe how a dominant set may be determined from a simple knapsack
formulation. The dominant set consists of a "minimally critical" set of
On-Off flows with regularly varying On periods. In case the dominant set
contains just a single On-Off flow, the exact asymptotics for the reduced
system follow from known results. For the case of several
On-Off flows, we exploit a powerful intuitive argument to obtain the exact
asymptotics. Combined with the reduced-load equivalence, the results for the
reduced system provide a characterization of the tail of the workload
distribution for a wide range of traffic scenarios.
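As an illustration of the knapsack-style selection sketched above, here is a minimal brute-force search for a minimally critical set. The criticality test, the flow parameters, and the weights used to rank competing minimally critical sets are illustrative assumptions, not the paper's exact formulation.

```python
from itertools import combinations

def is_critical(S, peak, mean, c):
    """A set S is critical if, with the flows in S sending at peak rate
    and all other flows at their mean rate, the server is overloaded."""
    others = [i for i in range(len(peak)) if i not in S]
    return sum(peak[i] for i in S) + sum(mean[i] for i in others) > c

def dominant_set(peak, mean, weight, c):
    """Brute-force knapsack-style search: among all minimally critical
    sets (critical sets whose proper subsets are all non-critical),
    return one of minimum total weight."""
    n = len(peak)
    best, best_w = None, float("inf")
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            if not is_critical(S, peak, mean, c):
                continue
            # minimally critical: dropping any single flow breaks criticality
            if any(is_critical(tuple(j for j in S if j != i), peak, mean, c)
                   for i in S):
                continue
            w = sum(weight[i] for i in S)
            if w < best_w:
                best, best_w = S, w
    return best

# three On-Off flows: peak rates, mean rates, illustrative tail weights
peak, mean, weight = [3.0, 2.0, 1.5], [1.0, 0.8, 0.5], [1.2, 1.5, 2.0]
print(dominant_set(peak, mean, weight, c=4.0))  # flow 0 alone is critical
```

The exponential enumeration is fine for illustration; the paper's point is precisely that the selection reduces to a simple knapsack-type problem rather than requiring this kind of search.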
Sample-path large deviations for tandem and priority queues with Gaussian inputs
This paper considers Gaussian flows multiplexed in a queueing network. A
single node being a useful but often incomplete setting, we examine more
advanced models. We focus on a (two-node) tandem queue, fed by a large number
of Gaussian inputs. With service rates and buffer sizes at both nodes scaled
appropriately, Schilder's sample-path large-deviations theorem can be applied
to calculate the asymptotics of the overflow probability of the second queue.
More specifically, we derive a lower bound on the exponential decay rate of
this overflow probability and present an explicit condition for the lower bound
to match the exact decay rate. Examples show that this condition holds for a
broad range of frequently used Gaussian inputs. The last part of the paper
concentrates on a model for a single node, equipped with a priority scheduling
policy. We show that the analysis of the tandem queue directly carries over to
this priority queueing system.

Published at http://dx.doi.org/10.1214/105051605000000133 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).
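For orientation, the decay rate in question has the standard many-sources large-deviations form (notation here is illustrative): with n i.i.d. Gaussian inputs and the service rates and buffers of both nodes scaled by n,

```latex
I(b) \;=\; -\lim_{n\to\infty}\frac{1}{n}\,
\log \mathbb{P}\bigl(Q_2^{(n)} > nb\bigr),
```

and Schilder's sample-path theorem reduces computing $I(b)$ to minimizing a quadratic rate functional over the input paths that cause overflow in the second queue.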
Shot-noise queueing models
We provide a survey of so-called shot-noise queues: queueing models with the special feature that the server speed is proportional to the amount of work it faces. Several results are derived for the workload in an M/G/1 shot-noise queue and some of its variants. Furthermore, we give some attention to queues with general workload-dependent service speed. We also discuss linear stochastic fluid networks, and queues in which the input process is a shot-noise process.
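The defining feature above (drain speed proportional to workload) makes the workload decay exponentially between arrivals, which is easy to simulate exactly. A minimal sketch, assuming Poisson arrivals and exponential job sizes (one particular M/G/1 instance); the parameter values are made up:

```python
import math
import random

def simulate_shot_noise_workload(lam, r, mean_job, t_end, seed=1):
    """Simulate the workload of a shot-noise queue: Poisson(lam) arrivals
    bring Exp(mean) jobs, and the server drains at speed r * (current
    workload), so the workload decays exponentially between arrivals.
    Returns the time-average workload over [0, t_end]."""
    rng = random.Random(seed)
    t, x, area = 0.0, 0.0, 0.0
    while t < t_end:
        dt = min(rng.expovariate(lam), t_end - t)  # time to next event
        # exact integral of x * e^{-r s} over [0, dt]
        area += x * (1.0 - math.exp(-r * dt)) / r
        x *= math.exp(-r * dt)                     # exponential decay
        t += dt
        if t < t_end:
            x += rng.expovariate(1.0 / mean_job)   # job arrival
    return area / t_end

# the stationary mean workload of a shot-noise process is lam * E[B] / r
avg = simulate_shot_noise_workload(lam=2.0, r=1.0, mean_job=0.5, t_end=50000)
print(round(avg, 2))  # close to 2.0 * 0.5 / 1.0 = 1.0
```

The comparison in the last line uses the classical shot-noise identity E[X] = λE[B]/r as a sanity check on the simulation.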
Stochastic Dynamic Programming and Stochastic Fluid-Flow Models in the Design and Analysis of Web-Server Farms
A Web-server farm is a specialized facility designed specifically for housing Web
servers catering to one or more Internet-facing Web sites. In this dissertation, a
stochastic dynamic programming technique is used to obtain the optimal admission control
policy with different classes of customers, and stochastic fluid-flow models
are used to compute the performance measures in the network. The two types of
network traffic considered in this research are streaming (guaranteed bandwidth per
connection) and elastic (shares available bandwidth equally among connections).
We first obtain the optimal admission control policy using stochastic dynamic
programming, in which, based on the number of requests of each type being served,
a decision is made whether to allow or deny service to an incoming request. In
this subproblem, we consider a server with fixed bandwidth capacity, which allocates the
requested bandwidth to the streaming requests and divides all of the remaining bandwidth
equally among all of the elastic requests. The performance metric of interest in
this case will be the blocking probability of streaming traffic, which will be computed
in order to be able to provide Quality of Service (QoS) guarantees.
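As a simple point of reference (not the dissertation's dynamic-programming policy): if all capacity were reserved for streaming requests and admission were limited only by capacity, the streaming blocking probability would reduce to the classical Erlang-B formula. The capacity and traffic figures below are invented for illustration.

```python
def erlang_b(offered_load, servers):
    """Erlang-B blocking probability via the standard stable recursion:
    B(0) = 1,  B(k) = a*B(k-1) / (k + a*B(k-1))."""
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    return b

# a 100 Mb/s server with 5 Mb/s streaming connections -> at most 20
# simultaneous streaming requests ("servers" in Erlang terminology)
capacity, per_conn = 100, 5
arrival_rate, mean_holding = 3.0, 4.0   # requests/s, seconds
load = arrival_rate * mean_holding      # offered load in Erlangs
print(round(erlang_b(load, capacity // per_conn), 4))
```

The recursion avoids the factorials in the textbook formula and is numerically stable even for large capacities, which is why it is the usual way to compute blocking probabilities in practice.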
Next, we obtain bounds on the expected waiting time in the system for elastic
requests that enter the system. This will be done at the server level in such a way
that the total available bandwidth for the requests is constant. Trace data will be
converted to an ON-OFF source and fluid-flow models will be used for this
analysis. The results are compared with both the mean waiting time obtained by simulating
real data, and the expected waiting time obtained using traditional queueing models.
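In its simplest form, converting a trace to an ON-OFF source can be a thresholding pass over a sampled rate trace. The helper below and its threshold are illustrative assumptions, not the dissertation's actual procedure:

```python
def trace_to_onoff(rates, threshold):
    """Reduce a sampled rate trace to ON-OFF source parameters by
    thresholding: slots with rate above `threshold` are ON.  Returns
    the mean ON-burst length and mean OFF-gap length (in slots), and
    the peak rate (average rate over ON slots)."""
    on = [r > threshold for r in rates]
    bursts, gaps, run, prev = [], [], 0, None
    for flag in on:
        if flag == prev:
            run += 1
        else:
            if prev is True:
                bursts.append(run)
            elif prev is False:
                gaps.append(run)
            run, prev = 1, flag
    (bursts if prev else gaps).append(run)   # close the final run
    on_rates = [r for r, f in zip(rates, on) if f]
    peak = sum(on_rates) / len(on_rates) if on_rates else 0.0
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return mean(bursts), mean(gaps), peak

# toy trace: two ON bursts separated by OFF gaps
trace = [0.1, 5.0, 6.0, 0.2, 0.1, 4.0, 5.0, 6.0, 0.3]
print(trace_to_onoff(trace, threshold=1.0))
```

The resulting (mean ON, mean OFF, peak rate) triple is exactly the parameterization a two-state fluid source needs; heavier-duty fitting methods would also match higher-order statistics of the trace.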
Finally, we consider the network of servers and routers within the Web farm where
data from servers
flows and merges before getting transmitted to the requesting users
via the Internet. We compute the waiting time of the elastic requests at intermediate
and edge nodes by obtaining the distribution of the outflow of the upstream node.
This outflow distribution is obtained by using a methodology based on minimizing the
deviations from the constituent inflows. This analysis also helps us to compute waiting
times at different bandwidth capacities, and hence obtain a suitable bandwidth to
promise or satisfy the QoS guarantees.
This research helps in obtaining performance measures for different traffic classes
at a Web-server farm so as to be able to promise or provide QoS guarantees, while at
the same time utilizing the resources of the server farms efficiently, thereby
reducing operational costs and increasing energy savings.
Resource management of replicated service systems provisioned in the cloud
Service providers seek scalable and cost-effective cloud solutions for hosting their applications. Despite significant recent advances facilitating the deployment and management of services on cloud platforms, a number of challenges still remain. Service providers are confronted with time-varying requests for the provided applications, interdependencies between different components, performance variability of the procured virtual resources, and cost structures that differ from conventional data centers. Moreover, fulfilling service level agreements, such as throughput and response-time percentiles, becomes of paramount importance for ensuring business advantages.

In this thesis, we explore service provisioning in clouds from multiple points of view. The aim is to best provide service replicas in the form of VMs to various service applications, such that their tail throughput and tail response times, as well as resource utilization, meet the service level agreements in the most cost-effective manner. In particular, we develop models, algorithms and replication strategies that consider multi-tier composed services provisioned in clouds. We also investigate how a service provider can opportunistically take advantage of observed performance variability in the cloud. Finally, we provide means of guaranteeing tail throughput and response times in the face of performance variability of VMs, using Markov chain modeling and large deviation theory. We employ methods from analytical modeling, event-driven simulations and experiments. Overall, this thesis provides not only a multi-faceted approach to exploring several crucial aspects of hosting services in clouds, i.e., cost, tail throughput, and tail response times, but our proposed resource management strategies are also rigorously validated via trace-driven simulation and extensive experiments.