239 research outputs found

    Exact asymptotics for fluid queues fed by multiple heavy-tailed on-off flows

    We consider a fluid queue fed by multiple On-Off flows with heavy-tailed (regularly varying) On periods. Under fairly mild assumptions, we prove that the workload distribution is asymptotically equivalent to that in a reduced system. The reduced system consists of a "dominant" subset of the flows, with the original service rate reduced by the mean rate of the other flows. We describe how a dominant set may be determined from a simple knapsack formulation. The dominant set consists of a "minimally critical" set of On-Off flows with regularly varying On periods. In case the dominant set contains just a single On-Off flow, the exact asymptotics for the reduced system follow from known results. For the case of several On-Off flows, we exploit a powerful intuitive argument to obtain the exact asymptotics. Combined with the reduced-load equivalence, the results for the reduced system provide a characterization of the tail of the workload distribution for a wide range of traffic scenarios.
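The reduced-load equivalence described in this abstract can be written schematically as follows. The notation is an assumption introduced here for illustration, not taken from the paper: V is the stationary workload, c the service rate, S the dominant set, \rho_j the mean rate of flow j, and V_S^{c'} the workload of a queue fed only by the flows in S and served at rate c'.

```latex
\[
  \mathbb{P}(V > x) \;\sim\; \mathbb{P}\bigl(V_{S}^{\,c - \rho_{\bar S}} > x\bigr)
  \quad \text{as } x \to \infty,
  \qquad
  \rho_{\bar S} = \sum_{j \notin S} \rho_j .
\]
```

Here f(x) \sim g(x) means f(x)/g(x) \to 1; the precise conditions under which the equivalence holds are those stated in the paper itself.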

    Overflow behavior in queues with many long-tailed inputs

    Sample-path large deviations for tandem and priority queues with Gaussian inputs

    This paper considers Gaussian flows multiplexed in a queueing network. A single node being a useful but often incomplete setting, we examine more advanced models. We focus on a (two-node) tandem queue, fed by a large number of Gaussian inputs. With service rates and buffer sizes at both nodes scaled appropriately, Schilder's sample-path large-deviations theorem can be applied to calculate the asymptotics of the overflow probability of the second queue. More specifically, we derive a lower bound on the exponential decay rate of this overflow probability and present an explicit condition for the lower bound to match the exact decay rate. Examples show that this condition holds for a broad range of frequently used Gaussian inputs. The last part of the paper concentrates on a model for a single node, equipped with a priority scheduling policy. We show that the analysis of the tandem queue directly carries over to this priority queueing system. Comment: Published at http://dx.doi.org/10.1214/105051605000000133 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).

    Shot-noise queueing models

    We provide a survey of so-called shot-noise queues: queueing models with the special feature that the server speed is proportional to the amount of work it faces. Several results are derived for the workload in an M/G/1 shot-noise queue and some of its variants. Furthermore, we give some attention to queues with general workload-dependent service speed. We also discuss linear stochastic fluid networks, and queues in which the input process is a shot-noise process.
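The defining feature of a shot-noise queue, a server speed proportional to the current workload, can be illustrated with a minimal sketch. The function names, the proportionality constant `r`, and the choice of exponential job sizes are assumptions made here for illustration and are not taken from the survey. Between arrivals the workload decays as v * exp(-r * dt), and for Poisson arrivals of rate `lam` the stationary mean workload is lam * E[B] / r.

```python
import math
import random


def shot_noise_workload(arrival_times, job_sizes, r, t_end):
    """Workload at time t_end when work drains at speed r * V(t).

    Starts from an empty system at time 0; arrival_times must be sorted.
    """
    v, t = 0.0, 0.0
    for a, b in zip(arrival_times, job_sizes):
        if a > t_end:
            break
        v *= math.exp(-r * (a - t))  # exponential decay since the last event
        v += b                       # the arriving job adds its work instantly
        t = a
    return v * math.exp(-r * (t_end - t))


def time_average_workload(lam, mean_job, r, horizon, seed=0):
    """Time-average workload with Poisson(lam) arrivals and exponential jobs."""
    rng = random.Random(seed)
    t, v, area = 0.0, 0.0, 0.0
    while True:
        dt = rng.expovariate(lam)
        if t + dt > horizon:
            dt = horizon - t
            # integral of v * exp(-r * s) over [0, dt]
            area += v * (1.0 - math.exp(-r * dt)) / r
            break
        area += v * (1.0 - math.exp(-r * dt)) / r
        v = v * math.exp(-r * dt) + rng.expovariate(1.0 / mean_job)
        t += dt
    return area / horizon


if __name__ == "__main__":
    # The simulated average should be close to lam * mean_job / r = 4.0.
    print(time_average_workload(lam=2.0, mean_job=1.0, r=0.5, horizon=20000.0))
```

Because the drain rate vanishes as the workload approaches zero, the queue never empties after the first arrival, which is one way these models differ from a constant-speed M/G/1 queue.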

    Stochastic Dynamic Programming and Stochastic Fluid-Flow Models in the Design and Analysis of Web-Server Farms

    A Web-server farm is a specialized facility designed specifically for housing Web servers catering to one or more Internet-facing Web sites. In this dissertation, a stochastic dynamic programming technique is used to obtain the optimal admission control policy with different classes of customers, and stochastic fluid-flow models are used to compute the performance measures in the network. The two types of network traffic considered in this research are streaming (guaranteed bandwidth per connection) and elastic (shares available bandwidth equally among connections). We first obtain the optimal admission control policy using stochastic dynamic programming, in which, based on the number of requests of each type being served, a decision is made whether to allow or deny service to an incoming request. In this subproblem, we consider a fixed bandwidth capacity server, which allocates the requested bandwidth to the streaming requests and divides all of the remaining bandwidth equally among all of the elastic requests. The performance metric of interest in this case will be the blocking probability of streaming traffic, which will be computed in order to be able to provide Quality of Service (QoS) guarantees. Next, we obtain bounds on the expected waiting time in the system for elastic requests that enter the system. This will be done at the server level in such a way that the total available bandwidth for the requests is constant. Trace data will be converted to an ON-OFF source and fluid-flow models will be used for this analysis. The results are compared with both the mean waiting time obtained by simulating real data, and the expected waiting time obtained using traditional queueing models. Finally, we consider the network of servers and routers within the Web farm where data from servers flows and merges before getting transmitted to the requesting users via the Internet. We compute the waiting time of the elastic requests at intermediate and edge nodes by obtaining the distribution of the outflow of the upstream node. This outflow distribution is obtained by using a methodology based on minimizing the deviations from the constituent inflows. This analysis also helps us to compute waiting times at different bandwidth capacities, and hence obtain a suitable bandwidth to promise or satisfy the QoS guarantees. This research helps in obtaining performance measures for different traffic classes at a Web-server farm so as to be able to promise or provide QoS guarantees, while at the same time helping in utilizing the resources of the server farms efficiently, thereby reducing the operational costs and increasing energy savings.
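The bandwidth-sharing rule in this abstract (streaming requests receive their requested bandwidth, elastic requests split the remainder equally) can be sketched as below. The function names and the simple feasibility check are hypothetical illustrations; in particular, `fits_streaming` is not the optimal admission policy that the dissertation derives by stochastic dynamic programming.

```python
def elastic_share(capacity, streaming_demands, n_elastic):
    """Bandwidth per elastic request on a fixed-capacity server.

    Streaming requests get exactly what they asked for; whatever is left
    is divided equally among the elastic requests.
    """
    used = sum(streaming_demands)
    if used > capacity:
        raise ValueError("streaming demand exceeds server capacity")
    if n_elastic == 0:
        return 0.0
    return (capacity - used) / n_elastic


def fits_streaming(capacity, streaming_demands, new_demand):
    """Hypothetical feasibility check: can the new streaming request be
    given its full bandwidth? (Assumption for illustration only.)"""
    return sum(streaming_demands) + new_demand <= capacity


if __name__ == "__main__":
    print(elastic_share(100.0, [10.0, 20.0], 7))
    print(fits_streaming(100.0, [60.0, 30.0], 20.0))
```

A streaming request that fails the feasibility check would be blocked, which is the event whose probability the abstract names as the QoS metric for streaming traffic.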

    Resource management of replicated service systems provisioned in the cloud

    Service providers seek scalable and cost-effective cloud solutions for hosting their applications. Despite significant recent advances facilitating the deployment and management of services on cloud platforms, a number of challenges still remain. Service providers are confronted with time-varying requests for the provided applications, interdependencies between different components, performance variability of the procured virtual resources, and cost structures that differ from conventional data centers. Moreover, fulfilling service level agreements, such as the throughput and response time percentiles, becomes of paramount importance for ensuring business advantages. In this thesis, we explore service provisioning in clouds from multiple points of view. The aim is to best provide service replicas in the form of VMs to various service applications, such that their tail throughput and tail response times, as well as resource utilization, meet the service level agreements in the most cost-effective manner. In particular, we develop models, algorithms and replication strategies that consider multi-tier composed services provisioned in clouds. We also investigate how a service provider can opportunistically take advantage of observed performance variability in the cloud. Finally, we provide means of guaranteeing tail throughput and response times in the face of performance variability of VMs, using Markov chain modeling and large deviation theory. We employ methods from analytical modeling, event-driven simulations and experiments. Overall, this thesis provides not only a multi-faceted approach to exploring several crucial aspects of hosting services in clouds, i.e., cost, tail throughput, and tail response times, but our proposed resource management strategies are also rigorously validated via trace-driven simulation and extensive experiments.

    Queueing Systems with Heavy Tails
