885 research outputs found

    Modulated Branching Processes, Origins of Power Laws and Queueing Duality

    Power law distributions have been repeatedly observed in a wide variety of socioeconomic, biological and technological areas. In many of the observations, e.g., city populations and sizes of living organisms, the objects of interest evolve due to the replication of their many independent components, e.g., births-deaths of individuals and replications of cells. Furthermore, the rates of the replication are often controlled by exogenous parameters causing periods of expansion and contraction, e.g., baby booms and busts, economic booms and recessions, etc. In addition, the sizes of these objects often have reflective lower boundaries, e.g., cities do not fall below a certain size, low income individuals are subsidized by the government, companies are protected by bankruptcy laws, etc. Hence, it is natural to propose reflected modulated branching processes as generic models for many of the preceding observations. Indeed, our main results show that the proposed mathematical models result in power law distributions under quite general polynomial Gärtner-Ellis conditions, the generality of which could explain the ubiquitous nature of power law distributions. In addition, on a logarithmic scale, we establish an asymptotic equivalence between the reflected branching processes and the corresponding multiplicative ones. The latter, as recognized by Goldie (1991), is known to be dual to queueing/additive processes. We emphasize this duality further in the generality of stationary and ergodic processes. Comment: 36 pages, 2 figures; added references; a new theorem in Subsection 4.
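The duality mentioned in this abstract can be illustrated with a much simpler setting than the modulated branching processes of the paper. The sketch below assumes i.i.d. lognormal multipliers (an assumption made here for illustration, not the paper's model): a multiplicative process reflected at 1 is, on a logarithmic scale, a random walk reflected at 0, i.e. a Lindley-type queueing recursion. In this i.i.d. case the power law exponent is the positive root alpha of E[A^alpha] = 1, which has a closed form for lognormal A. The function names `cramer_exponent` and `simulate_reflected` are hypothetical.

```python
import math
import random


def cramer_exponent(mu: float, sigma: float) -> float:
    """Positive root alpha of E[A^alpha] = 1 when log A ~ Normal(mu, sigma^2).

    E[A^alpha] = exp(alpha*mu + alpha^2*sigma^2/2), so alpha = -2*mu/sigma^2.
    Requires negative drift (mu < 0) so that a stationary regime exists.
    """
    if mu >= 0 or sigma <= 0:
        raise ValueError("need mu < 0 and sigma > 0")
    return -2.0 * mu / sigma**2


def simulate_reflected(mu: float, sigma: float, steps: int, seed: int = 0) -> list:
    """Simulate sizes M_{n+1} = max(A_n * M_n, 1) via their logarithms.

    On the log scale this is the additive recursion
    X_{n+1} = max(X_n + log A_n, 0), the same form as a queue's waiting time.
    """
    rng = random.Random(seed)
    log_size = 0.0
    path = []
    for _ in range(steps):
        log_size = max(log_size + rng.gauss(mu, sigma), 0.0)
        path.append(math.exp(log_size))
    return path


if __name__ == "__main__":
    alpha = cramer_exponent(-0.5, math.sqrt(0.5))
    sizes = simulate_reflected(-0.5, math.sqrt(0.5), 100_000)
    print("predicted tail exponent:", alpha)
    print("largest simulated size:", max(sizes))
```

Comparing the empirical tail of `sizes` on a log-log plot against the predicted exponent is one way to see the power law emerge from the reflected multiplicative dynamics.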

    Large deviations analysis for the $M/H_2/n + M$ queue in the Halfin-Whitt regime

    We consider the FCFS $M/H_2/n + M$ queue in the Halfin-Whitt heavy traffic regime. It is known that the normalized sequence of steady-state queue length distributions is tight and converges weakly to a limiting random variable W. However, those works only describe W implicitly as the invariant measure of a complicated diffusion. Although it was proven by Gamarnik and Stolyar that the tail of W is sub-Gaussian, the actual value of $\lim_{x \rightarrow \infty} x^{-2}\log(P(W > x))$ was left open. In subsequent work, Dai and He conjectured an explicit form for this exponent, which was insensitive to the higher moments of the service distribution. We explicitly compute the true large deviations exponent for W when the abandonment rate is less than the minimum service rate, the first such result for non-Markovian queues with abandonments. Interestingly, our results resolve the conjecture of Dai and He in the negative. Our main approach is to extend the stochastic comparison framework of Gamarnik and Goldberg to the setting of abandonments, requiring several novel and non-trivial contributions. Our approach sheds light on several novel ways to think about multi-server queues with abandonments in the Halfin-Whitt regime, which should hold in considerable generality and provide new tools for analyzing these systems.

    Transient laws of non-stationary queueing systems and their applications

    Cover title. Includes bibliographical references (p. 37-39). Supported in part by a Presidential Young Investigator Award, with matching funds from Draper Laboratory. DDM-9158118. D. Bertsimas and G. Mourtizinou

    Concentration of measure and mixing for Markov chains

    We consider Markovian models on graphs with local dynamics. We show that, under suitable conditions, such Markov chains exhibit both rapid convergence to equilibrium and strong concentration of measure in the stationary distribution. We illustrate our results with applications to some known chains from computer science and statistical mechanics. Comment: 28 pages

    Unreliable Retrial Queues in a Random Environment

    This dissertation investigates stability conditions and approximate steady-state performance measures for unreliable, single-server retrial queues operating in a randomly evolving environment. In such systems, arriving customers that find the server busy or failed join a retrial queue from which they attempt to regain access to the server at random intervals. Such models are useful for the performance evaluation of communications and computer networks which are characterized by time-varying arrival, service and failure rates. To model this time-varying behavior, we study systems whose parameters are modulated by a finite Markov process. Two distinct cases are analyzed. The first considers systems with Markov-modulated arrival, service, retrial, failure and repair rates assuming all interevent and service times are exponentially distributed. The joint process of the orbit size, environment state, and server status is shown to be a tri-layered, level-dependent quasi-birth-and-death (LDQBD) process, and we provide a necessary and sufficient condition for the positive recurrence of LDQBDs using classical techniques. Moreover, we apply efficient numerical algorithms, designed to exploit the matrix-geometric structure of the model, to compute the approximate steady-state orbit size distribution and mean congestion and delay measures. The second case assumes that customers bring generally distributed service requirements while all other processes are identical to the first case. We show that the joint process of orbit size, environment state and server status is a level-dependent, M/G/1-type stochastic process. By employing regenerative theory, and exploiting the M/G/1-type structure, we derive a necessary and sufficient condition for stability of the system. Finally, for the exponential model, we illustrate how the main results may be used to simultaneously select mean time customers spend in orbit, subject to bound and stability constraints

    Solution methods for controlled queueing networks

    In this dissertation we look at a controlled queueing network where a controller routes the incoming arrivals to parallel queues using state-dependent rules. Besides this general arrival there are dedicated arrivals to each queue. The dedicated arrivals can only be served by their designated server, hence there is no routing decision involved. The goal of the controller is to find a stationary policy that will minimize the average number of customers in the system. The problem is modeled as a semi-Markov decision process and solved using techniques from the theory of Markov decision processes. We develop an efficient policy iteration based methodology which performs better than the value iteration method which is widely thought of as the best method to use for large-scale problems. The novelty in our approach is to use iterative methods in solving the system of linear equations, and also take advantage of the sparsity of matrices. The methodology could be used for other problems that are similar in nature. Using this methodology we solve much larger problems than reported in the literature. We also look at how several heuristic methods perform on our problem. No heuristic method is suitable to use for all instances. In general, however, these heuristic methods offer quick and reasonable solutions to very large problems.
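The idea of combining policy iteration with an iterative linear solver and sparse transition data can be sketched on a toy problem. The code below is not the dissertation's method: it assumes a discounted-cost MDP rather than the average-cost semi-Markov formulation described above, uses Gauss-Seidel sweeps as one possible iterative solver, and stores each transition row as a dictionary of nonzero entries. The names `evaluate_policy` and `policy_iteration` and the two-state example are hypothetical.

```python
def evaluate_policy(P, c, policy, gamma, tol=1e-10, max_sweeps=10_000):
    """Approximately solve v = c_pi + gamma * P_pi v with Gauss-Seidel sweeps.

    P[s][a] is a dict {next_state: probability} holding only nonzero entries,
    so each update touches just the successors of state s.
    """
    v = [0.0] * len(policy)
    for _ in range(max_sweeps):
        delta = 0.0
        for s, a in enumerate(policy):
            new = c[s][a] + gamma * sum(p * v[t] for t, p in P[s][a].items())
            delta = max(delta, abs(new - v[s]))
            v[s] = new  # updated value is reused immediately within the sweep
        if delta < tol:
            break
    return v


def policy_iteration(P, c, gamma=0.9):
    """Alternate iterative evaluation and greedy improvement until stable."""
    policy = [0] * len(P)
    while True:
        v = evaluate_policy(P, c, policy, gamma)
        improved = list(policy)
        for s in range(len(P)):
            def q(a):
                return c[s][a] + gamma * sum(p * v[t] for t, p in P[s][a].items())
            for a in range(len(P[s])):
                # Switch only on a clear improvement to avoid cycling on ties.
                if q(a) < q(improved[s]) - 1e-9:
                    improved[s] = a
        if improved == policy:
            return policy, v
        policy = improved


# Toy example: state 0 is costly to stay in, state 1 is free.
P = [[{0: 1.0}, {1: 1.0}], [{1: 1.0}, {0: 1.0}]]
c = [[1.0, 2.0], [0.0, 0.0]]

if __name__ == "__main__":
    print(policy_iteration(P, c))
```

In the toy example, paying a one-time cost of 2 to move from state 0 to state 1 beats paying 1 per step forever, so the stable policy moves out of state 0 and stays in state 1.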

    Dynamical Modeling of Cloud Applications for Runtime Performance Management

    Cloud computing has quickly grown to become an essential component in many modern-day software applications. It allows consumers, such as a provider of some web service, to quickly and on demand obtain the necessary computational resources to run their applications. It is desirable for these service providers to keep the running cost of their cloud application low while adhering to various performance constraints. This is made difficult due to the dynamics imposed by, e.g., resource contentions or changing arrival rate of users, and the fact that there exist multiple ways of influencing the performance of a running cloud application. To facilitate decision making in this environment, performance models can be introduced that relate the workload and different actions to important performance metrics. In this thesis, such performance models of cloud applications are studied. In particular, we focus on modeling using queueing theory and on the fluid model for approximating the often intractable dynamics of the queue lengths. First, existing results on how the fluid model can be obtained from the mean-field approximation of a closed queueing network are simplified and extended to allow for mixed networks. The queues are allowed to follow the processor sharing or delay disciplines, and can have multiple classes with phase-type service times. An improvement to this fluid model is then presented to increase accuracy when the system size, i.e., number of servers, initial population, and arrival rate, is small. Furthermore, a closed-form approximation of the response time CDF is presented. The methods are tested in a series of simulation experiments and shown to be accurate. This mean-field fluid model is then used to derive a general fluid model for microservices with interservice delays. The model is shown to be completely extractable at runtime in a distributed fashion. It is further evaluated on a simple microservice application and found to accurately predict important performance metrics in most cases. Furthermore, a method is devised to reduce the cost of a running application by tuning load balancing parameters between replicas. The method is built on gradient stepping by applying automatic differentiation to the fluid model. This allows for arbitrarily defined cost functions and constraints, most notably including different response time percentiles. The method is tested on a simple application distributed over multiple computing clusters and is shown to reduce costs while adhering to percentile constraints. Finally, modeling of request cloning is studied using the novel concept of synchronized service. This allows certain forms of cloning over servers, each modeled with a single queue, to be equivalently expressed as one single queue. The concept is very general regarding the involved queueing discipline and distributions, but instead introduces new, less realistic assumptions. How the equivalent queue model is affected by relaxing these assumptions is studied considering the processor sharing discipline, and an extension to enable modeling of speculative execution is made. In a simulation campaign, it is shown that these relaxations only have a minor effect in certain cases.