99 research outputs found

    Load Balancing in the Non-Degenerate Slowdown Regime

    We analyse Join-the-Shortest-Queue in a contemporary scaling regime known as the Non-Degenerate Slowdown regime. Join-the-Shortest-Queue (JSQ) is a classical load balancing policy for queueing systems with multiple parallel servers. Parallel server queueing systems are regularly analysed and dimensioned by diffusion approximations achieved in the Halfin-Whitt scaling regime. However, when jobs must be dispatched to a server upon arrival, we advocate the Non-Degenerate Slowdown regime (NDS) to compare different load-balancing rules. In this paper we identify a novel diffusion approximation and timescale separation that provide insights into the performance of JSQ. We calculate the price of irrevocably dispatching jobs to servers and prove this to be within 15% (in the NDS regime) of the rules that may manoeuvre jobs between servers. We also compare our results for the JSQ policy with the NDS approximations of many modern load balancing policies, such as Idle-Queue-First and Power-of-d-choices, which act as low-information proxies for the JSQ policy. Our analysis leads us to construct new rules that have identical performance to JSQ but require less communication overhead than power-of-2-choices. Comment: Revised journal submission version

    Stability of JSQ in queues with general server-job class compatibilities

    We consider Poisson streams of exponentially distributed jobs arriving at each edge of a hypergraph of queues. Upon arrival, an incoming job is routed to the shortest queue among the corresponding vertices. This generalizes many known models such as power-of-d load balancing and JSQ (join the shortest queue) on generic graphs. We prove that stability in this model is achieved if and only if there exists a stable static routing policy. This stability condition is equivalent to that of the JSW (join the shortest workload) policy. We show that some graph topologies lead to a loss of capacity, implying more restrictive stability conditions than in, for example, complete graphs. Affiliations: Cruise, James (Heriot-Watt University, United Kingdom); Jonckheere, Matthieu Thimothy Samson (Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Instituto de Cálculo, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina); Shneer, Seva (Heriot-Watt University, United Kingdom).
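The routing rule described in this abstract can be illustrated with a minimal sketch. The function name `route_to_shortest`, the list-of-queue-lengths representation, and the lowest-index tie-break are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def route_to_shortest(queue_lengths: Sequence[int], compatible: Sequence[int]) -> int:
    """Return the index of the shortest queue among the compatible ones.

    `compatible` plays the role of a hyperedge: the set of queues (vertices)
    allowed to serve the arriving job. Ties are broken by lowest index,
    which is an illustrative assumption.
    """
    if not compatible:
        raise ValueError("a job must be compatible with at least one queue")
    return min(compatible, key=lambda i: (queue_lengths[i], i))


queues = [3, 1, 2, 1]

# Classical JSQ: the job is compatible with every queue.
jsq_choice = route_to_shortest(queues, range(len(queues)))

# Restricted compatibility: only queues 0 and 2 may serve the job.
restricted_choice = route_to_shortest(queues, [0, 2])
queues[restricted_choice] += 1  # the job joins the chosen queue
```

Passing a random sample of d indices as `compatible` would give a power-of-d style dispatch, which is one of the special cases the abstract mentions.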

    Self-Learning Threshold-Based Load Balancing

    We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher aims at maximizing the overall quality-of-service by balancing the load through a simple threshold policy. We demonstrate that such a policy is optimal on the fluid and diffusion scales, while only involving a small communication overhead, which is crucial for large-scale deployments. In order to set the threshold optimally, it is important, however, to learn the load of the system, which may be unknown. For that purpose, we design a control rule for tuning the threshold in an online manner. We derive conditions which guarantee that this adaptive threshold settles at the optimal value, along with estimates for the time until this happens. In addition, we provide numerical experiments which support the theoretical results and further indicate that our policy copes effectively with time-varying demand patterns. Comment: 51 pages, 6 figures

    Flow-level performance analysis of data networks using processor sharing models

    Most telecommunication systems are dynamic in nature. The state of the network changes constantly as new transmissions appear and depart. In order to capture the behavior of such systems and to realistically evaluate their performance, it is essential to use dynamic models in the analysis. In this thesis, we model and analyze networks carrying elastic data traffic at flow level using stochastic queueing systems. We develop performance analysis methodology, as well as model and analyze example systems. The exact analysis of stochastic models is difficult and usually becomes computationally intractable when the size of the network increases, and hence efficient approximative methods are needed. In this thesis, we use two performance approximation methods. Value extrapolation is a novel approximative method developed during this work and based on the theory of Markov decision processes. It can be used to approximate the performance measures of Markov processes. When applied to queueing systems, value extrapolation makes possible heavy state space truncation while providing accurate results without significant computational penalties. Balanced fairness is a capacity allocation scheme recently introduced by Bonald and Proutière that simplifies performance analysis and requires less restrictive assumptions about the traffic than other capacity allocation schemes. We introduce an approximation method based on balanced fairness and the Monte Carlo method for evaluating large sums that can be used to estimate the performance of systems of moderate size with low or medium loads. The performance analysis methods are applied in two settings: load balancing in fixed networks and the analysis of wireless networks. The aim of load balancing is to divide the traffic load efficiently between the network resources in order to improve the performance. 
On the basis of the insensitivity results of Bonald and Proutière, we study both packet- and flow-level balancing in fixed data networks. We also study load balancing between multiple parallel discriminatory processor sharing queues and compare different balancing policies. In the final part of the thesis, we analyze the performance of wireless networks carrying elastic data traffic. Wireless networks are gaining more and more popularity, as their advantages, such as easier deployment and mobility, outweigh their downsides. First, we discuss a simple cellular network with link adaptation consisting of two base stations and customers located on a line between them. We model the system and analyze the performance using different capacity allocation policies. Wireless multihop networks are analyzed using two different MAC schemes. On the basis of earlier work by Penttinen et al., we analyze the performance of networks using the STDMA MAC protocol. We also study multihop networks with random access, assuming that the transmission probabilities can be adapted upon flow arrivals and departures. We compare the throughput behavior of flow-optimized random access against the throughput obtained by optimal scheduling assuming balanced fairness capacity allocation.

    On Occupancy Based Randomized Load Balancing for Large Systems with General Distributions

    Multi-server architectures are ubiquitous in today's information infrastructure whether for supporting cloud services, web servers, or for distributed storage. The performance of multi-server systems is highly dependent on the load distribution. This is affected by the use of load balancing strategies. Since both latency and blocking are important features, it is most reasonable to route an incoming job to a server that is lightly loaded. Hence a good load balancing policy should be dependent on the states of servers. Since obtaining information about the remaining workload of servers for every arrival is very hard, it is preferable to design load balancing policies that depend on occupancy or the number of progressing jobs of servers. Furthermore, if the system has a large number of servers, it is not practical to use the occupancy information of all the servers to dispatch or route an arrival due to high communication cost. In large-scale systems that have tens of thousands of servers, the policies which use the occupancy information of only a finite number of randomly selected servers to dispatch an arrival result in lower implementation cost than the policies which use the occupancy information of all the servers. Such policies are referred to as occupancy based randomized load balancing policies. Motivated by cloud computing systems and web-server farms, we study two types of models. In the first model, each server is an Erlang loss server, and this model is an abstraction of Infrastructure-as-a-Service (IaaS) clouds. The second model we consider is one with processor sharing servers that is an abstraction of web-server farms which serve requests in a round-robin manner with small time granularity. The performance criterion for web-servers is the response time or the latency for the request to be processed. 
In most prior works, the analysis of these models was restricted to the case of exponential job length distributions and in this dissertation we study the case of general job length distributions. To analyze the impact of a load balancing policy, we need to develop models for the system's dynamics. In this dissertation, we show that one can construct useful Markovian models. For occupancy based randomized routing policies, due to complex inter-dependencies between servers, an exact analysis is mostly intractable. However, we show that the multi-server systems that have an occupancy based randomized load balancing policy are examples of weakly interacting particle systems. In these systems, servers are interacting particles whose states lie in an uncountable state space. We develop a mean-field analysis to understand a server's behavior as the number of servers becomes large. We show that under certain assumptions, as the number of servers increases, the sequence of empirical measure-valued Markov processes which model the systems' dynamics converges to a deterministic measure-valued process referred to as the mean-field limit. We observe that the mean-field equations correspond to the dynamics of the distribution of a non-linear Markov process. A consequence of having the mean-field limit is that under minor and natural assumptions on the initial states of servers, any finite set of servers can be shown to be independent of each other as the number of servers goes to infinity. Furthermore, the mean-field limit approximates each server's distribution in the transient regime when the number of servers is large. A salient feature of loss and processor sharing systems in the setting where their time evolution can be modeled by reversible Markov processes is that their stationary occupancy distribution is insensitive to the type of job length distribution; it depends only on the average job length but not on the type of the distribution. 
This property does not hold when the number of servers is finite in our context due to lack of reversibility. We show however that the fixed-point of the mean-field is insensitive to the job length distributions for all occupancy based randomized load balancing policies when the fixed-point is unique for job lengths that have exponential distributions. We also provide some deeper insights into the relationship between the mean-field and the distributions of servers and the empirical measure in the stationary regime. Finally, we address the accuracy of mean-field approximations in the case of loss models. To do so we establish a functional central limit theorem under the assumption that the job lengths have exponential distributions. We show that a suitably scaled fluctuation of the stochastic empirical process around the mean-field converges to an Ornstein-Uhlenbeck process. Our analysis is also valid for the Halfin-Whitt regime in which servers are critically loaded. We then exploit the functional central limit theorem to quantify the error between the actual blocking probability of the system with a large number of servers and the blocking probability obtained from the fixed-point of the mean-field. In the Halfin-Whitt regime, the error is of the order of the inverse square root of the number of servers. On the other hand, for a light load regime, the error is smaller than the inverse square root of the number of servers.
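As background for the blocking probabilities discussed in this abstract, the sketch below computes the classical Erlang B formula for a single, isolated loss server pool fed by Poisson arrivals. This is the textbook setting in which blocking depends on the job length distribution only through its mean. It is not the mean-field fixed point studied in the dissertation, and the name `erlang_b` and its parameters are assumptions chosen for illustration.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Blocking probability of an isolated loss system with `servers` slots.

    `offered_load` is the arrival rate multiplied by the mean job length
    (in Erlangs). Uses the numerically stable recursion
    B(0) = 1,  B(n) = a * B(n-1) / (n + a * B(n-1)).
    """
    if servers < 0 or offered_load < 0:
        raise ValueError("servers and offered_load must be non-negative")
    blocking = 1.0  # with zero slots every arriving job is blocked
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    return blocking


# Example: two slots and an offered load of one Erlang.
example_blocking = erlang_b(2, 1.0)
```

The recursion avoids the large factorials and powers that appear in the closed-form expression, which matters when the number of slots is large.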

    EUROPEAN CONFERENCE ON QUEUEING THEORY 2016

    This booklet contains the proceedings of the second European Conference on Queueing Theory (ECQT) that was held from the 18th to the 20th of July 2016 at the engineering school ENSEEIHT, Toulouse, France. ECQT is a biennial event where scientists and technicians in queueing theory and related areas get together to promote research, encourage interaction and exchange ideas. The spirit of the conference is to be a queueing event organized from within Europe, but open to participants from all over the world. The technical program of the 2016 edition consisted of 112 presentations organized in 29 sessions covering all trends in queueing theory, including the development of the theory, methodology advances, computational aspects and applications. Another exciting feature of ECQT2016 was the institution of the Takács Award for an outstanding PhD thesis on "Queueing Theory and its Applications".

    Scalable Load Balancing Algorithms in Networked Systems

    A fundamental challenge in large-scale networked systems, viz., data centers and cloud networks, is to distribute tasks to a pool of servers, using minimal instantaneous state information, while providing excellent delay performance. In this thesis we design and analyze load balancing algorithms that aim to achieve a highly efficient distribution of tasks, optimize server utilization, and minimize communication overhead. Comment: Ph.D. thesis

    Quantile Approximation of the Erlang Distribution using Differential Evolution Algorithm

    The Erlang distribution is a particular case of the gamma distribution and is often used in modeling queues, traffic congestion in wireless sensor networks, cell residence duration and finding the optimal queueing model to reduce the probability of blocking. Its application is limited by the lack of a closed-form expression for the quantile (inverse cumulative distribution) function of the distribution. The problem is primarily tackled using approximation, since the inversion method cannot be applied. This paper extends a six-parameter quantile model, earlier proposed for the Nakagami distribution, to the Erlang distribution. Consequently, the established relationship between the two distributions is now extended to their quantile functions. The quantile model was used to fit the machine (R software) values with their corresponding quartiles in two ways. Firstly, an artificial neural network (ANN) was used to establish that a curve fitting can be achieved. Lastly, a differential evolution (DE) algorithm was used to minimize the errors obtained from the curve fitting and hence estimate the values of the six parameters of the quantile model that will ensure the best possible fit, for different values of the parameters that characterize the Erlang distribution. Hence, the problem is one of constrained optimization, and the DE algorithm was able to find the different values of the parameters of the quantile model. The simulation result corroborates the theoretical findings. The work is a welcome result in the quest for a universal quantile model that can be applied to different distributions.
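To make the missing closed-form quantile concrete, the sketch below inverts the Erlang cumulative distribution function numerically by bisection. This is a generic root-finding approach, not the six-parameter quantile model or the differential evolution fit described in the abstract. The names `erlang_cdf` and `erlang_quantile`, the integer `shape` and positive `rate` parameterization, and the tolerance are all assumptions for illustration.

```python
import math


def erlang_cdf(x: float, shape: int, rate: float) -> float:
    """CDF of the Erlang distribution with integer shape and positive rate."""
    if x <= 0:
        return 0.0
    term = 1.0   # (rate * x)^0 / 0!
    total = 1.0  # running sum of (rate * x)^n / n! for n < shape
    for n in range(1, shape):
        term *= rate * x / n
        total += term
    return 1.0 - math.exp(-rate * x) * total


def erlang_quantile(p: float, shape: int, rate: float, tol: float = 1e-12) -> float:
    """Approximate the p-quantile by bisection on the CDF."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    low, high = 0.0, 1.0
    while erlang_cdf(high, shape, rate) < p:  # grow the bracket until it contains the quantile
        high *= 2.0
    while high - low > tol * max(1.0, high):
        mid = 0.5 * (low + high)
        if erlang_cdf(mid, shape, rate) < p:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# With shape 1 the Erlang distribution is exponential, so the median is ln(2) / rate.
median_exponential = erlang_quantile(0.5, shape=1, rate=1.0)
```

Bisection is slow compared with a fitted closed-form approximation, which is the practical motivation for quantile models of the kind the abstract describes.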