1,988 research outputs found

    Randomized Assignment of Jobs to Servers in Heterogeneous Clusters of Shared Servers for Low Delay

    We consider the job assignment problem in a multi-server system consisting of N parallel processor sharing servers, categorized into M (≪ N) different types according to their processing capacity or speed. Jobs of random sizes arrive at the system according to a Poisson process with rate Nλ. Upon each arrival, a small number of servers from each type is sampled uniformly at random. The job is then assigned to one of the sampled servers based on a selection rule. We propose two schemes, each corresponding to a specific selection rule that aims at reducing the mean sojourn time of jobs in the system. We first show that both methods achieve the maximal stability region. We then analyze the system operating under the proposed schemes as N→∞, which corresponds to the mean field. Our results show that asymptotic independence among servers holds even when M is finite and exchangeability holds only within servers of the same type. We further establish the existence and uniqueness of the stationary solution of the mean field and show that the tail distribution of server occupancy decays doubly exponentially for each server type. When the estimates of arrival rates are not available, the proposed schemes offer simpler alternatives for achieving lower mean sojourn time of jobs, as shown by our numerical studies.

    The mean-field behavior of processor sharing systems with general job lengths under the SQ(d) policy

    This paper addresses the mean-field behavior of large-scale systems of parallel servers with a processor sharing service discipline when arrivals are Poisson, jobs have general service time distributions, and an SQ(d) routing policy is used. Under this policy, an arrival is routed to the server with the least number of progressing jobs among d randomly chosen servers. The limit of the empirical distribution is then used to study the statistical properties of the system. In particular, this shows that in the limit as the number of servers grows, individual servers are statistically independent of others (propagation of chaos) and, more importantly, the equilibrium point of the mean-field is insensitive to the job length distributions, which has important engineering relevance for the robustness of such routing policies used in web server farms. We use a framework of measure-valued processes and martingale techniques to obtain our results. We also provide numerical results to support our analysis.
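    The SQ(d) routing rule described in this abstract can be sketched in a few lines. This is only an illustration: the function name `sq_d_route`, the list-based occupancy state, and sampling without replacement are assumptions, not details taken from the paper.

```python
import random


def sq_d_route(occupancy, d, rng=random):
    """Return the index of the server chosen for one arrival under SQ(d).

    occupancy[i] is the number of jobs currently in progress at server i.
    """
    # Sample d distinct servers uniformly at random. Sampling without
    # replacement is an assumption; the abstract does not specify it.
    sampled = rng.sample(range(len(occupancy)), d)
    # Route to the sampled server with the fewest jobs in progress.
    # Ties go to whichever tied server was sampled first.
    return min(sampled, key=lambda i: occupancy[i])


# Usage: route one arrival with d = 2 and update the state.
occupancy = [3, 0, 2, 5, 1]
chosen = sq_d_route(occupancy, d=2, rng=random.Random(0))
occupancy[chosen] += 1
```

    Only d occupancy values are read per arrival, which is the source of the low communication cost compared with inspecting every server.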

    Mean field and propagation of chaos in multi-class heterogeneous loss models

    We consider a system consisting of parallel servers, where jobs with different resource requirements arrive and are assigned to the servers for processing. Each server has a finite resource capacity and therefore can serve only a finite number of jobs at a time. We assume that different servers have different resource capacities. A job is accepted for processing only if the resource requested by the job is available at the server to which it is assigned. Otherwise, the job is discarded or blocked. We consider randomized schemes to assign jobs to servers with the aim of reducing the average blocking probability of jobs in the system. In particular, we consider a scheme that assigns an incoming job to the server having maximum available vacancy or unused resource among randomly sampled servers. We consider the system in the limit where both the number of servers and the arrival rates of jobs are scaled by a large factor. This gives rise to a mean field analysis. We show that in the limiting system the servers behave independently, a property known as propagation of chaos. Stationary tail probabilities of server occupancies are obtained from the stationary solution of the mean field, which is shown to be unique and globally attractive. We further characterize the rate of decay of the stationary tail probabilities. Numerical results suggest that the proposed scheme significantly reduces the average blocking probability of jobs as compared to static schemes that probabilistically route jobs to servers independently of their states.

    On Occupancy Based Randomized Load Balancing for Large Systems with General Distributions

    Multi-server architectures are ubiquitous in today's information infrastructure whether for supporting cloud services, web servers, or for distributed storage. The performance of multi-server systems is highly dependent on the load distribution. This is affected by the use of load balancing strategies. Since both latency and blocking are important features, it is most reasonable to route an incoming job to a server that is lightly loaded. Hence a good load balancing policy should be dependent on the states of servers. Since obtaining information about the remaining workload of servers for every arrival is very hard, it is preferable to design load balancing policies that depend on occupancy or the number of progressing jobs of servers. Furthermore, if the system has a large number of servers, it is not practical to use the occupancy information of all the servers to dispatch or route an arrival due to high communication cost. In large-scale systems that have tens of thousands of servers, the policies which use the occupancy information of only a finite number of randomly selected servers to dispatch an arrival result in lower implementation cost than the policies which use the occupancy information of all the servers. Such policies are referred to as occupancy based randomized load balancing policies. Motivated by cloud computing systems and web-server farms, we study two types of models. In the first model, each server is an Erlang loss server, and this model is an abstraction of Infrastructure-as-a-Service (IaaS) clouds. The second model we consider is one with processor sharing servers that is an abstraction of web-server farms which serve requests in a round-robin manner with small time granularity. The performance criterion for web-servers is the response time or the latency for the request to be processed. 
In most prior works, the analysis of these models was restricted to the case of exponential job length distributions and in this dissertation we study the case of general job length distributions. To analyze the impact of a load balancing policy, we need to develop models for the system's dynamics. In this dissertation, we show that one can construct useful Markovian models. For occupancy based randomized routing policies, due to complex inter-dependencies between servers, an exact analysis is mostly intractable. However, we show that the multi-server systems that have an occupancy based randomized load balancing policy are examples of weakly interacting particle systems. In these systems, servers are interacting particles whose states lie in an uncountable state space. We develop a mean-field analysis to understand a server's behavior as the number of servers becomes large. We show that under certain assumptions, as the number of servers increases, the sequence of empirical measure-valued Markov processes which model the systems' dynamics converges to a deterministic measure-valued process referred to as the mean-field limit. We observe that the mean-field equations correspond to the dynamics of the distribution of a non-linear Markov process. A consequence of having the mean-field limit is that under minor and natural assumptions on the initial states of servers, any finite set of servers can be shown to be independent of each other as the number of servers goes to infinity. Furthermore, the mean-field limit approximates each server's distribution in the transient regime when the number of servers is large. A salient feature of loss and processor sharing systems in the setting where their time evolution can be modeled by reversible Markov processes is that their stationary occupancy distribution is insensitive to the type of job length distribution; it depends only on the average job length but not on the type of the distribution. 
This property does not hold when the number of servers is finite in our context due to lack of reversibility. We show, however, that the fixed-point of the mean-field is insensitive to the job length distributions for all occupancy based randomized load balancing policies when the fixed-point is unique for job lengths that have exponential distributions. We also provide some deeper insights into the relationship between the mean-field and the distributions of servers and the empirical measure in the stationary regime. Finally, we address the accuracy of mean-field approximations in the case of loss models. To do so we establish a functional central limit theorem under the assumption that the job lengths have exponential distributions. We show that a suitably scaled fluctuation of the stochastic empirical process around the mean-field converges to an Ornstein-Uhlenbeck process. Our analysis is also valid for the Halfin-Whitt regime in which servers are critically loaded. We then exploit the functional central limit theorem to quantify the error between the actual blocking probability of the system with a large number of servers and the blocking probability obtained from the fixed-point of the mean-field. In the Halfin-Whitt regime, the error is of the order of the inverse square root of the number of servers. On the other hand, for a light load regime, the error is smaller than the inverse square root of the number of servers.

    Load Balancing in the Non-Degenerate Slowdown Regime

    We analyse Join-the-Shortest-Queue in a contemporary scaling regime known as the Non-Degenerate Slowdown regime. Join-the-Shortest-Queue (JSQ) is a classical load balancing policy for queueing systems with multiple parallel servers. Parallel server queueing systems are regularly analysed and dimensioned by diffusion approximations achieved in the Halfin-Whitt scaling regime. However, when jobs must be dispatched to a server upon arrival, we advocate the Non-Degenerate Slowdown regime (NDS) to compare different load-balancing rules. In this paper we identify a novel diffusion approximation and timescale separation that provide insights into the performance of JSQ. We calculate the price of irrevocably dispatching jobs to servers and prove this to be within 15% (in the NDS regime) of the rules that may manoeuvre jobs between servers. We also compare our results for the JSQ policy with the NDS approximations of many modern load balancing policies such as Idle-Queue-First and Power-of-d-choices policies, which act as low information proxies for the JSQ policy. Our analysis leads us to construct new rules that have identical performance to JSQ but require less communication overhead than power-of-2-choices. Comment: Revised journal submission version

    Choosing among heterogeneous server clouds

    This paper considers a model of interest in cloud computing applications. We consider a multiserver system consisting of N heterogeneous servers. The servers are categorized into M (≪ N) different types according to their service capabilities. Jobs having specific resource requirements arrive at the system according to a Poisson process with rate Nλ. Upon each arrival, a small number of servers are sampled uniformly at random from each server type. The job is then routed to the sampled server with maximum vacancy per server capacity. If a job cannot obtain the required amount of resources from the server to which it is assigned, then the job is discarded. We analyze the system in the limit as N→∞. This gives rise to a mean field, which we show has a unique fixed point and is globally attractive. Furthermore, as N→∞, the servers behave independently. The stationary tail probabilities of server occupancies are obtained from the stationary solution of the mean field. Numerical results suggest that the proposed scheme significantly reduces the average blocking probability compared to static schemes that probabilistically route jobs to servers in proportion to the number of servers of each type. Moreover, the reduction in blocking holds even for systems at high load. For the limiting system in statistical equilibrium, our simulation results indicate that the occupancy distribution is insensitive to the holding time distribution and only depends on its mean.
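    The routing step in this abstract (sample a few servers of each type, pick the one with the largest vacancy per unit capacity, discard the job if it does not fit) might look roughly as follows. The function name, the parameter `k` for the number of servers sampled per type, and the data layout are hypothetical choices made for illustration; the paper may define these details differently.

```python
import random


def route_max_relative_vacancy(capacity, used, types, demand, k=1, rng=random):
    """Route one job; return the chosen server index, or None if blocked.

    capacity[i], used[i]: total and occupied resource units of server i.
    types[m]: list of indices of the servers of type m.
    demand: resource units requested by the arriving job.
    k: number of servers sampled per type (hypothetical parameter).
    """
    sampled = []
    for members in types:
        # Uniform sampling within each type, without replacement (assumed).
        sampled.extend(rng.sample(members, min(k, len(members))))
    # Choose the sampled server with the largest vacancy per unit capacity.
    best = max(sampled, key=lambda i: (capacity[i] - used[i]) / capacity[i])
    if capacity[best] - used[best] < demand:
        return None  # not enough free resource: the job is discarded
    used[best] += demand
    return best


# Usage: two server types with capacities 10 and 4.
capacity = [10, 10, 4, 4]
used = [9, 5, 1, 3]
types = [[0, 1], [2, 3]]
server = route_max_relative_vacancy(capacity, used, types, demand=2, k=2)
```

    Normalizing vacancy by capacity keeps small servers from being ignored simply because their absolute free space is lower.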

    Analysis of randomized join-the-shortest-queue (JSQ) schemes in large heterogeneous processor-sharing systems

    In this paper, we investigate the stability and performance of randomized dynamic routing schemes for jobs based on the Join-the-Shortest-Queue (JSQ) criterion in a heterogeneous system of many parallel servers. In particular, we consider servers that use processor sharing but with different server rates, and jobs are routed to the server with the smallest occupancy among a finite number of randomly sampled servers. We focus on the case of two sampled servers, often referred to as the Power-of-Two scheme. We first show that in the heterogeneous setting, uniform sampling of servers can cause a loss in the stability region, and thus such randomized dynamic schemes need not outperform static randomized schemes in terms of mean delay. This is in contrast to the homogeneous case of equal server speeds, where the stability region is maximal and coincides with that of static randomized routing. We explicitly characterize the stationary distributions of the server occupancies and show that the tail distribution of the server occupancy has a super-exponential behavior, as in the homogeneous case, as the number of servers goes to infinity. To overcome the stability issue, we show that it is possible to combine the static state-independent scheme with a randomized JSQ scheme, which recovers the maximal stability region together with the benefits of JSQ; such a scheme is preferable in terms of average delay. The techniques are based on a mean field analysis where we show that the stationary distributions coincide with those obtained under asymptotic independence of the servers and, moreover, the stationary distributions are insensitive to the job-size distribution.
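    To make "super-exponential tail" concrete, the sketch below evaluates the standard closed form for the homogeneous case only: with unit-rate servers, exponential job lengths, and per-server arrival rate λ < 1, the mean-field fraction of servers holding at least k jobs under SQ(d) is λ raised to (d^k − 1)/(d − 1). This formula is the well-known homogeneous result and is not taken from the abstract above; the heterogeneous distributions characterized in the paper differ. The function name `sq_d_tail` is illustrative.

```python
def sq_d_tail(lam, d, k):
    """Mean-field fraction of servers with at least k jobs (homogeneous SQ(d)).

    Assumes unit service rate, per-server arrival rate lam < 1, and
    integer d >= 1.
    """
    if d == 1:
        # d = 1 is purely random routing: an ordinary geometric tail.
        return lam ** k
    # (d**k - 1) / (d - 1) equals 1 + d + ... + d**(k-1), an integer.
    return lam ** ((d ** k - 1) // (d - 1))


# Compare the tails at load 0.9 for random routing and Power-of-Two.
for k in range(1, 6):
    print(k, sq_d_tail(0.9, 1, k), sq_d_tail(0.9, 2, k))
```

    The exponent grows geometrically in k for d ≥ 2, so the tail decays doubly exponentially rather than geometrically.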