322 research outputs found

    Optimal Hyper-Scalable Load Balancing with a Strict Queue Limit

    Load balancing plays a critical role in efficiently dispatching jobs in parallel-server systems such as cloud networks and data centers. A fundamental challenge in the design of load balancing algorithms is to achieve an optimal trade-off between delay performance and implementation overhead (e.g. communication or memory usage). This trade-off has primarily been studied so far from the angle of the amount of overhead required to achieve asymptotically optimal performance, particularly vanishing delay in large-scale systems. In contrast, in the present paper, we focus on an arbitrarily sparse communication budget, possibly well below the minimum requirement for vanishing delay, referred to as the hyper-scalable operating region. Furthermore, jobs may only be admitted when a specific limit on the queue position of the job can be guaranteed. The centerpiece of our analysis is a universal upper bound for the achievable throughput of any dispatcher-driven algorithm for a given communication budget and queue limit. We also propose a specific hyper-scalable scheme which can operate at any given message rate and enforce any given queue limit, while allowing the server states to be captured via a closed product-form network, in which servers act as customers traversing various nodes. The product-form distribution is leveraged to prove that the bound is tight and that the proposed hyper-scalable scheme is throughput-optimal in a many-server regime given the communication and queue limit constraints. Extensive simulation experiments are conducted to illustrate the results

    On the Benefit of Information Centric Networks for Traffic Engineering

    The current Internet performs traffic engineering (TE) by estimating traffic matrices on a regular schedule and allocating flows based on weights computed from these matrices. The allocation therefore rests on a guess about the traffic in the network, derived from its history. Information-Centric Networks (ICN), on the other hand, provide a finer-grained description of the traffic: a piece of content exchanged between a client and a server is uniquely identified by its name, so the network can learn the size of different content items and perform traffic engineering and resource allocation accordingly. We claim that ICN therefore provides a better handle for traffic engineering, resulting in a significant performance gain. We present a mechanism to perform such resource allocation. Our traffic engineering method requires only knowledge of the flow size (which, in ICN, can be learned from previous data transfers) and outperforms a min-MLU allocation in terms of response time. Our method also identifies traffic allocation patterns similar to those of min-MLU without access to the traffic matrix ahead of time. We show a very significant gain in response time: min-MLU is almost 50% slower than our ICN-based TE method.
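
    The min-MLU baseline mentioned in the abstract can be made concrete with a small sketch. It assumes that MLU stands for maximum link utilization (the usual reading of the acronym, not spelled out in the abstract). The link names, capacities, and flows are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def max_link_utilization(capacity, routed_flows):
    """Return the maximum link utilization (MLU) of a routing.

    capacity:     dict mapping link name -> link capacity
    routed_flows: list of (rate, path) pairs, where path is a list of link names
    """
    load = defaultdict(float)
    for rate, path in routed_flows:
        for link in path:
            load[link] += rate  # each flow loads every link on its path
    return max(load[link] / cap for link, cap in capacity.items())


# Hypothetical three-link topology (illustrative values only).
capacity = {"A-B": 10.0, "B-C": 10.0, "A-C": 5.0}

# Two candidate routings of the same two A-to-C flows.
split_routing = [(4.0, ["A-B", "B-C"]), (3.0, ["A-C"])]
single_path_routing = [(4.0, ["A-B", "B-C"]), (3.0, ["A-B", "B-C"])]

mlu_split = max_link_utilization(capacity, split_routing)
mlu_single = max_link_utilization(capacity, single_path_routing)
```

    A min-MLU allocation would pick the routing with the smaller value among the candidates. The abstract's point is that this objective needs the traffic matrix in advance, whereas the ICN-based method works from learned flow sizes.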


    This booklet contains the proceedings of the second European Conference on Queueing Theory (ECQT), which was held from the 18th to the 20th of July 2016 at the engineering school ENSEEIHT, Toulouse, France. ECQT is a biennial event where scientists and technicians in queueing theory and related areas get together to promote research, encourage interaction and exchange ideas. The spirit of the conference is to be a queueing event organized from within Europe, but open to participants from all over the world. The technical program of the 2016 edition consisted of 112 presentations organized in 29 sessions covering all trends in queueing theory, including the development of the theory, methodology advances, computational aspects and applications. Another exciting feature of ECQT2016 was the institution of the Takács Award for outstanding PhD thesis on "Queueing Theory and its Applications".

    Sleep Mode Analysis via Workload Decomposition

    The goal of this paper is to establish a general approach for analyzing queueing models with repeated inhomogeneous vacations. The server goes on vacation if its inactivity lasts longer than the vacation trigger duration. Once the system enters vacation mode, it may take several consecutive vacations: at the end of a vacation, the server takes another one, possibly with a different probability distribution, if there were no arrivals during the previous vacation. However, the system enters vacation mode only if the inactivity persists beyond the defined trigger duration. To gain insight into the influence of the parameters on performance, we study a simple M/G/1 queue (Poisson arrivals and general independent service times), which has the advantage of being analytically tractable. The theoretical model is applied to the problem of power saving for mobile devices, in which the sleep durations of a device correspond to the vacations of the server. Various system performance metrics, such as the frame response time and the energy savings, are derived. A constrained optimization problem is formulated to maximize the energy savings achieved in power save mode, subject to QoS constraints. The proposed methods are illustrated with a WiMAX system scenario to obtain design parameters for better performance. Our analysis allows us not only to optimize the system parameters for a given traffic intensity but also to propose parameters that provide the best performance under worst-case conditions.
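
    As a point of reference for the decomposition idea, the sketch below computes the mean waiting time in the classical M/G/1 queue with multiple vacations. This is a simplification and not the paper's model: it assumes i.i.d. vacations that start as soon as the queue empties, with no trigger duration and no change of distribution between consecutive vacations. The function names and numeric values are illustrative.

```python
def mg1_mean_wait(lam, es, es2):
    """Pollaczek-Khinchine mean waiting time in queue for a plain M/G/1 queue.

    lam: Poisson arrival rate
    es:  mean service time E[S]
    es2: second moment of the service time E[S^2]
    """
    rho = lam * es  # server utilization
    if not 0 <= rho < 1:
        raise ValueError("the queue is stable only for 0 <= rho < 1")
    return lam * es2 / (2 * (1 - rho))


def mg1_vacation_mean_wait(lam, es, es2, ev, ev2):
    """Mean waiting time with i.i.d. multiple vacations (assumed simplification).

    The result decomposes into the plain M/G/1 wait plus the mean residual
    vacation time E[V^2] / (2 E[V]).
    """
    return mg1_mean_wait(lam, es, es2) + ev2 / (2 * ev)


# Illustrative values: exponential service with mean 1 (so E[S^2] = 2),
# deterministic vacations (sleep periods) of length 2 (so E[V^2] = 4).
wait_no_sleep = mg1_mean_wait(0.5, 1.0, 2.0)
wait_with_sleep = mg1_vacation_mean_wait(0.5, 1.0, 2.0, 2.0, 4.0)
```

    In this simplified setting, the gap between the two values is the delay cost of sleeping. The paper's constrained optimization weighs that kind of cost against the energy saved.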

    Bridging the gap between dataplanes and commodity operating systems

    The conventional wisdom is that aggressive networking requirements, such as high packet rates for small messages and microsecond-scale tail latency, are best addressed outside the kernel, in a user-level networking stack. In particular, dataplanes borrow design elements from network middleboxes to run tasks to completion in tight loops. In its basic form, the dataplane design leverages sweeping simplifications, such as the elimination of any resource management and any task scheduling, to improve throughput and lower latency. As a result, dataplanes perform best when the request rate is predictable (since there is no resource management) and the service time of each task is short and has low dispersion. On the other hand, they exhibit poor energy proportionality and workload consolidation, and suffer from head-of-line blocking. This thesis proposes the introduction of resource management to dataplanes. Current dataplanes decrease latency by constantly polling for incoming network packets, an approach that trades energy usage for latency. We argue that it is possible to introduce a control plane that manages resources optimally in terms of power usage without affecting the performance of the dataplane. Additionally, this thesis proposes the introduction of scheduling to dataplanes. Current designs operate in a strict FIFO and run-to-completion manner. This method is effective only when each incoming request requires a minimal amount of processing, on the order of a few microseconds. When the processing time of requests is (a) longer or (b) follows a distribution with higher dispersion, transient load imbalances and head-of-line blocking degrade the performance of the dataplane. We claim that it is possible to introduce a scheduler to dataplanes that routes requests to the appropriate core and effectively reduces the tail latency of the system while at the same time supporting a wider range of workloads.

    Some aspects of traffic control and performance evaluation of ATM networks

    The emerging high-speed Asynchronous Transfer Mode (ATM) networks are expected to integrate through statistical multiplexing large numbers of traffic sources having a broad range of statistical characteristics and different Quality of Service (QOS) requirements. To achieve high utilisation of network resources while maintaining the QOS, efficient traffic management strategies have to be developed. This thesis considers the problem of traffic control for ATM networks. The thesis studies the application of neural networks to various ATM traffic control issues such as feedback congestion control, traffic characterization, bandwidth estimation, and Call Admission Control (CAC). A novel adaptive congestion control approach based on a neural network that uses reinforcement learning is developed. It is shown that the neural controller is very effective in providing general QOS control. A Finite Impulse Response (FIR) neural network is proposed to adaptively predict the traffic arrival process by learning the relationship between the past and future traffic variations. On the basis of this prediction, a feedback flow control scheme at input access nodes of the network is presented. Simulation results demonstrate significant performance improvement over conventional control mechanisms. In addition, an accurate yet computationally efficient approach to effective bandwidth estimation for multiplexed connections is investigated. In this method, a feed forward neural network is employed to model the nonlinear relationship between the effective bandwidth and the traffic situations and a QOS measure. Applications of this approach to admission control, bandwidth allocation and dynamic routing are also discussed. A detailed investigation has indicated that CAC schemes based on effective bandwidth approximation can be very conservative and prevent optimal use of network resources. A modified effective bandwidth CAC approach is therefore proposed to overcome the drawback of conventional methods. Considering statistical multiplexing between traffic sources, we directly calculate the effective bandwidth of the aggregate traffic which is modelled by a two-state Markov modulated Poisson process via matching four important statistics. We use the theory of large deviations to provide a unified description of effective bandwidths for various traffic sources and the associated ATM multiplexer queueing performance approximations, illustrating their strengths and limitations. In addition, a more accurate estimation method for ATM QOS parameters based on the Bahadur-Rao theorem is proposed, which is a refinement of the original effective bandwidth approximation and can lead to higher link utilisation.
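
    To illustrate the kind of effective bandwidth admission test the abstract refers to, here is a minimal sketch for the simplest textbook case. It assumes independent Poisson sources of unit-size cells rather than the thesis's two-state Markov modulated Poisson model, and it uses the standard large-buffer approximation P(Q > B) ≈ exp(-θB). All names and numbers are illustrative.

```python
import math


def effective_bandwidth_poisson(lam, theta):
    """Effective bandwidth of a Poisson stream of unit-size cells.

    alpha(theta) = lam * (exp(theta) - 1) / theta, which tends to the mean
    rate lam as theta -> 0 and grows as theta (the QoS stringency) increases.
    """
    return lam * math.expm1(theta) / theta


def admit(rates, link_capacity, buffer_size, loss_target):
    """Effective-bandwidth CAC test under the large-buffer approximation.

    Choosing theta = -ln(loss_target) / buffer_size makes exp(-theta * B)
    equal to the loss target; the connection set is admitted when the summed
    effective bandwidths fit within the link capacity.
    """
    theta = -math.log(loss_target) / buffer_size
    total = sum(effective_bandwidth_poisson(lam, theta) for lam in rates)
    return total <= link_capacity


# Two sources with mean rates 10 and 20 cells per time unit (mean load 30),
# a buffer of 100 cells and an overflow probability target of 1e-6.
fits_in_35 = admit([10.0, 20.0], 35.0, 100.0, 1e-6)
fits_in_32 = admit([10.0, 20.0], 32.0, 100.0, 1e-6)
```

    With these numbers the summed effective bandwidth is roughly 32.2, so a link of capacity 32 is rejected even though the mean load is only 30. That margin is the kind of conservativeness that refinements such as the Bahadur-Rao correction mentioned in the abstract aim to reduce.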