134 research outputs found

    Stochastic methods for measurement-based network control

    The main task of network administrators is to ensure that their network functions properly. Whether they manage a telecommunication or a road network, they generally base their decisions on the analysis of measurement data. Inspired by such network control applications, this dissertation investigates several stochastic modelling techniques for data analysis. The focus is on two areas within the field of stochastic processes: change point detection and queueing theory. Part I deals with statistical methods for the automatic detection of change points, that is, changes in the probability distribution underlying a data sequence. This part starts with a review of existing change point detection methods for data sequences consisting of independent observations. The main contribution of this part is the generalisation of the classic CUSUM method to account for dependence within data sequences. We analyse the false alarm probability of the resulting methods using a large deviations approach. The part also discusses numerical tests of the new methods and a cyber attack detection application, in which we investigate how to detect DNS tunnels. The main contribution of Part II is the application of queueing models (probabilistic models for waiting lines) to situations in which the system to be controlled can only be observed partially. We consider two types of partial information. Firstly, we develop a procedure to get insight into the performance of queueing systems between consecutive system-state measurements and apply it in a numerical study, which was motivated by capacity management in cable access networks. Secondly, inspired by dynamic road control applications, we study routing policies in a queueing system in which only some of the jobs are observable and controllable.
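
    As a rough illustration of the classic CUSUM method mentioned in the abstract above, the sketch below detects an upward mean shift in independent Gaussian observations. It covers only the textbook independent-data case, not the dissertation's generalisation to dependent sequences; the function name `cusum_alarm`, the Gaussian assumption, and all parameter values are hypothetical choices made for this example.

```python
def cusum_alarm(xs, mu0, mu1, sigma, threshold):
    """Return the index of the first alarm, or None if no alarm is raised.

    Assumes independent N(mu0, sigma^2) samples before the change and
    N(mu1, sigma^2) samples after it (an assumption made for this sketch).
    """
    stat = 0.0
    for n, x in enumerate(xs):
        # Log-likelihood ratio of the post-change versus pre-change density
        # for a single observation.
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
        # CUSUM recursion: accumulate evidence, but never drop below zero.
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return n
    return None


# Toy sequence: the mean jumps from 0 to 1 at index 20.
data = [0.0] * 20 + [1.0] * 20
alarm = cusum_alarm(data, mu0=0.0, mu1=1.0, sigma=1.0, threshold=3.0)
print(alarm)
```

    A higher threshold lowers the false alarm probability at the cost of a longer detection delay; this is the trade-off that a false alarm analysis quantifies.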

    A mean field model of work stealing in large-scale systems

    In this paper, we consider a generic model of computational grids, seen as several clusters of homogeneous processors. In such systems, a key issue when designing efficient job allocation policies is to balance the workload over the different resources. We present a Markovian model for performance evaluation of such a policy, namely work stealing (idle processors steal work from others) in large-scale heterogeneous systems. Using mean field theory, we show that when the size of the system grows, it converges to a system of deterministic ordinary differential equations that allows one to compute the expectation of performance functions (such as average response times) as well as the distributions of these functions. We first study the case where all resources are homogeneous, showing in particular that work stealing is very efficient, even when the latency of steals is large. We also consider the case where distance plays a role: the system is made of several clusters, and stealing within one cluster is faster than stealing between clusters. We compare different work stealing policies, based on stealing probabilities, and we show that the main factor for deciding where to steal from is the load rather than the stealing latency.

    Datacenter Architectures for the Microservices Era

    Modern internet services are shifting away from single-binary, monolithic services into numerous loosely-coupled microservices that interact via Remote Procedure Calls (RPCs), to improve programmability, reliability, manageability, and scalability of cloud services. Computer system designers are faced with many new challenges with microservice-based architectures, as individual RPCs/tasks are only a few microseconds in most microservices. In this dissertation, I seek to address the most notable challenges that arise due to the dissimilarities of the modern microservice based and classic monolithic cloud services, and design novel server architectures and runtime systems that enable efficient execution of µs-scale microservices on modern hardware. In the first part of my dissertation, I seek to address the problem of Killer Microseconds, which refers to µs-scale “holes” in CPU schedules caused by stalls to access fast I/O devices or brief idle times between requests in high throughput µs-scale microservices. Whereas modern computing platforms can efficiently hide ns-scale and ms-scale stalls through micro-architectural techniques and OS context switching, they lack efficient support to hide the latency of µs-scale stalls. In chapter II, I propose Duplexity, a heterogeneous server architecture that employs aggressive multithreading to hide the latency of killer microseconds, without sacrificing the Quality-of-Service (QoS) of latency-sensitive microservices. Duplexity is able to achieve 1.9× higher core utilization and 2.7× lower iso-throughput 99th-percentile tail latency over an SMT-based server design, on average. In chapters III-IV, I comprehensively investigate the problem of tail latency in the context of microservices and address multiple aspects of it. First, in chapter III, I characterize the tail latency behavior of microservices and provide general guidelines for optimizing computer systems from a queuing perspective to minimize tail latency. 
    Queuing is a major contributor to end-to-end tail latency, wherein nominal tasks are enqueued behind rare, long ones, due to Head-of-Line (HoL) blocking. Next, in chapter IV, I introduce Q-Zilla, a scheduling framework to tackle tail latency from a queuing perspective, and CoreZilla, a microarchitectural instantiation of the framework. Q-Zilla is composed of the ServerQueue Decoupled Size-Interval Task Assignment (SQD-SITA) scheduling algorithm and the Express-lane Simultaneous Multithreading (ESMT) microarchitecture, which together seek to address HoL blocking by providing an “express-lane” for short tasks, protecting them from queuing behind rare, long ones. By combining the ESMT microarchitecture and the SQD-SITA scheduling algorithm, CoreZilla improves tail latency over a conventional SMT core with 2, 4, and 8 contexts by 2.25×, 3.23×, and 4.38×, on average, respectively, and outperforms a theoretical 32-core scale-up organization by 12%, on average, with 8 contexts. Finally, in chapters V-VI, I investigate the tail latency problem of microservices from a cluster, rather than server-level, perspective. Whereas Service Level Objectives (SLOs) define end-to-end latency targets for the entire service to ensure user satisfaction, with microservice-based applications, it is unclear how to scale individual microservices when end-to-end SLOs are violated or underutilized. I introduce Parslo as an analytical framework for partial SLO allocation in virtualized cloud microservices. Parslo takes a microservice graph as an input and employs a Gradient Descent-based approach to allocate “partial SLOs” to different microservice nodes, enabling independent auto-scaling of individual microservices. Parslo achieves the optimal solution, minimizing the total cost for the entire service deployment, and is applicable to general microservice graphs.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/167978/1/miramir_1.pd

    Discrete Time Analysis of Consolidated Transport Processes

    This thesis deals with the development of discrete-time models for the analysis of transport consolidation. The models developed for inventory and vehicle consolidation, in particular milk-run systems, allow a detailed performance evaluation to be carried out in a short time. In addition, the models allow the analysis of consolidation at transshipment points, for example hub-and-spoke networks, by linking them together within a network analysis.

    Load Balancing of Elastic Data Traffic in Heterogeneous Wireless Networks

    The increasing amount of mobile data traffic has resulted in an architectural innovation in cellular networks through the introduction of heterogeneous networks. In heterogeneous networks, the deployment of macrocells is accompanied by the use of low-power pico- and femtocells (referred to as microcells) in hot spot areas inside the macrocell, which increase the data rate per unit area. The purpose of this thesis is to study the load balancing problem of elastic data traffic in heterogeneous wireless networks. These networks consist of different types of cells with different characteristics. Individual cells are modelled as M/G/1-PS queueing systems. This results in a multi-server queueing model consisting of a single macrocell with multiple microcells within the area. Both static and dynamic load balancing schemes are developed to balance the data flows between the macrocell and microcells so that the mean flow-level delay is minimized. Both analytical and numerical methods are used for static policies. For dynamic policies, the performance is evaluated by simulations. The results of the study indicate that all dynamic policies can significantly improve the flow-level delay performance in the system under consideration compared to the optimal static policy. The results also indicate that MJSQ and MP are the best policies, although MJSQ needs less state information. The performance gain of most of the dynamic policies is insensitive to the flow size distribution. In addition, further tests are conducted, such as the effect of increasing the number of microcells and the impact of the service rate difference between the macrocell and microcells.
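
    To make the static load balancing idea in the abstract above concrete, the sketch below uses the standard result that the mean delay in an M/G/1-PS queue with arrival rate λ and service rate μ is 1/(μ − λ), which depends on the flow size distribution only through its mean. It then grid-searches the probability of routing a flow to the macrocell. The single-microcell setup, the function names, and the rates are assumptions made for illustration; this is not the thesis's own method, and its MJSQ and MP policies are not reproduced here.

```python
def ps_mean_delay(lam, mu):
    """Mean flow delay in an M/G/1-PS queue; infinite if the queue is unstable."""
    if lam >= mu:
        return float("inf")
    return 1.0 / (mu - lam)


def best_static_split(lam, mu_macro, mu_micro, steps=1000):
    """Grid-search the probability p of sending a flow to the macrocell.

    Each flow goes to the macrocell with probability p and to the (single,
    hypothetical) microcell otherwise; returns (p, mean delay) at the minimum.
    """
    best_p, best_delay = None, float("inf")
    for i in range(steps + 1):
        p = i / steps
        # Overall mean delay: weight each cell's delay by its share of the flows.
        delay = (p * ps_mean_delay(p * lam, mu_macro)
                 + (1.0 - p) * ps_mean_delay((1.0 - p) * lam, mu_micro))
        if delay < best_delay:
            best_p, best_delay = p, delay
    return best_p, best_delay


# Symmetric toy case: two identical cells, so an even split should be optimal.
p_opt, d_opt = best_static_split(lam=2.0, mu_macro=2.0, mu_micro=2.0)
print(p_opt, d_opt)
```

    A static split ignores the current queue lengths, which is why dynamic policies that use state information can do better.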

    Seeing through black boxes: Tracking transactions through queues under monitoring resource constraints

    The problem of optimal allocation of monitoring resources for tracking transactions progressing through a distributed system, modeled as a queueing network, is considered. Two forms of monitoring information are considered, viz., locally unique transaction identifiers, and arrival and departure timestamps of transactions at each processing queue. The timestamps are assumed to be available at all the queues but, in the absence of identifiers, only enable imprecise tracking, since parallel processing can result in out-of-order departures. On the other hand, identifiers enable precise tracking but are not available without proper instrumentation. Given an instrumentation budget, only a subset of queues can be selected for the production of identifiers, while the remaining queues have to resort to imprecise tracking using timestamps. The goal is then to optimally allocate the instrumentation budget to maximize the overall tracking accuracy. The challenge is that the optimal allocation strategy depends on the accuracies of timestamp-based tracking at different queues, which have complex dependencies on the arrival and service processes and on the queueing discipline. We propose two simple heuristics for allocation by predicting the order of timestamp-based tracking accuracies of different queues. We derive sufficient conditions for these heuristics to achieve optimality through the notion of the stochastic comparison of queues. Simulations show that our heuristics are close to optimality, even when the parameters deviate from these conditions.

    Stochastic Models for Order Picking Systems
