57 research outputs found

    Aggregate matrix-analytic techniques and their applications

    The complexity of computer systems affects the complexity of the modeling techniques that can be used for their performance analysis. In this dissertation, we develop a set of techniques that are based on tractable analytic models and enable efficient performance analysis of computer systems. Our approach is three-pronged: first, we propose new techniques to parameterize measurement data with Markovian-based stochastic processes that can be further used as input to queueing systems; second, we propose new methods to efficiently solve complex queueing models; and third, we use the proposed methods to evaluate the performance of clustered Web servers and propose new load balancing policies based on this analysis. We devise two new techniques for fitting measurement data that exhibit high variability into Phase-type (PH) distributions. These techniques apply known fitting algorithms in a divide-and-conquer fashion. We evaluate the accuracy of our methods from both the statistical and the queueing-systems perspective. In addition, we propose a new methodology for fitting measurement data that exhibit long-range dependence into Markovian Arrival Processes (MAPs). We propose a new methodology, ETAQA, for the exact solution of M/G/1-type processes, GI/M/1-type processes, and their intersection, i.e., quasi-birth-death (QBD) processes. ETAQA computes an aggregate steady-state probability distribution and a set of measures of interest. ETAQA is numerically stable and computationally superior to alternative solution methods. Apart from ETAQA, we propose a new methodology for the exact solution of a class of GI/G/1-type processes based on aggregation/decomposition. Finally, we demonstrate the applicability of the proposed techniques by evaluating load balancing policies in clustered Web servers. We address the high variability in the service process of Web servers by dedicating the servers of a cluster to requests of similar sizes and propose new, content-aware load balancing policies. Detailed analysis shows that the proposed policies achieve high user-perceived performance and, by continuously adapting their scheduling parameters to the current workload characteristics, provide good performance under conditions of transient overload.
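
    As background for the QBD processes mentioned above, the sketch below shows the classical matrix-geometric solution of a QBD (not ETAQA itself): it iterates for the rate matrix R and then solves the boundary equations. The two-phase MMPP/M/1 example and all parameter values are illustrative assumptions.

```python
import numpy as np

# Hypothetical two-phase MMPP/M/1 queue, modeled as a QBD process.
# Levels = number of customers; phases = state of the arrival environment.
lam = np.array([1.0, 4.0])   # arrival rate in each environment phase
r12, r21 = 0.5, 0.5          # environment switching rates
mu = 6.0                     # service rate

A0 = np.diag(lam)                                    # level up (arrival)
A2 = mu * np.eye(2)                                  # level down (service completion)
A1 = np.array([[-(lam[0] + r12 + mu), r12],
               [r21, -(lam[1] + r21 + mu)]])         # within-level transitions
B00 = np.array([[-(lam[0] + r12), r12],
                [r21, -(lam[1] + r21)]])             # boundary level 0 (no service)

# Minimal nonnegative solution R of A0 + R A1 + R^2 A2 = 0 by functional iteration
# (simple but slow; logarithmic/cyclic reduction converges much faster).
R = np.zeros((2, 2))
for _ in range(100_000):
    R_new = -(A0 + R @ R @ A2) @ np.linalg.inv(A1)
    if np.max(np.abs(R_new - R)) < 1e-12:
        R = R_new
        break
    R = R_new

# Boundary equations [pi0, pi1] @ [[B00, A0], [A2, A1 + R A2]] = 0 together with
# the normalization pi0.1 + pi1 (I - R)^{-1} 1 = 1, solved as one linear system.
M = np.block([[B00, A0], [A2, A1 + R @ A2]])
norm = np.concatenate([np.ones(2), np.linalg.inv(np.eye(2) - R) @ np.ones(2)])
A_sys = np.hstack([M, norm[:, None]]).T
b_sys = np.zeros(5); b_sys[-1] = 1.0
x, *_ = np.linalg.lstsq(A_sys, b_sys, rcond=None)
pi0, pi1 = x[:2], x[2:]

# Higher levels follow geometrically: pi_k = pi1 R^(k-1), so
# E[number in system] = pi1 (I - R)^{-2} 1.
I = np.eye(2)
print("P(system empty) =", pi0.sum())
print("mean number in system =", pi1 @ np.linalg.inv((I - R) @ (I - R)) @ np.ones(2))
```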

    Resource management of replicated service systems provisioned in the cloud

    Service providers seek scalable and cost-effective cloud solutions for hosting their applications. Despite significant recent advances facilitating the deployment and management of services on cloud platforms, a number of challenges still remain. Service providers are confronted with time-varying requests for the provided applications, interdependencies between different components, performance variability of the procured virtual resources, and cost structures that differ from conventional data centers. Moreover, fulfilling service level agreements, such as throughput and response time percentiles, becomes of paramount importance for ensuring business advantages. In this thesis, we explore service provisioning in clouds from multiple points of view. The aim is to best provision service replicas in the form of VMs to various service applications, such that their tail throughput and tail response times, as well as resource utilization, meet the service level agreements in the most cost-effective manner. In particular, we develop models, algorithms and replication strategies that consider multi-tier composed services provisioned in clouds. We also investigate how a service provider can opportunistically take advantage of observed performance variability in the cloud. Finally, we provide means of guaranteeing tail throughput and response times in the face of performance variability of VMs, using Markov chain modeling and large deviation theory. We employ methods from analytical modeling, event-driven simulations and experiments. Overall, this thesis not only provides a multi-faceted approach to exploring several crucial aspects of hosting services in clouds, i.e., cost, tail throughput, and tail response times; the proposed resource management strategies are also rigorously validated via trace-driven simulation and extensive experiments.
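
    As a back-of-the-envelope companion to the tail-oriented provisioning problem above, the sketch below sizes the number of VM replicas so that a waiting-time percentile SLA is met, using the Erlang-C formula for an M/M/c queue. It is a simplification under exponential assumptions with hypothetical parameters, not the Markov-chain and large-deviation machinery developed in the thesis.

```python
import math

def erlang_c(c: int, a: float) -> float:
    """Erlang-C: probability that an arriving job waits in an M/M/c queue (offered load a = lam/mu)."""
    if a >= c:
        return 1.0
    top = a ** c / math.factorial(c)
    bottom = (1 - a / c) * sum(a ** k / math.factorial(k) for k in range(c)) + top
    return top / bottom

def waiting_tail(c: int, lam: float, mu: float, t: float) -> float:
    """P(W > t) for an M/M/c FCFS queue: Erlang-C probability times an exponential tail."""
    return erlang_c(c, lam / mu) * math.exp(-(c * mu - lam) * t)

def replicas_for_sla(lam: float, mu: float, t_sla: float, eps: float) -> int:
    """Smallest number of identical VM replicas c with P(W > t_sla) <= eps."""
    c = math.floor(lam / mu) + 1          # smallest stable configuration
    while waiting_tail(c, lam, mu, t_sla) > eps:
        c += 1
    return c

# Hypothetical workload: 120 req/s arriving, each VM serves 10 req/s on average;
# SLA: at most 1% of requests may wait longer than 200 ms.
print("replicas needed:", replicas_for_sla(lam=120.0, mu=10.0, t_sla=0.2, eps=0.01))
```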

    Effective task assignment strategies for distributed systems under highly variable workloads

    Heavy-tailed workload distributions are commonly experienced in many areas of distributed computing. Such workloads are highly variable: a small number of very large tasks make up a large proportion of the workload, making the load very hard to distribute effectively. Traditional task assignment policies are ineffective under these conditions, as they were formulated under the assumption of an exponentially distributed workload. Size-based task assignment policies have been proposed to handle heavy-tailed workloads, but their applications are limited by their static nature and their assumption of prior knowledge of a task's service requirement. This thesis analyses existing approaches to load distribution under heavy-tailed workloads, and presents a new generalised task assignment policy that significantly improves performance for many distributed applications by intelligently addressing the negative effects that highly variable workloads have on performance. Many problems associated with the modelling and optimisation of systems under highly variable workloads are then addressed by a novel technique that approximates these workloads with simpler mathematical representations, without losing any of their pertinent original properties. Finally, we obtain advanced queueing metrics (such as the variance of key measures like waiting time and slowdown, which are difficult to obtain analytically) through rigorous simulation.
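
    To illustrate why size-based assignment helps under heavy-tailed workloads, the sketch below simulates a two-host cluster fed by bounded-Pareto job sizes and compares random dispatch with a simple SITA-style size cutoff that splits the expected work evenly. It is a toy illustration under assumed parameters, not the generalised policy proposed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

def bounded_pareto(n, alpha=1.1, lo=1.0, hi=1e5):
    """Inverse-CDF sampling from a bounded Pareto(alpha, lo, hi) job-size distribution."""
    u = rng.random(n)
    return lo / (1.0 - u * (1.0 - (lo / hi) ** alpha)) ** (1.0 / alpha)

def fcfs_waits(arrival_times, sizes):
    """Waiting times in a single-server FCFS queue with unit service speed."""
    waits, free_at = np.zeros(len(sizes)), 0.0
    for i, (t, s) in enumerate(zip(arrival_times, sizes)):
        start = max(t, free_at)
        waits[i] = start - t
        free_at = start + s
    return waits

def mean_slowdown(arrivals, sizes, host_of):
    """Dispatch job i to host_of[i] in {0,1}, run FCFS per host, return mean slowdown."""
    slowdowns = []
    for h in (0, 1):
        idx = np.where(host_of == h)[0]
        w = fcfs_waits(arrivals[idx], sizes[idx])
        slowdowns.append((w + sizes[idx]) / sizes[idx])
    return np.mean(np.concatenate(slowdowns))

n = 200_000
sizes = bounded_pareto(n)
lam = 1.4 / sizes.mean()                     # total load ~0.7 per host if split evenly
arrivals = np.cumsum(rng.exponential(1.0 / lam, n))

# Policy 1: size-oblivious random dispatch.
random_hosts = rng.integers(0, 2, n)

# Policy 2: SITA-style cutoff -- small jobs to host 0, large jobs to host 1,
# with the cutoff chosen so each host receives roughly half of the total work.
order = np.argsort(sizes)
cutoff = sizes[order][np.searchsorted(np.cumsum(sizes[order]), sizes.sum() / 2)]
sita_hosts = (sizes > cutoff).astype(int)

print("mean slowdown, random dispatch  :", mean_slowdown(arrivals, sizes, random_hosts))
print("mean slowdown, size-based cutoff:", mean_slowdown(arrivals, sizes, sita_hosts))
```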

    Design for dependability: A simulation-based approach

    This research addresses issues in simulation-based, system-level dependability analysis of fault-tolerant computer systems. The issues and difficulties of providing a general simulation-based approach for system-level analysis are discussed, and a methodology that addresses these issues is presented. The proposed methodology is designed to permit the study of a wide variety of architectures under various fault conditions. It permits detailed functional modeling of architectural features such as sparing policies, repair schemes and routing algorithms, as well as other fault-tolerance mechanisms, and it allows the execution of actual application software. One key benefit of this approach is that the behavior of a system under faults does not have to be pre-defined, as is normally done. Instead, a system can be simulated in detail and injected with faults to determine its failure modes. The thesis describes how object-oriented design is used to incorporate this methodology into a general-purpose design and fault injection package called DEPEND. A software model is presented that uses abstractions of application programs to study the behavior and effect of software on hardware faults in the early design stage, when actual code is not available. Finally, an acceleration technique that combines hierarchical simulation, time acceleration algorithms and hybrid simulation to reduce simulation time is introduced.
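
    The flavor of such simulation-based dependability analysis can be conveyed with a very small example. The sketch below is unrelated to DEPEND's actual implementation and uses made-up failure and repair rates: it injects exponentially timed faults into a primary-plus-spare configuration with a single repair facility and estimates availability from the simulated downtime.

```python
import random

random.seed(7)

# Hypothetical parameters: active-unit failure rate, repair rate, simulated horizon.
FAIL_RATE = 1 / 1000.0      # failures per hour of the active unit
REPAIR_RATE = 1 / 8.0       # repairs per hour, single repair facility
HORIZON = 1_000_000.0       # simulated hours

def simulate_duplex():
    """Event-driven simulation of a primary + cold-spare pair with one repairman.

    State = number of failed units (0, 1 or 2); the system is down only in state 2.
    A fault is 'injected' whenever the exponential failure clock of the active unit fires.
    """
    t, failed, downtime = 0.0, 0, 0.0
    while t < HORIZON:
        rates = []
        if failed < 2:
            rates.append(("fail", FAIL_RATE))       # active unit fails
        if failed > 0:
            rates.append(("repair", REPAIR_RATE))   # repairman fixes one unit
        total = sum(r for _, r in rates)
        dt = random.expovariate(total)
        if failed == 2:                             # system is down while both are failed
            downtime += min(dt, HORIZON - t)
        t += dt
        event = random.choices([e for e, _ in rates], weights=[r for _, r in rates])[0]
        failed += 1 if event == "fail" else -1
    return 1.0 - downtime / HORIZON

print("estimated availability:", simulate_duplex())
```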

    Planning and Routing Algorithms for Multi-Skill Contact Centers

    Koole, G.M. [Promotor]

    Analysis of Bandwidth and Latency Constraints on a Packetized Cloud Radio Access Network Fronthaul

    Cloud radio access network (C-RAN) is a promising architecture for the next-generation RAN to meet the diverse and stringent requirements envisioned by fifth-generation mobile communication systems (5G) and future generations of mobile networks. C-RAN offers several advantages, such as reduced capital expenditure (CAPEX) and operational expenditure (OPEX), increased spectral efficiency (SE), higher capacity and improved cell-edge performance, and efficient hardware utilization through resource sharing and network function virtualization (NFV). However, these centralization gains come with the need for a fronthaul, which is the transport link connecting remote radio units (RRUs) to the baseband unit (BBU) pool. In conventional C-RAN, the legacy common public radio interface (CPRI) protocol is used on the fronthaul network to transport the raw, unprocessed baseband in-phase/quadrature-phase (I/Q) samples between the BBU and the RRUs; it demands huge fronthaul bandwidth, strictly low latency on the order of a few hundred microseconds, and very high reliability. Hence, in order to relax the excessive fronthaul bandwidth and stringent low-latency requirements, as well as to enhance the flexibility of the fronthaul, it is of utmost importance to redesign the fronthaul while still profiting from the acclaimed centralization benefits. Therefore, a flexibly centralized C-RAN with different functional splits has been introduced. In addition, the 5G mobile fronthaul (often also termed an evolved fronthaul) is envisioned to be packet-based, utilizing Ethernet as a transport technology. In this thesis, to circumvent the fronthaul bandwidth constraint, we choose a packetized fronthaul with an appropriate functional split such that the fronthaul data rate is coupled with the actual user data rate, unlike in classical C-RAN, where the fronthaul data rate is always static and independent of the traffic load. We adapt queuing and spatial traffic models to derive mathematical expressions for the statistical multiplexing gains that can be obtained from the randomness in the user traffic. Through this, we show that the required fronthaul bandwidth can be reduced significantly, depending on the overall traffic demand, correlation distance and outage probability. Furthermore, an iterative optimization algorithm is developed, showing the impact of the number of pilots on a bandwidth-constrained fronthaul. This algorithm achieves an additional reduction in the required fronthaul bandwidth. Next, knowing the multiplexing gains and the possible fronthaul bandwidth reduction, it is beneficial for mobile network operators (MNOs) to deploy the optical transceiver (TRX) modules in C-RAN cost-efficiently. For this, using the same framework, a cost model for fronthaul TRX cost optimization is presented. This is essential in C-RAN because, in a wavelength division multiplexing-passive optical network (WDM-PON) system, TRXs are generally deployed to serve the peak load. However, because of variations in traffic demand owing to the tidal effect, the fronthaul can be dimensioned for a lower capacity while allowing a reasonable outage, yielding cost savings by deploying fewer TRXs and energy savings by putting the unused TRXs into sleep mode. The second focus of the thesis is the fronthaul latency analysis, latency being a critical performance metric, especially for ultra-reliable and low-latency communication (URLLC).
An analytical framework to calculate the latency in the uplink (UL) of a C-RAN massive multiple-input multiple-output (MIMO) system is presented. For this, a continuous-time queuing model for the Ethernet switch in the fronthaul network, which aggregates the UL traffic from several massive MIMO-aided RRUs, is considered. Closed-form solutions for the moment generating function (MGF) of the sojourn time, waiting time and queue length distributions are derived using the Pollaczek–Khinchine formula for our M/HE/1 queuing model, and evaluated via numerical solutions. In addition, the packet loss rate, caused by packets failing to reach the destination within a certain time, is derived. Due to the slotted nature of the UL transmissions, the model is extended to a discrete-time queuing model. The impact of the packet arrival rate, average packet size, SE of users, and fronthaul capacity on the sojourn time, waiting time and queue length distributions is analyzed. While offloading more signal processing functionalities to the RRU reduces the required fronthaul bandwidth considerably, it increases the complexity at the RRU. Hence, considering the 5G New Radio (NR) flexible numerology and the XRAN functional split with a detailed radio frequency (RF) chain at the RRU, the total RRU complexity is computed first, and a tradeoff between the required fronthaul bandwidth and RRU complexity is then analyzed. We conclude that despite the numerous C-RAN benefits, the stringent fronthaul bandwidth and latency constraints must be carefully evaluated, and an optimal functional split is essential to meet the diverse set of requirements imposed by new radio access technologies (RATs).
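
    The latency part of this abstract leans on the Pollaczek–Khinchine formula for the M/HE/1 queue. The sketch below evaluates only the mean-value form of that formula for a hypothetical packetized fronthaul switch with two-branch hyperexponential packet transmission times (all rates are assumptions); the thesis itself derives the full MGF and the sojourn-time, waiting-time and queue-length distributions.

```python
import numpy as np

# Hypothetical M/HE/1 model of the fronthaul Ethernet switch:
# Poisson packet arrivals, two-branch hyperexponential packet transmission times.
lam = 0.8e6                     # packet arrivals per second
p = np.array([0.9, 0.1])        # branch probabilities (e.g. small vs. large packets)
mu = np.array([2.0e6, 0.25e6])  # branch service rates (1 / mean transmission time)

ES = np.sum(p / mu)             # E[S]
ES2 = np.sum(2.0 * p / mu**2)   # E[S^2] for exponential branches
rho = lam * ES
assert rho < 1.0, "the switch queue must be stable"

# Pollaczek-Khinchine mean-value formula for the M/G/1 queue.
EW = lam * ES2 / (2.0 * (1.0 - rho))    # mean waiting time in the switch buffer
ET = EW + ES                            # mean sojourn time through the switch

print(f"utilization       : {rho:.3f}")
print(f"mean waiting time : {EW * 1e6:.2f} us")
print(f"mean sojourn time : {ET * 1e6:.2f} us")
```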

    A General Framework to Compare Announcement Accuracy: Static vs LES-based Announcement

    Service providers often share delay information, in the form of delay announcements, with their customers. In practice, simple delay announcements, such as average waiting times or a weighted average of previously delayed customers, are often used. Our goal in this paper is to gain insight into when such announcements perform well. Specifically, we compare the accuracies of two announcements: (i) a static announcement that does not exploit real-time information about the state of the system and (ii) a dynamic announcement, specifically the last-to-enter-service (LES) announcement, which equals the delay of the last customer to have entered service at the time of the announcement. We propose a novel correlation-based approach that is theoretically appealing because it allows for a comparison of the accuracies of announcements across different queueing models, including multiclass models with a priority service discipline. It is also practically useful because estimating correlations is much easier than fitting an entire queueing model. Using a combination of queueing-theoretic analysis, real-life data analysis, and simulation, we analyze the performance of static and dynamic announcements and derive an appropriate weighted average of the two, which we demonstrate has superior performance using both simulation and data from a call center.
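
    To make the comparison concrete, the sketch below simulates an M/M/1 FCFS queue, records each customer's actual delay and the LES announcement seen on arrival, and then picks the weight of a combined announcement by a simple least-squares criterion. The queue parameters and the weighting criterion are illustrative assumptions, not the paper's correlation-based formula or its call-center data.

```python
import bisect
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical M/M/1 FCFS queue standing in for a single-class service system.
lam, mu, n = 0.9, 1.0, 100_000
arrivals = np.cumsum(rng.exponential(1 / lam, n))
services = rng.exponential(1 / mu, n)

waits, starts = np.zeros(n), np.zeros(n)
free_at = 0.0
for i in range(n):
    start = max(arrivals[i], free_at)
    waits[i], starts[i] = start - arrivals[i], start
    free_at = start + services[i]

# LES announcement for customer i: the delay of the last customer that had
# entered service by customer i's arrival epoch (0 if nobody has started service yet).
les = np.zeros(n)
for i in range(n):
    j = bisect.bisect_right(starts, arrivals[i], 0, i) - 1
    les[i] = waits[j] if j >= 0 else 0.0

static = waits.mean()            # static announcement: long-run average delay

# One simple least-squares choice of w for the announcement w*LES + (1-w)*static
# (an illustrative criterion, not necessarily the paper's correlation-based weight).
d, e = waits - static, les - static
w = float(np.dot(d, e) / np.dot(e, e))

for name, ann in [("static", np.full(n, static)), ("LES", les),
                  ("weighted", w * les + (1 - w) * static)]:
    rmse = np.sqrt(np.mean((waits - ann) ** 2))
    print(f"{name:9s} announcement: RMSE = {rmse:.3f}")
print("chosen weight w =", round(w, 3))
```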

    Scheduling for today’s computer systems: bridging theory and practice

    Scheduling is a fundamental technique for improving performance in computer systems. From web servers to routers to operating systems, how the bottleneck device is scheduled has an enormous impact on the performance of the system as a whole. Given the immense literature studying scheduling, it is easy to think that we already understand enough about scheduling. But modern computer system designs have highlighted a number of disconnects between traditional analytic results and the needs of system designers. In particular, the idealized policies, metrics, and models used by analytic researchers do not match the policies, metrics, and scenarios that appear in real systems. The goal of this thesis is to take a step towards modernizing the theory of scheduling in order to provide results that apply to today’s computer systems, and thus ease the burden on system designers. To accomplish this goal, we provide new results that help to bridge each of the disconnects mentioned above. We move beyond the study of idealized policies by introducing a new analytic framework where the focus is on scheduling heuristics and techniques rather than individual policies. By moving beyond the study of individual policies, our results apply to the complex hybrid policies that are often used in practice. For example, our results enable designers to understand how policies that favor small job sizes are affected by the fact that real systems only have estimates of job sizes. In addition, we move beyond the study of mean response time and provide results characterizing the distribution of response time and the fairness of scheduling policies. These results allow us to understand how scheduling affects QoS guarantees and whether favoring small job sizes results in large job sizes being treated unfairly. Finally, we move beyond the simplified models traditionally used in scheduling research and provide results characterizing the effectiveness of scheduling in multiserver systems and when users are interactive. These results allow us to answer questions about how to design multiserver systems and how to choose a workload generator when evaluating new scheduling designs.
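
    The point about size estimates can be illustrated with a small simulation. The sketch below compares FCFS against non-preemptive shortest-job-first using exact sizes and using noisy (lognormally perturbed) size estimates, under an assumed heavy-tailed workload; it is a toy experiment, not the analytic framework developed in the thesis.

```python
import heapq
import numpy as np

rng = np.random.default_rng(5)

def simulate(arrivals, sizes, priorities):
    """Non-preemptive single-server queue: whenever the server frees up, it serves
    the waiting job with the smallest priority value. Returns response times."""
    n = len(arrivals)
    resp = np.zeros(n)
    waiting, t, i = [], 0.0, 0        # heap of (priority, job index), clock, next arrival
    while i < n or waiting:
        if not waiting:               # server idle: jump to the next arrival
            t = max(t, arrivals[i])
        while i < n and arrivals[i] <= t:
            heapq.heappush(waiting, (priorities[i], i))
            i += 1
        _, j = heapq.heappop(waiting)
        t += sizes[j]
        resp[j] = t - arrivals[j]
    return resp

n = 100_000
sizes = rng.pareto(2.2, n) + 1.0                 # heavy-tailed job sizes
lam = 0.8 / sizes.mean()                         # arrival rate for ~80% utilization
arrivals = np.cumsum(rng.exponential(1 / lam, n))
estimates = sizes * rng.lognormal(0.0, 0.5, n)   # noisy size estimates

policies = {
    "FCFS": arrivals,                  # serve in arrival order
    "SJF, exact sizes": sizes,         # favor truly small jobs
    "SJF, estimated sizes": estimates, # favor jobs that *look* small
}
for name, prio in policies.items():
    r = simulate(arrivals, sizes, prio)
    print(f"{name:22s} mean response = {r.mean():6.2f}   99th pct = {np.quantile(r, 0.99):7.2f}")
```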

    Analysis of a two-class single-server discrete-time FCFS queue : the effect of interclass correlation

    In this paper, we study a discrete-time queueing system with one server and two classes of customers. Customers enter the system according to a general independent arrival process. The classes of consecutive customers, however, are correlated in a Markovian way. The system uses a global FCFS service discipline, i.e., all arriving customers are accommodated in one single FCFS queue, regardless of their classes. The service-time distribution of the customers is general but class-dependent, and therefore the exact order in which the customers of both classes succeed each other in the arrival stream is important, which is reflected in the complexity of the system-content and waiting-time analysis presented in this paper. In particular, a detailed waiting-time analysis of this kind of multi-class system has not yet been published, and the authors consider it one of the main novelties of this work. In addition, a major aim of the paper is to estimate the impact of interclass correlation in the arrival stream on the total number of customers in the system and on the customer delay. The results reveal that the system can exhibit two different classes of stochastic equilibrium: a strong equilibrium, where both customer classes give rise to stable behavior individually, and a compensated equilibrium, where one customer type creates overload.
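
    The effect of interclass correlation can be previewed with a crude discrete-time simulation. The sketch below assumes Bernoulli arrivals, deterministic class-dependent service times, and a symmetric two-state Markov chain for the class of consecutive customers; it only illustrates the trend that stronger interclass correlation inflates the system content, and is not the paper's analytical model.

```python
import numpy as np

rng = np.random.default_rng(9)

def mean_system_content(p_arr, q_same, serv, n_slots=1_000_000):
    """Discrete-time, single-server, global-FCFS queue with two customer classes.

    Per slot there is at most one Bernoulli(p_arr) arrival; the class of consecutive
    customers follows a symmetric two-state Markov chain with P(same class as the
    previous customer) = q_same; a class-k customer needs serv[k] slots of service.
    Returns the time-average number of customers in the system.
    """
    remaining = []          # remaining service slots of the customers, in FCFS order
    prev_class = 0
    content_sum = 0
    for _ in range(n_slots):
        if rng.random() < p_arr:                 # arrival at the start of the slot
            if rng.random() >= q_same:           # class switches w.p. 1 - q_same
                prev_class = 1 - prev_class
            remaining.append(serv[prev_class])
        if remaining:                            # head-of-line customer gets one slot
            remaining[0] -= 1
            if remaining[0] == 0:
                remaining.pop(0)
        content_sum += len(remaining)            # system content at the end of the slot
    return content_sum / n_slots

serv = (1, 4)                    # assumed class-dependent service times, in slots
for q in (0.5, 0.8, 0.95):       # increasing interclass correlation
    print(f"P(same class as previous) = {q:.2f} -> "
          f"mean system content = {mean_system_content(0.3, q, serv):.2f}")
```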