346 research outputs found

    Resource dimensioning in a mixed traffic environment

    Get PDF
    An important goal of modern data networks is to support multiple applications over a single network infrastructure. The combination of data, voice, video and conference traffic, each requiring a unique Quality of Service (QoS), makes resource dimensioning a very challenging task. To guarantee QoS by mere over-provisioning of bandwidth is not viable in the long run, as network resources are expensive. The aim of proper resource dimensioning is to provide the required QoS while making optimal use of the allocated bandwidth. Dimensioning parameters used by service providers today are based on best practice recommendations, and are not necessarily optimal. This dissertation focuses on resource dimensioning for the DiffServ network architecture. Four predefined traffic classes, i.e. Real Time (RT), Interactive Business (IB), Bulk Business (BB) and General Data (GD), needed to be dimensioned in terms of bandwidth allocation and traffic regulation. To perform this task, a study was made of the DiffServ mechanism and the QoS requirements of each class. Traffic generators were required for each class to perform simulations. Our investigations show that the dominating Transport Layer protocol for the RT class is UDP, while TCP is mostly used by the other classes. This led to a separate analysis and requirement for traffic models for UDP and TCP traffic. Analysis of real-world data shows that modern network traffic is characterized by long-range dependency, self-similarity and a very bursty nature. Our evaluation of various traffic models indicates that the Multi-fractal Wavelet Model (MWM) is best for TCP due to its ability to capture long-range dependency and self-similarity. The Markov Modulated Poisson Process (MMPP) is able to model occasional long OFF-periods and burstiness present in UDP traffic. Hence, these two models were used in simulations. A test bed was implemented to evaluate performance of the four traffic classes defined in DiffServ. Traffic was sent through the test bed, while delay and loss was measured. For single class simulations, dimensioning values were obtained while conforming to the QoS specifications. Multi-class simulations investigated the effects of statistical multiplexing on the obtained values. Simulation results for various numerical provisioning factors (PF) were obtained. These factors are used to determine the link data rate as a function of the required average bandwidth and QoS. The use of class-based differentiation for QoS showed that strict delay and loss bounds can be guaranteed, even in the presence of very high (up to 90%) bandwidth utilization. Simulation results showed small deviations from best practice recommendation PF values: A value of 4 is currently used for both RT and IB classes, while 2 is used for the BB class. This dissertation indicates that 3.89 for RT, 3.81 for IB and 2.48 for BB achieve the prescribed QoS more accurately. It was concluded that either the bandwidth distribution among classes, or quality guarantees for the BB class should be adjusted since the RT and IB classes over-performed while BB under-performed. The results contribute to the process of resource dimensioning by adding value to dimensioning parameters through simulation rather than mere intuition or educated guessing.Dissertation (MEng (Electronic Engineering))--University of Pretoria, 2007.Electrical, Electronic and Computer Engineeringunrestricte

    Low-Impact Profiling of Streaming, Heterogeneous Applications

    Get PDF
    Computer engineers are continually faced with the task of translating improvements in fabrication process technology: i.e., Moore\u27s Law) into architectures that allow computer scientists to accelerate application performance. As feature-size continues to shrink, architects of commodity processors are designing increasingly more cores on a chip. While additional cores can operate independently with some tasks: e.g. the OS and user tasks), many applications see little to no improvement from adding more processor cores alone. For many applications, heterogeneous systems offer a path toward higher performance. Significant performance and power gains have been realized by combining specialized processors: e.g., Field-Programmable Gate Arrays, Graphics Processing Units) with general purpose multi-core processors. Heterogeneous applications need to be programmed differently than traditional software. One approach, stream processing, fits these systems particularly well because of the segmented memories and explicit expression of parallelism. Unfortunately, debugging and performance tools that support streaming, heterogeneous applications do not exist. This dissertation presents TimeTrial, a performance measurement system that enables performance optimization of streaming applications by profiling the application deployed on a heterogeneous system. TimeTrial performs low-impact measurements by dedicating computing resources to monitoring and by aggressively compressing performance traces into statistical summaries guided by user specification of the performance queries of interest

    Workload characterization, modeling, and prediction in grid Computing

    Get PDF
    Workloads play an important role in experimental performance studies of computer systems. This thesis presents a comprehensive characterization of real workloads on production clusters and Grids. A variety of correlation structures and rich scaling behavior are identified in workload attributes such as job arrivals and run times, including pseudo-periodicity, long range dependence, and strong temporal locality. Based on the analytic results workload models are developed to fit the real data. For job arrivals three different kinds of autocorrelations are investigated. For short to middle range dependent data, Markov modulated Poisson processes (MMPP) are good models because they can capture correlations between interarrival times while remaining analytically tractable. For long range dependent and multifractal processes, the multifractal wavelet model (MWM) is able to reconstruct the scaling behavior and it provides a coherent wavelet framework for analysis and synthesis. Pseudo-periodicity is a special kind of autocorrelation and it can be modeled by a matching pursuit approach. For workload attributes such as run time a new model is proposed that can fit not only the marginal distribution but also the second order statistics such as the autocorrelation function (ACF). The development of workload models enable the simulation studies of Grid scheduling strategies. By using the synthetic traces, the performance impacts of workload correlations in Grid scheduling is quantitatively evaluated. The results indicate that autocorrelations in workload attributes can cause performance degradation, in some situations the difference can be up to several orders of magnitude. The larger the autocorrelation, the worse the performance, it is proved both at the cluster and Grid level. This study shows the importance of realistic workload models in performance evaluation studies. Regarding performance predictions, this thesis treats the targeted resources as a ``black box'' and takes a statistical approach. It is shown that statistical learning based methods, after a well-thought and fine-tuned design, are able to deliver good accuracy and performance.UBL - phd migration 201

    Weibull mixture model to characterise end-to-end Internet delay at coarse time-scales

    Get PDF
    Traces collected at monitored points around the Internet contain representative performance information about the paths their probes traverse. Basic measurement attributes, such as delay and loss, are easy to collect and provide a means to both build and validate empirical performance models. However, the task of analysis and extracting performance conclusions from measurements remains challenging. Ideally, performance modelling aims to find a set of self-contained parameters to describe, summarise, profile and easy display network performance status at a time. This can result in the provision of meaningful information to address applications in fault and performance management, hence providing input to network provisioning, traffic engineering and performance prediction. In this work we present the Weibull Mixture Model, a method to characterise endto- end network delay measurements within a few simple, accurate, representative and handleable parameters using a finite combination of Weibull distributions, with all the aforementioned benefits. The model parameters are related tomeaningful delay characteristics, such as average peak and tail behaviour in a daily profile, and can be optimally found using an iterative algorithm known as Expectation Maximisation. Studies on such parameter evolution can reflect current workload status and all possible network events impacting packet dynamics, with further applications in network management. In addition, a self-sufficient procedure to implement the Weibull Mixture Model is presented, along with a set of matching examples to real GPS synchronised measurements taken across the Internet, donated by RIPE NCC

    A flow-based model for Internet backbone traffic

    Get PDF
    We model traffic on an uncongested backbone link of an IP network using Poisson Shot-noise process and M/G/∞\infty queue. We validate the model by simulation. We analyze the model accuracy with real traffic traces collected on the Sprint IP backbone network. We show that despite its simplicity, our model provides a good approximation of the real traffic observed on OC-3 links. This model is also very easy to use and requires few simple parameters to be input

    Dynamical Modeling of Cloud Applications for Runtime Performance Management

    Get PDF
    Cloud computing has quickly grown to become an essential component in many modern-day software applications. It allows consumers, such as a provider of some web service, to quickly and on demand obtain the necessary computational resources to run their applications. It is desirable for these service providers to keep the running cost of their cloud application low while adhering to various performance constraints. This is made difficult due to the dynamics imposed by, e.g., resource contentions or changing arrival rate of users, and the fact that there exist multiple ways of influencing the performance of a running cloud application. To facilitate decision making in this environment, performance models can be introduced that relate the workload and different actions to important performance metrics.In this thesis, such performance models of cloud applications are studied. In particular, we focus on modeling using queueing theory and on the fluid model for approximating the often intractable dynamics of the queue lengths. First, existing results on how the fluid model can be obtained from the mean-field approximation of a closed queueing network are simplified and extended to allow for mixed networks. The queues are allowed to follow the processor sharing or delay disciplines, and can have multiple classes with phase-type service times. An improvement to this fluid model is then presented to increase accuracy when the \emph{system size}, i.e., number of servers, initial population, and arrival rate, is small. Furthermore, a closed-form approximation of the response time CDF is presented. The methods are tested in a series of simulation experiments and shown to be accurate. This mean-field fluid model is then used to derive a general fluid model for microservices with interservice delays. The model is shown to be completely extractable at runtime in a distributed fashion. It is further evaluated on a simple microservice application and found to accurately predict important performance metrics in most cases. Furthermore, a method is devised to reduce the cost of a running application by tuning load balancing parameters between replicas. The method is built on gradient stepping by applying automatic differentiation to the fluid model. This allows for arbitrarily defined cost functions and constraints, most notably including different response time percentiles. The method is tested on a simple application distributed over multiple computing clusters and is shown to reduce costs while adhering to percentile constraints. Finally, modeling of request cloning is studied using the novel concept of synchronized service. This allows certain forms of cloning over servers, each modeled with a single queue, to be equivalently expressed as one single queue. The concept is very general regarding the involved queueing discipline and distributions, but instead introduces new, less realistic assumptions. How the equivalent queue model is affected by relaxing these assumptions is studied considering the processor sharing discipline, and an extension to enable modeling of speculative execution is made. In a simulation campaign, it is shown that these relaxations only has a minor effect in certain cases

    Dependence-driven techniques in system design

    Get PDF
    Burstiness in workloads is often found in multi-tier architectures, storage systems, and communication networks. This feature is extremely important in system design because it can significantly degrade system performance and availability. This dissertation focuses on how to use knowledge of burstiness to develop new techniques and tools for performance prediction, scheduling, and resource allocation under bursty workload conditions.;For multi-tier enterprise systems, burstiness in the service times is catastrophic for performance. Via detailed experimentation, we identify the cause of performance degradation on the persistent bottleneck switch among various servers. This results in an unstable behavior that cannot be captured by existing capacity planning models. In this dissertation, beyond identifying the cause and effects of bottleneck switch in multi-tier systems, we also propose modifications to the classic TPC-W benchmark to emulate bursty arrivals in multi-tier systems.;This dissertation also demonstrates how burstiness can be used to improve system performance. Two dependence-driven scheduling policies, SWAP and ALoC, are developed. These general scheduling policies counteract burstiness in workloads and maintain high availability by delaying selected requests that contribute to burstiness. Extensive experiments show that both SWAP and ALoC achieve good estimates of service times based on the knowledge of burstiness in the service process. as a result, SWAP successfully approximates the shortest job first (SJF) scheduling without requiring a priori information of job service times. ALoC adaptively controls system load by infinitely delaying only a small fraction of the incoming requests.;The knowledge of burstiness can also be used to forecast the length of idle intervals in storage systems. In practice, background activities are scheduled during system idle times. The scheduling of background jobs is crucial in terms of the performance degradation of foreground jobs and the utilization of idle times. In this dissertation, new background scheduling schemes are designed to determine when and for how long idle times can be used for serving background jobs, without violating predefined performance targets of foreground jobs. Extensive trace-driven simulation results illustrate that the proposed schemes are effective and robust in a wide range of system conditions. Furthermore, if there is burstiness within idle times, then maintenance features like disk scrubbing and intra-disk data redundancy can be successfully scheduled as background activities during idle times
    • …
    corecore