    Technical Report: A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters

    To improve customer experience, datacenter operators offer support for simplifying application and resource management. For example, running workloads of workflows on behalf of customers is desirable, but requires increasingly sophisticated autoscaling policies, that is, policies that dynamically provision resources for the customer. Although selecting and tuning autoscaling policies is a challenging task for datacenter operators, so far relatively few studies have investigated the performance of autoscaling for workloads of workflows. Complementing previous knowledge, in this work we propose the first comprehensive performance study in the field. Using trace-based simulation, we compare state-of-the-art autoscaling policies across multiple application domains, workload arrival patterns (e.g., burstiness), and system utilization levels. We further investigate the interplay between autoscaling and regular allocation policies, and the complexity cost of autoscaling. Our quantitative study focuses not only on traditional performance metrics and on state-of-the-art elasticity metrics, but also on time- and memory-related autoscaling-complexity metrics. Our main results give strong, quantitative evidence about previously unreported operational behavior, for example, that autoscaling policies perform differently across application domains, and by how much they differ. (Technical report for the CCGrid 2018 submission "A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters.")
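
    The study's core loop, replaying a demand trace through competing autoscaling policies and comparing the resulting metrics, can be illustrated compactly. The Python below is a minimal sketch under assumed simplifications (a per-interval demand trace, a toy headroom-based policy, and invented metric names); it is not the paper's simulator.

        from dataclasses import dataclass

        @dataclass
        class HeadroomPolicy:
            """Toy reactive policy: provision for observed demand plus headroom."""
            headroom: float = 1.2  # over-provisioning factor (assumed, not from the paper)

            def decide(self, demand: int) -> int:
                return max(1, round(demand * self.headroom))

        def simulate(trace: list[int], policy: HeadroomPolicy) -> dict:
            """Replay a per-interval demand trace and record simple elasticity metrics."""
            allocated = 1
            under = over = cost = 0
            for demand in trace:
                under += max(0, demand - allocated)  # unmet demand (SLO risk)
                over += max(0, allocated - demand)   # wasted capacity
                cost += allocated                    # resource-intervals consumed
                allocated = policy.decide(demand)    # provision for the next interval
            return {"under": under, "over": over, "cost": cost}

        trace = [3, 5, 20, 18, 4, 4, 30, 2]  # synthetic bursty demand, tasks/interval
        for headroom in (1.0, 1.2, 1.5):
            print(headroom, simulate(trace, HeadroomPolicy(headroom)))

    Running the same trace under several policy configurations, as above, is the kind of side-by-side comparison a trace-based study performs at scale across domains and utilization levels.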

    Autoscaling Method for Docker Swarm Towards Bursty Workload

    The autoscaling mechanism of cloud computing can automatically adjust computing resources according to user needs, improving quality of service (QoS) and avoiding over-provisioning. However, traditional autoscaling methods suffer from oscillation and QoS degradation when dealing with burstiness, so an autoscaling algorithm should account for the effect of bursty workloads. In this paper, we propose AmRP, a novel autoscaling method that combines reactive and proactive mechanisms: proactive scaling launches some containers in advance, and a reactive module then performs vertical scaling on the existing containers to add resources rapidly. Our method also integrates burst detection to alleviate oscillation in the scaling algorithm and improve QoS. Finally, we evaluated our approach against state-of-the-art baseline scaling methods under different workloads in a Docker Swarm cluster. The experimental results show that, compared with the baseline methods, AmRP incurs fewer SLA violations when dealing with bursty workloads, at a lower resource cost.
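
    As an illustration of how reactive and proactive mechanisms can be combined in one decision step, consider the sketch below. The thresholds, the naive mean forecaster, and the burst test are all assumptions for illustration, not AmRP's published algorithm.

        import statistics

        def is_burst(history: list[float], factor: float = 2.0) -> bool:
            """Flag a burst when the newest sample far exceeds the recent mean."""
            return len(history) > 1 and history[-1] > factor * statistics.mean(history[:-1])

        def decide(history: list[float], containers: int, cpu_per_container: float):
            """Return (container count, per-container CPU) for the next interval."""
            forecast = statistics.mean(history[-5:])              # naive proactive forecast
            target = max(1, round(forecast / cpu_per_container))
            if is_burst(history):
                # Reactive path: vertical scaling of existing containers adds
                # resources faster than launching new containers mid-burst.
                return containers, cpu_per_container * 1.5
            return target, cpu_per_container                      # proactive horizontal path

        # During a burst the method grows containers vertically instead of oscillating:
        print(decide([10, 12, 11, 13, 40], containers=3, cpu_per_container=5))

    The design point this illustrates is the division of labour: the proactive path sets the container count from a forecast, while the reactive path only resizes what is already running, which is what makes it fast enough for bursts.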

    Burst-aware predictive autoscaling for containerized microservices

    Autoscaling methods are used by cloud-hosted applications to dynamically scale the allocated resources and guarantee Quality-of-Service (QoS). Public-facing applications serve dynamic workloads that contain bursts, which challenge autoscaling methods to maintain application performance: existing state-of-the-art autoscaling methods are burst-oblivious when determining and provisioning the appropriate resources, and for dynamic workloads it is hard to detect and handle bursts online. In this article, we propose a novel burst-aware autoscaling method that detects bursts in dynamic workloads and combines workload forecasting, resource prediction, and scaling decision making while minimizing response-time service-level objective (SLO) violations. We evaluated our approach through trace-driven simulation, using multiple synthetic and realistic bursty workloads for containerized microservices, and it improves performance compared with existing state-of-the-art autoscaling methods: an increase of ×1.09 in total processed requests, a reduction of ×5.17 in SLO violations, and a cost increase of only ×0.767 relative to the baseline method. This work was partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitiveness (TIN2015-65316-P and IJCI2016-27485), and the Generalitat de Catalunya (2014-SGR-1051).
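
    One way to realize such burst awareness is to compare observed demand against a forecast and add slack when the two diverge. The sketch below uses an EWMA forecaster and an illustrative burst threshold; both are assumptions, not the paper's actual predictor or decision logic.

        import math

        def ewma_forecast(series: list[float], alpha: float = 0.3) -> float:
            """One-step-ahead forecast via exponentially weighted moving average."""
            level = series[0]
            for x in series[1:]:
                level = alpha * x + (1 - alpha) * level
            return level

        def replicas_needed(series: list[float], rps_per_replica: float,
                            slack: float = 1.1, burst_factor: float = 1.8) -> int:
            """Size the deployment from the forecast; add extra slack on a burst."""
            forecast = ewma_forecast(series)
            burst = series[-1] > burst_factor * forecast   # observed demand >> predicted
            demand = max(series[-1], forecast) * (slack * 1.5 if burst else slack)
            return max(1, math.ceil(demand / rps_per_replica))

        print(replicas_needed([100, 110, 105, 400], rps_per_replica=50))  # burst -> 14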

    How does it function? Characterizing long-term trends in production serverless workloads

    This paper releases and analyzes two new Huawei cloud serverless traces. The traces span a period of over 7 months, with over 1.4 trillion function invocations combined. The first trace is derived from Huawei's internal workloads and contains detailed per-second statistics for 200 functions running across multiple Huawei cloud data centers. The second trace is a representative workload from Huawei's public FaaS platform; it contains per-minute arrival rates for over 5,000 functions running in a single Huawei data center. We present the internals of a production FaaS platform by characterizing resource consumption, cold-start times, programming languages used, periodicity, per-second versus per-minute burstiness, correlations, and popularity. Our findings show considerable diversity in how serverless functions behave: request rates vary by up to 9 orders of magnitude across functions, with some functions executed over 1 billion times per day; scheduling-time, execution-time, and cold-start distributions vary across 2 to 4 orders of magnitude and have very long tails; and function invocation counts demonstrate strong periodicity, both for many individual functions and at the aggregate level. Our analysis also highlights the need for further research on estimating resource reservations and on time-series prediction to account for the huge diversity in how serverless functions behave.
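
    The per-second versus per-minute burstiness contrast the paper characterizes can be reproduced on synthetic data: aggregating invocations into coarser windows smooths bursts, which a simple coefficient-of-variation index makes visible. The trace below is synthetic, not Huawei data.

        import numpy as np

        rng = np.random.default_rng(0)
        per_second = rng.poisson(5, size=3600).astype(float)  # one hour of arrivals
        per_second[::300] += 200                              # inject a burst every 5 min

        def cv(x: np.ndarray) -> float:
            """Coefficient of variation (std/mean), a simple burstiness index."""
            return float(x.std() / x.mean())

        per_minute = per_second.reshape(60, 60).sum(axis=1)   # aggregate to 1-min bins
        print(f"CV per second: {cv(per_second):.2f}")         # high: bursts are visible
        print(f"CV per minute: {cv(per_minute):.2f}")         # lower: bursts average out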

    Model-based analytics for profiling workloads in virtual network function


    Performance modelling with adaptive hidden Markov models and discriminatory processor sharing queues

    In modern computer systems, workload varies across times and locations, so it is important to model the performance of such systems via workload models that are both representative and efficient. For example, model-generated workloads represent realistic system behaviour, especially during peak times, when it is crucial to predict and address performance bottlenecks. In this thesis, we model performance, namely throughput and delay, using adaptive models and discrete queues. Hidden Markov models (HMMs) parsimoniously capture the correlation and burstiness of workloads with spatiotemporal characteristics. By adapting the batch training of standard HMMs to incremental learning, online HMMs act as benchmarks on workloads obtained from live systems (i.e. storage systems and financial markets) and reduce the time complexity of the Baum-Welch algorithm. Similarly, by extending HMM capabilities to train on multiple traces simultaneously, workloads of different types are modelled in parallel by a multi-input HMM. Typically, the HMM-generated traces verify the throughput and burstiness of the real data. Applications of adaptive HMMs include predicting user behaviour in social networks and performance-energy measurements in smartphone applications.

    Equally important is measuring system delay through response times. For example, workloads such as Internet traffic arriving at routers are affected by queueing delays. To meet quality-of-service needs, queueing delays must be minimised; hence, it is important to model and predict such delays in an efficient and cost-effective manner. Therefore, we propose a class of discrete processor-sharing queues for approximating queueing delay as response-time distributions, which represent service-level agreements at specific spatiotemporal levels. We adapt discrete queues to model job arrivals with distributions given by a Markov-modulated Poisson process (MMPP), served under discriminatory processor-sharing scheduling. Further, we propose a dynamic service-allocation strategy to minimise delays in UDP traffic flows whilst maximising a utility function.
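
    As a concrete illustration of the arrival model the thesis adopts, the sketch below generates per-slot arrival counts from a two-state Markov-modulated Poisson process (MMPP); the rates and transition probabilities are illustrative, not fitted parameters.

        import numpy as np

        rng = np.random.default_rng(1)
        rates = np.array([2.0, 20.0])        # arrivals/slot in the calm vs bursty state
        P = np.array([[0.95, 0.05],          # slot-to-slot transition probabilities
                      [0.20, 0.80]])         # of the modulating Markov chain

        def mmpp_arrivals(n_slots: int, state: int = 0) -> np.ndarray:
            """Per-slot arrival counts: Poisson draws with a Markov-modulated rate."""
            counts = np.empty(n_slots, dtype=int)
            for t in range(n_slots):
                counts[t] = rng.poisson(rates[state])
                state = rng.choice(2, p=P[state])   # advance the modulating chain
            return counts

        trace = mmpp_arrivals(1000)
        # Burstiness shows up as variance well above the mean (a plain Poisson
        # process of the same mean would have variance == mean).
        print(trace.mean(), trace.var())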