    Adaptive microservice scaling for elastic applications

    Technical Report: A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters

    To improve customer experience, datacenter operators offer support for simplifying application and resource management. For example, running workloads of workflows on behalf of customers is desirable, but requires increasingly sophisticated autoscaling policies, that is, policies that dynamically provision resources for the customer. Although selecting and tuning autoscaling policies is a challenging task for datacenter operators, so far relatively few studies investigate the performance of autoscaling for workloads of workflows. Complementing previous knowledge, in this work we propose the first comprehensive performance study in the field. Using trace-based simulation, we compare state-of-the-art autoscaling policies across multiple application domains, workload arrival patterns (e.g., burstiness), and system utilization levels. We further investigate the interplay between autoscaling and regular allocation policies, and the complexity cost of autoscaling. Our quantitative study focuses not only on traditional performance metrics and on state-of-the-art elasticity metrics, but also on time- and memory-related autoscaling-complexity metrics. Our main results give strong and quantitative evidence about previously unreported operational behavior, for example, that autoscaling policies perform differently across application domains and by how much they differ. Comment: Technical Report for the CCGrid 2018 submission "A Trace-Based Performance Study of Autoscaling Workloads of Workflows in Datacenters".

    Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost Efficiency

    Autoscaling is a technology that automatically scales the resources provided to applications without human intervention, guaranteeing runtime Quality of Service (QoS) while saving costs. However, user-facing cloud applications serve dynamic workloads that often exhibit variability and contain bursts, which makes it hard for autoscaling to keep QoS within Service-Level Objectives (SLOs). Conservative strategies risk over-provisioning, while aggressive ones may cause SLO violations, making it more challenging to design effective autoscaling. This paper introduces BAScaler, a Burst-Aware Autoscaling framework for containerized cloud services or applications under complex workloads, combining multi-level machine learning (ML) techniques to mitigate SLO violations while saving costs. BAScaler incorporates a novel prediction-based burst detection mechanism that distinguishes between predictable periodic workload spikes and actual bursts. When bursts are detected, BAScaler appropriately overestimates them and allocates resources accordingly to address the rapid growth in resource demand. On the other hand, BAScaler employs reinforcement learning to rectify potential inaccuracies in resource estimation, enabling more precise resource allocation during non-burst periods. Experiments across ten real-world workloads demonstrate BAScaler's effectiveness, achieving a 57% average reduction in SLO violations and cutting resource costs by 10% compared to other prominent methods.
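
    The idea of separating predictable periodic spikes from actual bursts can be illustrated with a minimal sketch. This is not BAScaler's implementation: the seasonal-naive forecast, the `tolerance` and `headroom` parameters, and all function names are assumptions chosen for illustration, standing in for the ML predictors the abstract mentions.

```python
from typing import List


def seasonal_naive_forecast(history: List[float], period: int) -> float:
    """Predict the next value as the value observed one period earlier.

    Hypothetical stand-in for a learned workload predictor.
    """
    if len(history) < period:
        raise ValueError("need at least one full period of history")
    return history[-period]


def is_burst(observed: float, predicted: float, tolerance: float = 0.5) -> bool:
    """Flag a burst when demand exceeds the forecast by more than `tolerance`.

    A spike that the forecast already anticipated stays below the threshold
    and is therefore treated as a predictable periodic peak, not a burst.
    """
    return observed > predicted * (1.0 + tolerance)


def provision(observed: float, predicted: float, headroom: float = 0.3) -> float:
    """Overestimate demand during a burst; otherwise cover the larger of
    the forecast and the current observation."""
    if is_burst(observed, predicted):
        return observed * (1.0 + headroom)
    return max(observed, predicted)


# Request rates with a spike every third interval.
history = [10.0, 10.0, 50.0, 10.0, 10.0]
predicted = seasonal_naive_forecast(history, period=3)
print(is_burst(52.0, predicted))   # periodic spike the forecast anticipated
print(is_burst(120.0, predicted))  # demand far above the forecast
```

    In this sketch the threshold is fixed; a real system would have to tune it against forecast error, which is the kind of inaccuracy the abstract says is corrected with reinforcement learning.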

    Theta-Scan: Leveraging behavior-driven forecasting for vertical auto-scaling in container cloud

    Detection of behavior patterns on resource usage in containerized Cloud applications is necessary for proper resource provisioning. Applications can use CPU/Memory with repetitive patterns, following a trend over time independently. By identifying such patterns, resource forecasting models can be fit better, reducing over/under-provisioning via fewer resizing operations. Here we present ThetaScan, a time-series analysis method for vertical auto-scaling of containers in the Cloud, based on the detection of stationarity/trending and periodicity on resource consumption. Our method leverages the Theta Forecaster algorithm with deseasonalization that, in our provisioning scenario, only requires the estimated periodicity for resource consumption as principal hyper-parameter. Commonly used behavior detection methods require manual hyper-parameter tuning, making them infeasible for automation. In addition, our method can be used at multiple scales (minute/hour/day), detecting hourly and daily patterns to improve resource usage prediction. Experiments show that we can detect behaviors in resource consumption that common methods miss, without requiring extensive manual tuning. Compared to fixed-size scheduling, we reduce resizing triggers by about 10%–15% and reduce over-provisioning of CPU and memory through periodicity-based provisioning. Also a ~60% on multiscale resource forecasting for traces showing periodicity at different levels with respect to single-scale. This work has been partially supported by the Spanish Government (contract PID2019-107255GB) and by Generalitat de Catalunya (contract 2014-SGR-1051). Peer Reviewed. Postprint (author's final draft).
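
    Since the abstract names the estimated periodicity as the principal hyper-parameter, a minimal sketch of one common way to estimate it may help. It picks the strongest peak of the sample autocorrelation function. This is an assumed technique for illustration, not necessarily the detection procedure used by ThetaScan; the function names and the `min_strength` threshold are hypothetical.

```python
import math
from typing import List, Optional


def autocorrelation(series: List[float], lag: int) -> float:
    """Biased sample autocorrelation of `series` at the given lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0.0:
        return 0.0  # a flat series has no periodic structure
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def estimate_period(
    series: List[float], max_lag: int, min_strength: float = 0.3
) -> Optional[int]:
    """Return the lag of the strongest autocorrelation peak, or None.

    A lag counts as a peak when its autocorrelation is a local maximum;
    peaks weaker than `min_strength` are ignored. `max_lag` must be
    smaller than the series length.
    """
    acf = [autocorrelation(series, lag) for lag in range(max_lag + 1)]
    best_lag, best_value = None, min_strength
    for lag in range(1, max_lag):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > best_value:
            best_lag, best_value = lag, acf[lag]
    return best_lag


# Synthetic hourly CPU usage with a daily cycle, four days long.
cpu_usage = [50 + 30 * math.sin(2 * math.pi * t / 24) for t in range(96)]
print(estimate_period(cpu_usage, max_lag=48))
```

    The estimated lag could then be supplied as the seasonal period to a Theta forecaster implementation. Running the same estimate on minute-, hour-, and day-resampled traces would give one period per scale.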

    Autoscaling Method for Docker Swarm Towards Bursty Workload

    The autoscaling mechanism of cloud computing can automatically adjust computing resources according to user needs, improve quality of service (QoS), and avoid over-provisioning. However, traditional autoscaling methods suffer from oscillation and degradation of QoS when dealing with burstiness. Therefore, the autoscaling algorithm should consider the effect of bursty workloads. In this paper, we propose AmRP, a novel autoscaling method that combines reactive and proactive mechanisms: proactive scaling launches some containers in advance, and the reactive module then performs vertical scaling on the existing containers to increase resources rapidly. Our method also integrates burst detection to alleviate the oscillation of the scaling algorithm and improve the QoS. Finally, we evaluated our approach against state-of-the-art baseline scaling methods under different workloads in a Docker Swarm cluster. Compared with the baseline methods, the experimental results show that AmRP has fewer SLA violations when dealing with bursty workloads, and its resource cost is also lower.
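
    The split between proactive horizontal scaling and reactive vertical scaling can be sketched as follows. This is an illustration under stated assumptions, not AmRP's algorithm: it assumes a container's throughput grows linearly with its CPU limit, and every name and parameter (`capacity_per_container`, `base_cpu`, `max_cpu`) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingDecision:
    replicas: int             # containers launched ahead of time
    cpu_per_container: float  # CPU limit in cores for each container


def proactive_replicas(predicted_load: float, capacity_per_container: float) -> int:
    """Horizontal step: size the replica count from the load forecast."""
    return max(1, math.ceil(predicted_load / capacity_per_container))


def reactive_cpu(
    current_load: float,
    replicas: int,
    capacity_per_container: float,
    base_cpu: float,
    max_cpu: float,
) -> float:
    """Vertical step: raise the CPU limit of existing containers when the
    observed load per container exceeds what one container handles at
    `base_cpu`. Assumes throughput scales linearly with CPU."""
    load_per_container = current_load / replicas
    factor = max(1.0, load_per_container / capacity_per_container)
    return min(max_cpu, base_cpu * factor)


def decide(
    predicted_load: float,
    current_load: float,
    capacity_per_container: float = 100.0,
    base_cpu: float = 1.0,
    max_cpu: float = 2.0,
) -> ScalingDecision:
    """Combine both steps into one scaling decision."""
    replicas = proactive_replicas(predicted_load, capacity_per_container)
    cpu = reactive_cpu(
        current_load, replicas, capacity_per_container, base_cpu, max_cpu
    )
    return ScalingDecision(replicas=replicas, cpu_per_container=cpu)


# Forecast of 450 req/s, but a burst brings 800 req/s.
print(decide(predicted_load=450.0, current_load=800.0))
```

    Resizing running containers avoids the start-up delay of new ones, which is why the reactive path in this sketch changes CPU limits rather than the replica count.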