Analytically-Driven Resource Management for Cloud-Native Microservices

Author: Delimitrou Christina
Elnikety Sameh
Zhang Yanqi
Zhou Zhuangzhuang
Publication date: 05/01/2024

Resource management for cloud-native microservices has attracted considerable recent attention. Previous work has shown that machine learning (ML)-driven approaches outperform traditional techniques, such as autoscaling, in terms of both SLA maintenance and resource efficiency. However, ML-driven approaches also face challenges, including lengthy data collection processes and limited scalability. We present Ursa, a lightweight resource management system for cloud-native microservices that addresses these challenges. Ursa uses an analytical model that decomposes the end-to-end SLA into per-service SLAs, and maps each per-service SLA to individual resource allocations per microservice tier. To speed up the exploration process and avoid prolonged SLA violations, Ursa explores each microservice individually, and swiftly stops exploration if latency exceeds its SLA. We evaluate Ursa on a set of representative end-to-end microservice topologies, including a social network, a media service, and a video processing pipeline, each consisting of multiple classes and priorities of requests with different SLAs, and compare it against two representative ML-driven systems, Sinan and Firm. Compared to these ML-driven approaches, Ursa provides significant advantages: it shortens the data collection process by more than 128x, and its control plane is 43x faster than ML-driven approaches. At the same time, Ursa does not sacrifice resource efficiency or SLAs. During online deployment, Ursa reduces the SLA violation rate by 9.0% to 49.9%, and reduces CPU allocation by up to 86.2% compared to ML-driven approaches.

arXiv.org e-Print Archive

Elastic Resource Management in Distributed Clouds

Author: Guo Tian
Publication venue: ScholarWorks@UMass Amherst
Publication date: 14/11/2016

The ubiquitous nature of computing devices and their increasing reliance on remote resources have driven and shaped public cloud platforms into unprecedented large-scale, distributed data centers. Concurrently, a plethora of cloud-based applications are experiencing multi-dimensional workload dynamics---workload volumes that vary along both time and space axes and with higher frequency. The interplay of diverse workload characteristics and distributed clouds raises several key challenges for efficiently and dynamically managing server resources. First, current cloud platforms impose certain restrictions that might hinder some resource management tasks. Second, an application-agnostic approach might not entail appropriate performance goals, and numerous specific methods are therefore required. Third, provisioning resources outside the LAN boundary might incur large delays, which would impact the desired agility. In this dissertation, I investigate the above challenges and present the design of automated systems that manage resources for various applications in distributed clouds. The intermediate goal of these automated systems is to fully exploit potential benefits, such as reduced network latency, offered by increasingly distributed server resources. The ultimate goal is to improve end-to-end user response time with novel resource management approaches, within a certain cost budget. Centered around these two goals, I first investigate how to optimize the location and performance of virtual machines in distributed clouds. I use virtual desktops, mostly serving a single user, as an example use case for developing a black-box approach that ranks virtual machines based on their dynamic latency requirements. Those with high latency sensitivities have a higher priority of being placed in or migrated to a cloud location closest to their users.
Next, I relax the assumption of well-provisioned virtual machines and look at how to provision enough resources for applications that exhibit both temporal and spatial workload fluctuations. I propose an application-agnostic queueing model that captures the resource utilization and server response time. Building upon this model, I present a geo-elastic provisioning approach---referred to as geo-elasticity---for replicable multi-tier applications that can spin up an appropriate amount of server resources in any cloud location. Last, I explore the benefits of providing geo-elasticity for database clouds, a popular platform for hosting application backends. Performing geo-elastic provisioning for backend database servers entails several challenges that are specific to database workloads, and therefore requires tailored solutions. In addition, cloud platforms offer resources at various prices for different locations. To this end, I propose a cost-aware geo-elasticity that combines a regression-based workload model and a queueing network capacity model for database clouds. In summary, hosting a diverse set of applications in an increasingly distributed cloud makes it interesting and necessary to develop new, efficient, and dynamic resource management approaches.

ScholarWorks@UMass Amherst

Provisioning multi-tier cloud applications using statistical bounds on sojourn time

Author: Don Towsley
Prashant Shenoy
Upendra Sharma
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2012

In this paper we present a simple and effective approach for resource provisioning to achieve a percentile bound on the end-to-end response time of a multi-tier application. We first model the multi-tier application as an open tandem network of M/G/1-PS queues and develop a method that produces a near-optimal application configuration, i.e., the number of servers at each tier, to meet the percentile bound in a homogeneous server environment – using a single type of server. We then extend our solution to a K-server case, and our technique demonstrates good accuracy, independent of the variability of service times. Our approach demonstrates a provisioning error of no more than 3% compared to a 140% worst-case provisioning error obtained by techniques based on an M/M/1-FCFS queue model. In addition, we extend our approach to handle a heterogeneous server environment, i.e., with multiple types of servers. We find that fewer high-capacity servers are preferable for high-percentile provisioning. Finally, we extend our approach to account for the rental cost of each server type and compute a cost-efficient application configuration with savings of over 80%. We demonstrate the applicability of our approach in a real-world system by employing it to provision the two tiers of the Java implementation of TPC-W – a multi-tier transactional web benchmark that represents an e-commerce web application, i.e., an online bookstore.
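
The tandem-queue idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it provisions against the mean end-to-end response time rather than a percentile bound, assumes each tier's load is split evenly across identical servers, and uses a simple greedy search. It relies only on the standard M/G/1-PS result that the mean sojourn time is E[S] / (1 - rho), regardless of the service-time distribution. The function names and the example numbers are hypothetical.

```python
import math


def tier_mean_response(arrival_rate, mean_service, servers):
    """Mean sojourn time at one tier, modeling each server as an M/G/1-PS
    queue that receives an equal share of the tier's arrivals."""
    rho = arrival_rate * mean_service / servers  # per-server utilization
    if rho >= 1.0:
        return math.inf  # unstable: the queue grows without bound
    return mean_service / (1.0 - rho)


def end_to_end_mean(arrival_rate, mean_services, config):
    """Mean end-to-end response time of the tandem network (sum over tiers)."""
    return sum(
        tier_mean_response(arrival_rate, s, n)
        for s, n in zip(mean_services, config)
    )


def provision(arrival_rate, mean_services, target_mean):
    """Greedy heuristic (an assumption of this sketch, not from the paper):
    start from the smallest stable configuration and repeatedly add a server
    to the tier whose response time drops the most."""
    if target_mean <= sum(mean_services):
        raise ValueError("target is below the total bare service time")
    # Smallest server count per tier that keeps utilization below 1.
    config = [math.floor(arrival_rate * s) + 1 for s in mean_services]
    while end_to_end_mean(arrival_rate, mean_services, config) > target_mean:
        gains = [
            tier_mean_response(arrival_rate, s, n)
            - tier_mean_response(arrival_rate, s, n + 1)
            for s, n in zip(mean_services, config)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        config[best] += 1
    return config


if __name__ == "__main__":
    # Hypothetical two-tier workload: 100 req/s, 20 ms and 50 ms mean service.
    cfg = provision(100.0, [0.02, 0.05], target_mean=0.2)
    print(cfg, end_to_end_mean(100.0, [0.02, 0.05], cfg))
```

A percentile-based version would replace `end_to_end_mean` with a bound on the tail of the sojourn-time distribution, which is where the paper's contribution lies.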

CiteSeerX

Crossref

Coarse-Grain QoS-Aware Dynamic Instance Provisioning for Interactive Workload in the Cloud

Author: Jianxiong Wan
Jie Lv
Limin Liu
Zhiwei Xu
Publication venue: Hindawi Limited
Publication date: 01/01/2014

Crossref

CDOXplorer: Simulation-based genetic optimization of software deployment and reconfiguration in the cloud

Author: Fittkau Florian
Frey Sören
Hasselbring Wilhelm
Publication date: 01/01/2018

Migrating existing enterprise software to cloud platforms involves the comparison of various cloud deployment options (CDOs). A CDO comprises a combination of a specific cloud environment, deployment architecture, and runtime reconfiguration rules for dynamic resource scaling. Our simulator CDOSim can evaluate CDOs, e.g., regarding response times and costs. However, the design space to be searched for well-suited solutions is very large. In this paper, we approach this optimization problem with the novel genetic algorithm CDOXplorer. It uses techniques of the search-based software engineering field and simulations with CDOSim to assess the fitness of CDOs. An experimental evaluation that employs, among others, the cloud environments Amazon EC2 and Microsoft Windows Azure, shows that CDOXplorer can find solutions that surpass those of other state-of-the-art techniques by up to 60%. Our experiment code and data and an implementation of CDOXplorer are available as open source software.

MACAU: Open Access Repository of Kiel University

Allocation of Virtual Machines in Cloud Data Centers - A Survey of Problem Models and Optimization Algorithms

Author: Mann Zoltán Ádám
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2015

Data centers in public, private, and hybrid cloud settings make it possible to provision virtual machines (VMs) with unprecedented flexibility. However, purchasing, operating, and maintaining the underlying physical resources incurs significant monetary costs and also environmental impact. Therefore, cloud providers must optimize the usage of physical resources by a careful allocation of VMs to hosts, continuously balancing between the conflicting requirements on performance and operational costs. In recent years, several algorithms have been proposed for this important optimization problem. Unfortunately, the proposed approaches are hardly comparable because of subtle differences in the problem models used. This paper surveys the problem formulations and optimization algorithms used, highlighting their strengths and limitations, and pointing out the areas that need further research in the future.

CiteSeerX

Crossref

Repository of the Academy's Library

Model-based resource management for fine-grained services

Author: Gias Alim Ul
Publication venue: Computing, Imperial College London
Publication date: 01/08/2022

The emergence of DevOps has changed the way modern distributed software systems are developed. Architectures decomposed in fine-grained services, such as microservices or function-as-a-service (FaaS), are now widespread across many organizations. From a resource management perspective, although the systems built with such architectures have many benefits, there are still research challenges that need further attention. In this study, we have focused on three such challenges, each concerning a specific system resource: compute, memory, or storage. Firstly, we focus on scaling the capacity of microservices at runtime. Here, the challenge is to design an autoscaler that can decide between vertical and horizontal scaling options to distribute the CPU capacity. Secondly, we focus on estimating the required capacity of an on-premises FaaS platform such that the service level agreements (SLAs) for function response times are satisfied. The challenge here is to address the cold start dilemma, i.e., that a cold start delays a function response but reduces the memory consumption. Thus, we must find a limit of cold starts such that the memory-consumption remains in-check while satisfying the SLAs. Finally, we focus on the storage management for distributed tracing targeted at microservices. The volume of such traces generated in a data center can be in the scale of tens of terabytes per day, but only a small fraction of these traces is useful for troubleshooting. The objective then is to sample only the useful traces. The key to addressing all these challenges is first, modeling the dynamics concerning the resources and subsequently, leveraging the model in a resource controller. To address the first challenge, we have developed an autoscaler ATOM that leverages layered queueing network (LQN) models to take its scaling decisions. Our experiment, with a real-life application, shows that ATOM produces 30-37% better results than the baseline autoscalers. 
For the second challenge, we have developed COCOA, a cold start aware capacity planner. COCOA utilizes M/M/k setup and LQN models to assess the cold start scenario and estimate the required capacity. We show with simulation that COCOA can reduce over-provisioning by over 70% compared to the availability aware approaches. Finally, addressing the third challenge, we propose SampleHST, a trace sampler that works under a storage budget constraint. SampleHST relies on either bag of words or graph-based models to represent a trace and groups similar traces using online clustering to perform sampling. We have evaluated the performance of SampleHST using data from both literature and production, which shows it produces 1.2x to 19x better results than the state-of-the-art.
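
As a rough illustration of the queueing-based capacity estimation mentioned here, the sketch below sizes a plain M/M/k queue with the Erlang C formula. This is a simplification and not COCOA itself: the M/M/k setup model used in the thesis additionally accounts for setup (cold start) delays, which this sketch ignores, and the LQN part is not modeled at all. The function names and the waiting-probability threshold are hypothetical.

```python
import math


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arriving request has to wait in an M/M/k queue."""
    offered_load = arrival_rate / service_rate  # in Erlangs
    if offered_load >= servers:
        return 1.0  # unstable system: every request eventually waits
    # Erlang B via the standard numerically stable recursion.
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = offered_load * blocking / (n + offered_load * blocking)
    rho = offered_load / servers
    # Convert Erlang B to Erlang C.
    return blocking / (1.0 - rho * (1.0 - blocking))


def min_servers(arrival_rate, service_rate, max_wait_probability):
    """Smallest k such that the probability of waiting is within the limit."""
    if not 0.0 < max_wait_probability < 1.0:
        raise ValueError("max_wait_probability must be strictly between 0 and 1")
    k = math.floor(arrival_rate / service_rate) + 1  # smallest stable k
    while erlang_c(arrival_rate, service_rate, k) > max_wait_probability:
        k += 1
    return k


if __name__ == "__main__":
    # Hypothetical function workload: 40 invocations/s, 10 served/s per instance.
    print(min_servers(40.0, 10.0, 0.05))
```

A setup-aware planner would add a term for instances that are still starting up, which generally pushes the required capacity above what this plain M/M/k estimate suggests.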

City Research Online

Spiral - Imperial College Digital Repository

CLOUD RESOURCE MANAGEMENT USING A HIERARCHICAL DECENTRALIZED FRAMEWORK

Author: Hummaida Abdul
Publication date: 31/12/2022

The University of Manchester - Institutional Repository

COSCO: container orchestration using co-simulation and gradient based optimization for fog computing environments

Author: Casale G
Jennings NR
Poojara S
Srirama SN
Tuli S
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 03/06/2021

Intelligent task placement and management of tasks in large-scale fog platforms is challenging due to the highly volatile nature of modern workload applications and sensitive user requirements of low energy consumption and response time. Container orchestration platforms have emerged to alleviate this problem with prior art either using heuristics to quickly reach scheduling decisions or AI driven methods like reinforcement learning and evolutionary approaches to adapt to dynamic scenarios. The former often fail to quickly adapt in highly dynamic environments, whereas the latter have run-times that are slow enough to negatively impact response time. Therefore, there is a need for scheduling policies that are both reactive to work efficiently in volatile environments and have low scheduling overheads. To achieve this, we propose a Gradient Based Optimization Strategy using Back-propagation of gradients with respect to Input (GOBI). Further, we leverage the accuracy of predictive digital-twin models and simulation capabilities by developing a Coupled Simulation and Container Orchestration Framework (COSCO). Using this, we create a hybrid simulation driven decision approach, GOBI*, to optimize Quality of Service (QoS) parameters. Co-simulation and the back-propagation approaches allow these methods to adapt quickly in volatile environments. Experiments conducted using real-world data on fog applications using the GOBI and GOBI* methods, show a significant improvement in terms of energy consumption, response time, Service Level Objective and scheduling time by up to 15, 40, 4, and 82 percent respectively when compared to the state-of-the-art algorithms
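
The core idea of optimizing with gradients taken with respect to the input, rather than the weights, can be sketched as follows. This is only an illustration under strong assumptions, not GOBI itself: the surrogate is a tiny fixed network with made-up weights (`W`, `B`, `V` are hypothetical), the input is a two-element vector standing in for a relaxed scheduling decision constrained to [0, 1], and the backward pass is written by hand instead of using an autodiff library.

```python
import math

# Hypothetical, fixed surrogate: cost(x) = V . tanh(W x + B).
# In a real system these weights would come from a trained QoS model.
W = [[1.0, -0.5], [0.3, 0.8]]
B = [0.1, -0.2]
V = [0.7, 0.4]


def forward(x):
    """Return the predicted cost and the hidden activations for input x."""
    hidden = [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
        for row, b in zip(W, B)
    ]
    cost = sum(v * h for v, h in zip(V, hidden))
    return cost, hidden


def input_gradient(x):
    """Back-propagate the cost to the input: d cost / d x_j."""
    _, hidden = forward(x)
    return [
        sum(V[i] * (1.0 - hidden[i] ** 2) * W[i][j] for i in range(len(W)))
        for j in range(len(x))
    ]


def optimize_input(x0, learning_rate=0.1, steps=200):
    """Projected gradient descent on the input; weights stay frozen.
    Returns the lowest-cost input seen and its predicted cost."""
    x = list(x0)
    best_x, best_cost = list(x), forward(x)[0]
    for _ in range(steps):
        grad = input_gradient(x)
        # Gradient step, then clip back into the feasible box [0, 1].
        x = [min(1.0, max(0.0, xj - learning_rate * gj)) for xj, gj in zip(x, grad)]
        cost = forward(x)[0]
        if cost < best_cost:
            best_x, best_cost = list(x), cost
    return best_x, best_cost


if __name__ == "__main__":
    start = [0.5, 0.5]
    print(forward(start)[0], optimize_input(start))
```

Because only the input is updated, each optimization step costs one forward and one backward pass through the surrogate, which is consistent with the low scheduling overhead the abstract emphasizes.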

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

A control-theoretic approach towards joint admission control and resource allocation of cloud computing services

Author: Almeida
Ardagna
Bamieh
Blanchini
Boyd
Cherkasova
Gandhi
Khalil
Kusic
Liu
Ljung
Poussot-Vassal
Qin
Tóth
Vandenberghe
Wang
Wellstead
Wheelwright
Xie
Publication venue: Wiley

Crossref