Search CORE

245 research outputs found

Self-adaptive trade-off decision making for autoscaling cloud-based services

Author: Bahsoon R
Chen T
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/11/2015

Elasticity in the cloud is often achieved by on-demand autoscaling. In this context, the goal is to optimize the Quality of Service (QoS) and cost objectives of cloud-based services. However, the difficulty lies in the fact that these objectives, e.g., throughput and cost, can naturally conflict, and that the QoS of cloud-based services is often subject to interference due to the shared infrastructure in the cloud. Consequently, dynamic and effective trade-off decision making for autoscaling in the cloud is necessary, yet challenging. In particular, it is even harder to achieve well-compromised trade-offs, where the decision largely improves the majority of the objectives while causing relatively small degradations to the others. In this paper, we present a self-adaptive decision making approach for autoscaling in the cloud. It can adaptively produce autoscaling decisions that lead to well-compromised trade-offs without heavy human intervention. We leverage ant colony inspired multi-objective optimization to search for and optimize the trade-off decisions; the result is then filtered by compromise-dominance, a mechanism that extracts the decisions with balanced improvements in the trade-offs. We experimentally compare our approach to four state-of-the-art autoscaling approaches: rule-based, heuristic, randomized, and multi-objective genetic algorithm based solutions. The results reveal the effectiveness of our approach over the others, including better quality of trade-offs and significantly smaller violations of the requirements.
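
To make the idea of filtering trade-off decisions concrete, the following is a minimal sketch of a plain Pareto-dominance filter over candidate autoscaling decisions. It is not the compromise-dominance mechanism from the paper, whose definition is not given in the abstract; it only illustrates the simpler, standard notion of discarding dominated candidates. The `(throughput, cost)` tuples, the function names, and the numbers are all hypothetical.

```python
from typing import List, Tuple

# A candidate decision is summarized as (throughput, cost).
# Assumption for this sketch: throughput is maximized, cost is minimized.
Decision = Tuple[float, float]


def dominates(a: Decision, b: Decision) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Decision]) -> List[Decision]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidates produced by some search procedure.
candidates = [(100.0, 4.0), (120.0, 6.0), (90.0, 5.0), (120.0, 7.0)]
front = pareto_front(candidates)
print(front)
```

A filter such as compromise-dominance would then choose among the surviving candidates; how it does so is described in the paper itself rather than here.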

arXiv.org e-Print Archive

Crossref

Loughborough University Institutional Repository

University of Birmingham Research Portal

Nottingham Trent Institutional Repository (IRep)

Self-adaptive trade-off decision making for autoscaling cloud-based services

Author: Rami Bahsoon (7401395)
Tao Chen (7050698)
Publication date: 11/11/2015

Loughborough University Institutional Repository

Self-aware and self-adaptive autoscaling for cloud based services

Author: Chen Tao
Publication date: 01/07/2016

Modern Internet services increasingly leverage cloud computing for flexible, elastic and on-demand provision. Typically, the Quality of Service (QoS) of cloud-based services can be tuned using different underlying cloud configurations and resources, e.g., number of threads, CPU and memory, which are shared, leased and priced as utilities. This benefit is fundamentally grounded in autoscaling: an automatic and elastic process that adapts cloud configurations on demand according to time-varying workloads. This thesis proposes a holistic cloud autoscaling framework to effectively and seamlessly address existing challenges related to different logical aspects of autoscaling, including architecting the autoscaling system, modelling the QoS of cloud-based services, determining the granularity of control and making trade-off autoscaling decisions. The framework takes advantage of the principles of self-awareness and the related algorithms to adaptively handle the dynamics, uncertainties, QoS interference and trade-offs on objectives that are exhibited in the cloud. The major benefit is that, by leveraging the framework, cloud autoscaling can be effectively achieved without heavy human analysis and design-time knowledge. Through various experiments using the RUBiS benchmark and a realistic workload in a real cloud setting, this thesis evaluates the effectiveness of the framework based on various quality indicators and compares it with other state-of-the-art approaches.

arXiv.org e-Print Archive

University of Birmingham Research Archive, E-theses Repository

The handbook of engineering self-aware and self-expressive systems

Author: Bahsoon Rami
Chen Tao
Esterle Lukas
Faniyi Funmilade
Minku Leandro L.
Lewis Peter R.
Yao Xin
Publication date: 05/09/2014

When faced with the task of designing and implementing a new self-aware and self-expressive computing system, researchers and practitioners need a set of guidelines on how to use the concepts and foundations developed in the Engineering Proprioception in Computing Systems (EPiCS) project. This report provides such guidelines on how to design self-aware and self-expressive computing systems in a principled way. We have documented different categories of self-awareness and self-expression levels using architectural patterns. We have also documented common architectural primitives, their possible candidate techniques and attributes for architecting self-aware and self-expressive systems. Drawing on the knowledge obtained from the previous investigations, we proposed a pattern-driven methodology for engineering self-aware and self-expressive systems to assist in utilising the patterns and primitives during design. The methodology contains detailed guidance for making decisions with respect to the possible design alternatives, providing a systematic way to build self-aware and self-expressive systems. Then, we qualitatively and quantitatively evaluated the methodology using two case studies. The results reveal that our pattern-driven methodology covers the main aspects of engineering self-aware and self-expressive systems, and that the resulting systems perform significantly better than the non-self-aware systems.

arXiv.org e-Print Archive

Aston Publications Explorer

Leicester Research Archive

Control Strategies for Improving Cloud Service Robustness

Author: Dürango Jonas
Publication venue: Department of Automatic Control, Lund Institute of Technology, Lund University
Publication date: 14/06/2016

This thesis addresses challenges in increasing the robustness of cloud-deployed applications and services to unexpected events and dynamic workloads. Without precautions, hardware failures and unpredictable large traffic variations can quickly degrade the performance of an application due to a mismatch between provisioned resources and capacity needs. Similarly, disasters, such as power outages and fire, are unexpected events on a larger scale that threaten the integrity of the underlying infrastructure on which an application is deployed. First, the self-adaptive software concept of brownout is extended to replicated cloud applications. By monitoring the performance of each application replica, brownout is able to counteract temporary overload situations by reducing the computational complexity of jobs entering the system. To avoid existing load balancers interfering with the brownout functionality, brownout-aware load balancers are introduced. Simulation experiments show that the proposed load balancers outperform existing load balancers in providing a high quality of service to as many end users as possible. Experiments in a testbed environment further show how a replicated brownout-enabled application is able to maintain high performance during overloads as compared to its non-brownout equivalent. Next, a feedback controller for cloud autoscaling is introduced. Using a novel way of modeling the dynamics of a typical cloud application, a mechanism similar to the classical Smith predictor is presented to compensate for delays in reconfiguring resource provisioning. Simulation experiments show that the feedback controller is able to achieve faster control of the response times of a cloud application as compared to a threshold-based controller. Finally, a solution for handling the trade-off between performance and disaster tolerance for geo-replicated cloud applications is introduced. An automated mechanism for differentiating application traffic and replication traffic, and for dynamically managing their bandwidth allocations using an MPC controller, is presented and evaluated in simulation. Comparisons with commonly used static approaches reveal that, in overload situations, the proposed solution provides increased flexibility in managing the trade-off between performance and data consistency.
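
For readers unfamiliar with the threshold-based baseline mentioned in the abstract, a minimal sketch of such a rule is shown below. It is a generic illustration, not the controller evaluated in the thesis; the function name, the thresholds (in seconds) and the replica bounds are all assumed values.

```python
def threshold_scale(
    replicas: int,
    response_time: float,
    upper: float = 1.0,      # assumed scale-out threshold, in seconds
    lower: float = 0.3,      # assumed scale-in threshold, in seconds
    min_replicas: int = 1,
    max_replicas: int = 10,
) -> int:
    """Return the new replica count for one control interval.

    Adds one replica when the measured response time exceeds `upper`,
    removes one when it falls below `lower`, and otherwise holds steady.
    """
    if response_time > upper:
        return min(replicas + 1, max_replicas)
    if response_time < lower:
        return max(replicas - 1, min_replicas)
    return replicas


# Hypothetical measurements over four control intervals.
replicas = 2
for measured in (1.5, 1.2, 0.5, 0.1):
    replicas = threshold_scale(replicas, measured)
print(replicas)
```

A rule like this reacts only to the latest measurement and ignores the delay before new capacity becomes available, which is the gap a delay-compensating feedback controller aims to close.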

Lund University Publications

IPA: Inference Pipeline Adaptation to Achieve High Accuracy and Cost-Efficiency

Author: Doyle Joseph
Ghafouri Saeid
Jamshidi Pooyan
Lorido-Botran Tania
Razavi Kamran
Salmani Mehran
Sanaee Alireza
Wang Lin
Publication date: 24/08/2023

Efficiently optimizing multi-model inference pipelines for fast, accurate, and cost-effective inference is a crucial challenge in ML production systems, given their tight end-to-end latency requirements. To simplify the exploration of the vast and intricate trade-off space of accuracy and cost in inference pipelines, providers frequently opt to consider only one of them. However, the challenge lies in reconciling accuracy and cost trade-offs. To address this challenge and propose a solution to efficiently manage model variants in inference pipelines, we present IPA, an online deep-learning Inference Pipeline Adaptation system that efficiently leverages model variants for each deep learning task. Model variants are different versions of pre-trained models for the same deep learning task with variations in resource requirements, latency, and accuracy. IPA dynamically configures batch size, replication, and model variants to optimize accuracy, minimize costs, and meet user-defined latency SLAs using Integer Programming. It supports multi-objective settings for achieving different trade-offs between accuracy and cost objectives while remaining adaptable to varying workloads and dynamic traffic patterns. Extensive experiments on a Kubernetes implementation with five real-world inference pipelines demonstrate that IPA improves normalized accuracy by up to 35% with a minimal cost increase of less than 5%.
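
The kind of configuration problem described above can be illustrated with a small sketch. IPA itself uses Integer Programming; the code below instead enumerates every option by brute force for a single pipeline stage, and it leaves out batch size. The variant table, the throughput model (one request at a time per replica), the cost weight and all names are assumptions made for illustration only.

```python
from typing import Optional, Tuple

# Hypothetical model variants:
# name -> (accuracy, latency per request in ms, CPU cores per replica)
VARIANTS = {
    "small": (0.70, 20.0, 1),
    "large": (0.80, 50.0, 2),
}


def best_config(
    sla_ms: float,
    demand_rps: float,
    max_replicas: int = 4,
    cost_weight: float = 0.01,
) -> Optional[Tuple[str, int]]:
    """Pick the (variant, replicas) pair maximizing accuracy minus weighted
    cost, subject to a latency SLA and a throughput demand.
    Returns None when no configuration is feasible."""
    best, best_score = None, float("-inf")
    for name, (accuracy, latency_ms, cores) in VARIANTS.items():
        if latency_ms > sla_ms:
            continue  # violates the latency SLA
        for replicas in range(1, max_replicas + 1):
            # Assumed model: each replica serves requests sequentially.
            throughput = replicas * 1000.0 / latency_ms
            if throughput < demand_rps:
                continue  # cannot keep up with the workload
            score = accuracy - cost_weight * cores * replicas
            if score > best_score:
                best, best_score = (name, replicas), score
    return best


print(best_config(sla_ms=60.0, demand_rps=60.0))
print(best_config(sla_ms=30.0, demand_rps=60.0))
```

Changing `cost_weight` shifts the balance between accuracy and cost, which loosely mirrors the multi-objective settings mentioned in the abstract.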

arXiv.org e-Print Archive