572 research outputs found
A study on performance measures for auto-scaling CPU-intensive containerized applications
Autoscaling of containers can leverage performance measures from the different layers of the computational stack. This paper investigates the problem of selecting the most appropriate performance measure for activating auto-scaling actions that aim to guarantee QoS constraints. First, the correlation between absolute and relative usage measures, and how a resource allocation decision can be influenced by them, is analyzed under different workload scenarios. Absolute and relative measures can assume quite different values: the former account for the actual utilization of resources in the host system, while the latter account for the share each container has of the resources used. Then, the performance of a variant of Kubernetes' auto-scaling algorithm, which transparently uses absolute usage measures to scale containers in/out, is evaluated through a wide set of experiments. Finally, a detailed analysis of the state of the art is presented.
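The rule evaluated here follows the general shape of Kubernetes' Horizontal Pod Autoscaler: replicas are scaled in proportion to the ratio of observed to target utilization. A minimal sketch of such a rule, which could be fed either an absolute (host-level) or a relative (per-container share) measure, might look as follows; function and parameter names are illustrative, not taken from the paper:

```python
import math

def desired_replicas(current_replicas: int,
                     observed_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """HPA-style proportional scaling rule.

    observed_utilization may be an absolute measure (CPU actually used
    on the host) or a relative one (share of the container's quota);
    the paper studies how that choice changes the decision.
    """
    ratio = observed_utilization / target_utilization
    # Within the tolerance band, do nothing to avoid oscillation.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return max(1, math.ceil(current_replicas * ratio))

# Absolute usage at 80% of a core against a 50% target -> scale out.
print(desired_replicas(3, 0.8, 0.5))  # 5
```

The same workload can yield different ratios depending on which measure is plugged in, which is exactly the sensitivity the paper analyzes.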
Scanflow-K8s: agent-based framework for autonomic management and supervision of ML workflows in Kubernetes clusters
Machine Learning (ML) projects currently rely heavily on workflows composed of reproducible steps and executed as containerized pipelines to build or deploy ML models efficiently, given the flexibility, portability, and fast delivery they bring to the ML life-cycle. However, deployed models need to be watched and constantly managed, supervised, and debugged to guarantee their availability, validity, and robustness in unexpected situations. Containerized ML workflows would therefore benefit from flexible and diverse autonomic capabilities. This work presents an architecture for autonomic ML workflows with multi-layered control, based on an agent-based approach that enables autonomic management and supervision of ML workflows at the application layer and the infrastructure layer (by collaborating with the orchestrator). We redesign the Scanflow ML framework to support such a multi-agent approach using triggers, primitives, and strategies. We also implement a practical platform, called Scanflow-K8s, that enables autonomic ML workflows on Kubernetes clusters based on the Scanflow agents. MNIST image classification and MLPerf ImageNet classification benchmarks are used as case studies to show the capabilities of Scanflow-K8s under different scenarios. The experimental results demonstrate the feasibility and effectiveness of the proposed agent approach and the Scanflow-K8s platform for the autonomic management of ML workflows in Kubernetes clusters at multiple layers. This work was supported by Lenovo as part of the Lenovo-BSC 2020 collaboration agreement, by the Spanish Government under contract PID2019-107255GB-C22, and by the Generalitat de Catalunya under contract 2017-SGR-1414 and grant 2020 FI-B 00257.
Burst-aware predictive autoscaling for containerized microservices
Autoscaling methods dynamically scale the resources allocated to cloud-hosted applications to guarantee Quality-of-Service (QoS). Public-facing applications serve dynamic workloads that contain bursts, which pose challenges for autoscaling methods trying to ensure application performance. Existing state-of-the-art autoscaling methods are burst-oblivious when determining and provisioning the appropriate resources, and for dynamic workloads it is hard to detect and handle bursts online while maintaining application performance. In this article, we propose a novel burst-aware autoscaling method that detects bursts in dynamic workloads using workload forecasting, resource prediction, and scaling decision making, while minimizing response-time service-level objective (SLO) violations. We evaluated our approach through trace-driven simulation, using multiple synthetic and realistic bursty workloads for containerized microservices, and improved performance compared with existing state-of-the-art autoscaling methods. The experiments show a ×1.09 increase in total processed requests, a ×5.17 reduction in SLO violations, and a ×0.767 increase in cost compared to the baseline method. This work was partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitiveness (TIN2015-65316-P and IJCI2016-27485), and the Generalitat de Catalunya (2014-SGR-1051).
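The pipeline described (forecast the workload, detect a burst, then decide the scaling action) can be sketched in a few lines. This is a toy illustration under simple assumptions (moving-average forecast, fixed burst threshold, fixed per-replica capacity), not the paper's actual detection or prediction models:

```python
import math
from statistics import mean

def detect_burst(history, window=5, threshold=1.5):
    """Flag a burst when the newest arrival rate exceeds the recent
    moving-average forecast by `threshold` times (illustrative rule)."""
    if len(history) <= window:
        return False
    forecast = mean(history[-window - 1:-1])
    return history[-1] > threshold * forecast

def replicas_needed(rate, capacity_per_replica, burst, headroom=1.3):
    """Provision for the observed rate, adding extra headroom on bursts
    so that response-time SLOs are less likely to be violated."""
    factor = headroom if burst else 1.0
    return max(1, math.ceil(rate * factor / capacity_per_replica))

load = [100, 105, 98, 102, 110, 240]   # requests/s, sudden spike at the end
burst = detect_burst(load)
print(burst, replicas_needed(load[-1], 50, burst))
```

A burst-oblivious scaler would provision only for the observed rate; the headroom factor is what makes the sketch "burst-aware".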
Hybrid Cloud Workload Monitoring as a Service
Cloud computing and cloud-based hosting have become embedded in our daily lives. It is imperative for cloud providers to ensure that all services used by enterprises and consumers have high availability and elasticity to prevent downtime, which negatively impacts any business. To ensure cloud infrastructures work reliably, cloud monitoring becomes an essential need for both the provider and the consumer. This thesis project reports on the need for efficient, scalable monitoring, enumerating the types of metrics of interest to be collected. Current understanding of the various architectures designed to collect, store, and process monitoring data into useful insight is surveyed. The pros and cons of each architecture, and when each should be used based on deployment style and strategy, are also covered in the survey. Finally, the essential characteristics of a cloud monitoring system, primarily the features that operationalize an efficient monitoring framework, are provided as part of this review. While it is apparent that embedded and decentralized architectures are the current favorites in industry, service-oriented architectures are gaining traction. This project aims to build a lightweight, scalable, embedded monitoring tool that collects metrics at different layers of the cloud stack and aims to correlate resource consumption between layers. Future research can be conducted on efficient machine learning models applied to the monitoring data to predict resource-usage spikes pre-emptively.
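The cross-layer correlation the tool aims for is, at its simplest, a Pearson correlation between metric series sampled at different layers of the stack. A minimal sketch (the metric names and sample values are invented for illustration):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equally long metric series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Container-layer vs. host-layer CPU samples taken at the same instants.
container_cpu = [0.2, 0.4, 0.5, 0.7, 0.9]
host_cpu      = [0.3, 0.5, 0.55, 0.8, 0.95]
print(round(pearson(container_cpu, host_cpu), 3))
```

A correlation near 1 suggests the container dominates host usage; a weak correlation points at interference from co-located workloads.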
Container Resource Allocation versus Performance of Data-intensive Applications on Different Cloud Servers
In recent years, data-intensive applications have been increasingly deployed
on cloud systems. Such applications utilize significant compute, memory, and
I/O resources to process large volumes of data. Optimizing the performance and
cost-efficiency for such applications is a non-trivial problem. The problem
becomes even more challenging with the increasing use of containers, which are
popular due to their lower operational overheads and faster boot speed at the
cost of weaker resource assurances for the hosted applications. In this paper,
two containerized data-intensive applications with very different performance
objectives and resource needs were studied on cloud servers with Docker
containers running on Intel Xeon E5 and AMD EPYC Rome multi-core processors
with a range of CPU, memory, and I/O configurations. Primary findings from our
experiments include: 1) Allocating multiple cores to a compute-intensive
application can improve performance, but only if the cores do not contend for
the same caches, and the optimal core counts depend on the specific workload;
2) allocating more memory to a memory-intensive application than its
deterministic data workload does not further improve performance; however, 3)
having multiple such memory-intensive containers on the same server can lead to
cache and memory bus contention leading to significant and volatile performance
degradation. The comparative observations on Intel and AMD servers provided
insights into the trade-offs between larger numbers of distributed chiplets
interconnected by higher-speed buses (AMD) and larger numbers of centrally
integrated cores and caches with slower buses (Intel). For the two types of
applications studied, the more distributed caches and faster data buses
benefited the deployment of larger numbers of containers.
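The allocations varied in these experiments map directly onto Docker's run-time resource flags (`--cpuset-cpus`, `--memory`, `--blkio-weight`). A small sketch of assembling such an invocation; the helper, image name, and values are illustrative, not from the paper:

```python
def docker_run_command(image, cpuset=None, memory=None, blkio_weight=None):
    """Build a `docker run` invocation with explicit CPU, memory, and
    I/O allocations of the kind varied in the experiments above."""
    parts = ["docker", "run", "-d"]
    if cpuset is not None:
        # Pin to specific cores, e.g. cores that do not share a cache,
        # which the findings suggest matters for compute-bound workloads.
        parts.append(f"--cpuset-cpus={cpuset}")
    if memory is not None:
        parts.append(f"--memory={memory}")
    if blkio_weight is not None:
        parts.append(f"--blkio-weight={blkio_weight}")
    parts.append(image)
    return " ".join(parts)

print(docker_run_command("myapp:latest", cpuset="0,2", memory="8g"))
```

Choosing the `--cpuset-cpus` value requires knowing the host topology (e.g. from `lscpu`), since core IDs sharing a cache differ between the Intel and AMD layouts compared here.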
RobotKube: Orchestrating Large-Scale Cooperative Multi-Robot Systems with Kubernetes and ROS
Modern cyber-physical systems (CPS) such as Cooperative Intelligent Transport
Systems (C-ITS) are increasingly defined by the software which operates these
systems. In practice, microservice architectures can be employed, which may
consist of containerized microservices running in a cluster comprised of robots
and supporting infrastructure. These microservices need to be orchestrated
dynamically according to ever-changing requirements posed to the system.
Additionally, these systems are embedded in DevOps processes aiming at
continually updating and upgrading both the capabilities of CPS components and
of the system as a whole. In this paper, we present RobotKube, an approach to
orchestrating containerized microservices for large-scale cooperative
multi-robot CPS based on Kubernetes. We describe how to automate the
orchestration of software across a CPS, and include the possibility to monitor
and selectively store relevant accruing data. In this context, we present two
main components of such a system: an event detector capable of, e.g.,
requesting the deployment of additional applications, and an application
manager capable of automatically configuring the required changes in the
Kubernetes cluster. By combining the widely adopted Kubernetes platform with
the Robot Operating System (ROS), we enable the use of standard tools and
practices for developing, deploying, scaling, and monitoring microservices in
C-ITS. We demonstrate and evaluate RobotKube in an exemplary and reproducible
use case that we make publicly available at
https://github.com/ika-rwth-aachen/robotkube.
Comment: 7 pages, 2 figures, 2 tables; accepted for publication at the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC), Bilbao, Spain, September 24-28, 2023.
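The two components named above, an event detector that requests deployments and an application manager that configures the cluster, can be sketched as a pair of functions. The event rule, field names, and manifest contents are invented for illustration and are not RobotKube's actual API:

```python
import json

def detect_event(robot_states, speed_limit=0.5):
    """Toy event detector: request a recording application once any
    robot exceeds a speed threshold (rule and fields are illustrative)."""
    for state in robot_states:
        if state["speed"] > speed_limit:
            return {"action": "deploy", "app": "rosbag-recorder",
                    "target": state["id"]}
    return None

def to_manifest(request):
    """Toy application-manager step: turn a deployment request into a
    minimal Kubernetes Deployment manifest to apply to the cluster."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f'{request["app"]}-{request["target"]}'},
        "spec": {"replicas": 1},
    }

event = detect_event([{"id": "robot1", "speed": 0.2},
                      {"id": "robot2", "speed": 0.9}])
print(json.dumps(to_manifest(event)))
```

In the real system the manager would reconcile such manifests through the Kubernetes API server rather than merely emit them.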
Enabling modular design of an application-level auto-scaling and orchestration framework using TOSCA-based application description templates
This paper presents a novel approach to writing TOSCA templates for application reusability and portability in a modular auto-scaling and orchestration framework (MiCADO). The approach defines cloud resources as well as application containers in a flexible and generic way, and allows those definitions to be extended with properties specific to the container orchestrator chosen at deployment time. The approach is demonstrated in a proof of concept in which only a minor change to a previously used application template was required to achieve the successful deployment and lifecycle management of the popular web authoring tool WordPress on a new realization of the MiCADO framework featuring a different container orchestrator.
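The idea of a generic container definition extended with orchestrator-specific properties can be illustrated with a TOSCA-style fragment. This is a hedged sketch of the shape such a template might take; the node-type name, interface name, and values are examples, not taken from the paper's actual templates:

```yaml
# Illustrative MiCADO-style node template (names are examples only).
topology_template:
  node_templates:
    wordpress:
      type: tosca.nodes.MiCADO.Container.Application.Docker
      properties:            # generic, orchestrator-agnostic definition
        image: wordpress:latest
        ports:
          - port: 80
      interfaces:
        Kubernetes:          # orchestrator-specific extension, supplied
          create:            # only for the orchestrator chosen at
            inputs:          # deployment time
              spec:
                replicas: 2
```

Swapping orchestrators then amounts to replacing the `interfaces` section while the generic `properties` remain untouched, which is the "minor change" the proof of concept relies on.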
IoT@run-time: a model-based approach to support deployment and self-adaptations in IoT systems
Today, most Internet of Things (IoT) systems leverage edge and fog computing to meet increasingly restrictive requirements and improve quality of service (QoS). Although these multi-layer architectures can improve system performance, their design is challenging because the dynamic and changing IoT environment can impact the QoS and system operation. In this thesis, we propose a model-based approach that addresses the limitations of existing studies to support the design, deployment, and management of self-adaptive IoT systems. We have designed a domain-specific language (DSL) to specify the self-adaptive IoT system, a code generator that generates YAML manifests for the deployment of the IoT system, and a framework based on the MAPE-K loop to monitor and adapt the IoT system at runtime. Finally, we have conducted several experimental studies to validate the expressiveness and usability of the DSL and to evaluate the ability and performance of our framework to handle a growing number of concurrent adaptations on an IoT system.
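The MAPE-K loop underlying the framework (Monitor, Analyze, Plan, Execute over a shared Knowledge base) can be sketched generically. The instantiation below, scaling out when latency exceeds a QoS bound, is an invented toy example, not the thesis's actual adaptation logic:

```python
def mape_k_step(monitor, analyze, plan, execute, knowledge):
    """One pass of a MAPE-K loop: Monitor -> Analyze -> Plan -> Execute,
    all phases sharing a common Knowledge base."""
    symptoms = analyze(monitor(), knowledge)
    if symptoms:
        execute(plan(symptoms, knowledge))

# Toy instantiation: scale out when observed latency violates a QoS bound.
knowledge = {"max_latency_ms": 200, "replicas": 1}
actions = []

monitor = lambda: {"latency_ms": 350}
analyze = lambda m, k: (["qos_violation"]
                        if m["latency_ms"] > k["max_latency_ms"] else [])

def plan(symptoms, k):
    return {"set_replicas": k["replicas"] + 1}

def execute(adaptation):
    knowledge["replicas"] = adaptation["set_replicas"]
    actions.append(adaptation)

mape_k_step(monitor, analyze, plan, execute, knowledge)
print(knowledge["replicas"])  # 2
```

In the thesis's framework, Execute would ultimately re-apply generated YAML manifests rather than mutate an in-memory dictionary.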
A highly-available and scalable microservice architecture for access management
Access management is a key aspect of providing secure services and applications in information technology. Ensuring secure access is particularly challenging in a cloud environment in which resources are scaled dynamically. In fact, keeping track of dynamic cloud instances and administering access to them requires careful coordination and mechanisms to ensure reliable operation. PrivX is a commercial offering from SSH Communications Security Oyj that automatically scans and keeps track of cloud instances and manages access to them. PrivX is currently built on the microservices approach, in which the application is structured as a collection of loosely coupled services. However, PrivX requires external modules with specific capabilities to ensure high availability. Moreover, complex scripts are required to monitor the whole system.
The goal of this thesis is to make PrivX highly available and scalable by using a container orchestration framework. To this end, we first conduct a detailed study of the most widely used container orchestration frameworks: Kubernetes, Docker Swarm, and Nomad. We then select Kubernetes based on an evaluation of the features relevant to the considered scenario. We package the individual components of PrivX, including its database, into Docker containers and deploy them on a Kubernetes cluster. We also build a prototype system to demonstrate how microservices can be managed on a Kubernetes cluster. Additionally, an auto-scaling tool is created to scale specific services based on predefined rules. Finally, we evaluate the service recovery time for each of the services in PrivX, both in the RPM deployment model and in the prototype Kubernetes deployment model. We find no significant difference in service recovery time between the two models; however, Kubernetes ensured high availability of the services. We conclude that Kubernetes is the preferred mode for deploying PrivX, making it highly available and scalable.
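Scaling "specific services based on predefined rules" can be sketched as a small rule table consulted against current metrics. The rule structure, metric names, and thresholds below are illustrative, not the tool's actual configuration:

```python
def scaling_decision(metrics, rules):
    """Rule-based scaler: each rule is a (predicate, replica_delta) pair
    and the first matching rule wins (structure is illustrative)."""
    for predicate, delta in rules:
        if predicate(metrics):
            return delta
    return 0  # no rule matched: keep the current replica count

rules = [
    (lambda m: m["cpu"] > 0.8 or m["rps"] > 1000, +1),  # scale out
    (lambda m: m["cpu"] < 0.2 and m["rps"] < 100, -1),  # scale in
]

replicas = 2
replicas += scaling_decision({"cpu": 0.9, "rps": 400}, rules)
print(replicas)  # 3
```

In a Kubernetes deployment, applying the decision would amount to patching the target Deployment's replica count rather than updating a local variable.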