
    A study on performance measures for auto-scaling CPU-intensive containerized applications

    Autoscaling of containers can leverage performance measures from the different layers of the computational stack. This paper investigates the problem of selecting the most appropriate performance measure for activating auto-scaling actions that aim to guarantee QoS constraints. First, the correlation between absolute and relative usage measures, and how each can influence a resource allocation decision, is analyzed in different workload scenarios. Absolute and relative measures can assume quite different values: the former account for the actual utilization of resources in the host system, while the latter account for each container's share of the resources used. Then, the performance of a variant of Kubernetes' auto-scaling algorithm, which transparently uses absolute usage measures to scale containers in and out, is evaluated through a wide set of experiments. Finally, a detailed analysis of the state of the art is presented.
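
    To make the distinction concrete, the sketch below applies the replica-count rule documented for Kubernetes' Horizontal Pod Autoscaler (desiredReplicas = ceil(currentReplicas × currentUsage / targetUsage)) to a relative and an absolute utilization reading for the same workload; the figures are purely illustrative and not taken from the paper.

```python
import math

def desired_replicas(current_replicas: int, usage: float, target: float) -> int:
    """Kubernetes-HPA-style rule: scale replicas proportionally to the
    ratio between observed and target utilization."""
    return max(1, math.ceil(current_replicas * usage / target))

# Relative measure: usage as a share of the container's own CPU allocation.
# Absolute measure: usage as a share of the host's total CPU capacity.
# The same workload can therefore trigger opposite scaling decisions:
print(desired_replicas(4, usage=0.90, target=0.60))  # relative -> 6 replicas
print(desired_replicas(4, usage=0.30, target=0.60))  # absolute -> 2 replicas
```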

    Scanflow-K8s: agent-based framework for autonomic management and supervision of ML workflows in Kubernetes clusters

    Machine Learning (ML) projects are currently heavily based on workflows composed of reproducible steps and executed as containerized pipelines to build or deploy ML models efficiently, thanks to the flexibility, portability, and fast delivery they bring to the ML life-cycle. However, deployed models need to be watched and constantly managed, supervised, and debugged to guarantee their availability, validity, and robustness in unexpected situations. Therefore, containerized ML workflows would benefit from flexible and diverse autonomic capabilities. This work presents an architecture for autonomic ML workflows with multi-layered control, based on an agent-based approach that enables autonomic management and supervision of ML workflows at the application layer and the infrastructure layer (by collaborating with the orchestrator). We redesign the Scanflow ML framework to support such a multi-agent approach through triggers, primitives, and strategies. We also implement a practical platform, called Scanflow-K8s, that enables autonomic ML workflows on Kubernetes clusters based on the Scanflow agents. MNIST image classification and MLPerf ImageNet classification benchmarks are used as case studies to show the capabilities of Scanflow-K8s under different scenarios. The experimental results demonstrate the feasibility and effectiveness of our proposed agent approach and of the Scanflow-K8s platform for the autonomic management of ML workflows in Kubernetes clusters at multiple layers. This work was supported by Lenovo as part of the Lenovo-BSC 2020 collaboration agreement, by the Spanish Government under contract PID2019-107255GB-C22, and by the Generalitat de Catalunya under contract 2017-SGR-1414 and under grant 2020 FI-B 00257.
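
    The triggers, primitives, and strategies named in the abstract suggest a simple agent skeleton. The sketch below is a hypothetical illustration of that pattern (none of these names come from Scanflow's actual API): a supervision agent evaluates metric-based triggers and dispatches the matching strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Trigger:
    """A condition over monitored metrics that names a corrective strategy."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    strategy: str

class SupervisorAgent:
    """Minimal agent loop: evaluate triggers, dispatch matching strategies."""

    def __init__(self) -> None:
        self.triggers: List[Trigger] = []
        self.strategies: Dict[str, Callable[[], None]] = {}

    def register(self, trigger: Trigger, action: Callable[[], None]) -> None:
        self.triggers.append(trigger)
        self.strategies[trigger.strategy] = action

    def step(self, metrics: Dict[str, float]) -> None:
        for trigger in self.triggers:
            if trigger.condition(metrics):
                self.strategies[trigger.strategy]()  # run the primitive(s)

agent = SupervisorAgent()
agent.register(
    Trigger("accuracy-drop", lambda m: m["accuracy"] < 0.90, "retrain"),
    lambda: print("requesting model retraining"),
)
agent.step({"accuracy": 0.85})  # -> requesting model retraining
```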

    Adaptive microservice scaling for elastic applications


    Burst-aware predictive autoscaling for containerized microservices

    Autoscaling methods are used for cloud-hosted applications to dynamically scale the allocated resources and guarantee Quality-of-Service (QoS). Public-facing applications serve dynamic workloads that contain bursts, which pose challenges for autoscaling methods trying to ensure application performance. Existing state-of-the-art autoscaling methods are burst-oblivious when determining and provisioning the appropriate resources, and for dynamic workloads it is hard to detect and handle bursts online while maintaining application performance. In this article, we propose a novel burst-aware autoscaling method which detects bursts in dynamic workloads using workload forecasting, resource prediction, and scaling decision making, while minimizing response-time service-level objective (SLO) violations. We evaluated our approach through a trace-driven simulation using multiple synthetic and realistic bursty workloads for containerized microservices, improving performance over existing state-of-the-art autoscaling methods. The experiments show a 1.09× increase in total processed requests, a 5.17× reduction in SLO violations, and a 0.767× increase in cost compared to the baseline method. This work was partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitiveness (TIN2015-65316-P and IJCI2016-27485) and the Generalitat de Catalunya (2014-SGR-1051).
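
    One common way to detect bursts online, consistent with the forecasting-plus-decision pipeline described above, is to flag observations that deviate from the forecast by more than a few standard deviations of the recent forecast error. This is a generic sketch; the paper's actual detector and parameters may differ.

```python
import statistics

def detect_burst(observed: list, forecast: list, k: float = 3.0) -> bool:
    """Flag a burst when the newest observation exceeds its forecast by more
    than k standard deviations of the recent forecast errors."""
    errors = [o - f for o, f in zip(observed[:-1], forecast[:-1])]
    sigma = statistics.stdev(errors) if len(errors) > 1 else 0.0
    return observed[-1] - forecast[-1] > k * sigma

arrivals  = [100, 104, 98, 101, 250]   # requests/s; the last sample spikes
predicted = [101, 102, 100, 100, 103]  # e.g. from a time-series forecaster
print(detect_burst(arrivals, predicted))  # True -> provision extra capacity
```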

    Hybrid Cloud Workload Monitoring as a Service

    Cloud computing and cloud-based hosting have become embedded in our daily lives. It is imperative for cloud providers to ensure that all services used by enterprises and consumers have high availability and elasticity to prevent downtime, which negatively impacts any business. To ensure cloud infrastructures work reliably, cloud monitoring becomes an essential need for both the provider and the consumer. This thesis project reports on the need for efficient, scalable monitoring, enumerating the types of metrics of interest to be collected. Current understanding of the various architectures designed to collect, store, and process monitoring data into useful insight is surveyed. The pros and cons of each architecture, and when each should be used depending on deployment style and strategy, are also reported in the survey. Finally, the essential characteristics of a cloud monitoring system, primarily the features that operationalize an efficient monitoring framework, are provided as part of this review. While it is apparent that embedded and decentralized architectures are the current favorites in industry, service-oriented architectures are gaining traction. This project aims to build a lightweight, scalable, embedded monitoring tool that collects metrics at different layers of the cloud stack and aims at correlating resource consumption between layers. Future research can be conducted on efficient machine learning models applied to the monitoring data to predict resource usage spikes pre-emptively.
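
    A minimal example of collecting metrics at two layers of the stack for correlation, assuming a Linux host and a cgroup-v2 container runtime (on cgroup v1 the per-container counter lives in cpuacct.usage instead):

```python
import time

def host_cpu_jiffies() -> int:
    """Total CPU time consumed on the host, read from /proc/stat."""
    with open("/proc/stat") as f:
        return sum(int(x) for x in f.readline().split()[1:])

def container_cpu_usec(path: str = "/sys/fs/cgroup/cpu.stat") -> int:
    """CPU time charged to this container's cgroup (cgroup v2 layout)."""
    with open(path) as f:
        for line in f:
            key, value = line.split()
            if key == "usage_usec":
                return int(value)
    raise RuntimeError("usage_usec not found in cpu.stat")

# Sample both layers over the same window so the readings can be correlated.
h0, c0 = host_cpu_jiffies(), container_cpu_usec()
time.sleep(5)
h1, c1 = host_cpu_jiffies(), container_cpu_usec()
print(f"host: {h1 - h0} jiffies, container: {c1 - c0} microseconds")
```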

    Container Resource Allocation versus Performance of Data-intensive Applications on Different Cloud Servers

    In recent years, data-intensive applications have been increasingly deployed on cloud systems. Such applications utilize significant compute, memory, and I/O resources to process large volumes of data, and optimizing their performance and cost-efficiency is a non-trivial problem. The problem becomes even more challenging with the increasing use of containers, which are popular due to their lower operational overheads and faster boot speed, at the cost of weaker resource assurances for the hosted applications. In this paper, two containerized data-intensive applications with very different performance objectives and resource needs were studied on cloud servers with Docker containers running on Intel Xeon E5 and AMD EPYC Rome multi-core processors with a range of CPU, memory, and I/O configurations. Primary findings from our experiments include: 1) allocating multiple cores to a compute-intensive application can improve performance, but only if the cores do not contend for the same caches, and the optimal core counts depend on the specific workload; 2) allocating more memory to a memory-intensive application than its deterministic data workload requires does not further improve performance; however, 3) having multiple such memory-intensive containers on the same server can lead to cache and memory bus contention, causing significant and volatile performance degradation. The comparative observations on Intel and AMD servers provided insights into the trade-offs between larger numbers of distributed chiplets interconnected with higher-speed buses (AMD) and larger numbers of centrally integrated cores and caches with lower-speed buses (Intel). For the two types of applications studied, the more distributed caches and faster data buses benefited the deployment of larger numbers of containers.
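
    In practice, the core and memory allocations studied above map onto Docker's standard resource flags. The snippet below shows the idea with illustrative values (the image name, core IDs, and memory size are placeholders; the right choices depend on the host's cache topology and the application's working set):

```python
import subprocess

# Pin the container to cores that do not share cache and cap its memory near
# the working-set size. Values here are placeholders; inspect the host with
# lscpu before choosing core IDs.
cmd = [
    "docker", "run", "--rm",
    "--cpuset-cpus", "0,2,4,6",  # avoid sibling threads / contended caches
    "--memory", "8g",            # more than the data working set buys little
    "my-data-app:latest",
]
subprocess.run(cmd, check=True)
```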

    RobotKube: Orchestrating Large-Scale Cooperative Multi-Robot Systems with Kubernetes and ROS

    Modern cyber-physical systems (CPS) such as Cooperative Intelligent Transport Systems (C-ITS) are increasingly defined by the software which operates these systems. In practice, microservice architectures can be employed, which may consist of containerized microservices running in a cluster comprised of robots and supporting infrastructure. These microservices need to be orchestrated dynamically according to ever-changing requirements placed on the system. Additionally, these systems are embedded in DevOps processes aiming at continually updating and upgrading both the capabilities of CPS components and of the system as a whole. In this paper, we present RobotKube, an approach to orchestrating containerized microservices for large-scale cooperative multi-robot CPS based on Kubernetes. We describe how to automate the orchestration of software across a CPS, including the possibility to monitor and selectively store relevant accruing data. In this context, we present two main components of such a system: an event detector capable of, e.g., requesting the deployment of additional applications, and an application manager capable of automatically configuring the required changes in the Kubernetes cluster. By combining the widely adopted Kubernetes platform with the Robot Operating System (ROS), we enable the use of standard tools and practices for developing, deploying, scaling, and monitoring microservices in C-ITS. We demonstrate and evaluate RobotKube in an exemplary and reproducible use case that we make publicly available at https://github.com/ika-rwth-aachen/robotkube. (7 pages, 2 figures, 2 tables; accepted for publication at the 26th IEEE International Conference on Intelligent Transportation Systems (ITSC), Bilbao, Spain, September 24-28, 2023.)
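
    The event-detector/application-manager split can be pictured with the official Kubernetes Python client. The sketch below is only an illustrative analogue of the components named above (the event type, image, and function names are hypothetical; RobotKube's actual interfaces are documented in its repository):

```python
from kubernetes import client, config

def deploy_app(name: str, image: str, namespace: str = "default") -> None:
    """Application-manager step: apply a Deployment for the requested app."""
    config.load_kube_config()  # use load_incluster_config() inside a cluster
    body = client.V1Deployment(
        metadata=client.V1ObjectMeta(name=name),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector=client.V1LabelSelector(match_labels={"app": name}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": name}),
                spec=client.V1PodSpec(
                    containers=[client.V1Container(name=name, image=image)]
                ),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace, body)

def on_event(event: dict) -> None:
    """Event-detector hook: on a relevant event, request an extra service."""
    if event.get("type") == "near-collision":  # hypothetical event type
        deploy_app("rosbag-recorder", image="example/rosbag-recorder:latest")

on_event({"type": "near-collision"})
```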

    Enabling modular design of an application-level auto-scaling and orchestration framework using TOSCA-based application description templates

    This paper presents a novel approach to writing TOSCA templates for application reusability and portability in a modular auto-scaling and orchestration framework (MiCADO). The approach defines cloud resources as well as application containers in a flexible and generic way, and allows those definitions to be extended with specific properties related to the container orchestrator chosen at deployment time. The approach is demonstrated in a proof of concept where only a minor change to a previously used application template was required to achieve successful deployment and lifecycle management of the popular web authoring tool WordPress on a new realization of the MiCADO framework featuring a different container orchestrator.
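
    The generic-template-plus-orchestrator-specific-extension idea can be sketched as a merge of two descriptions. The structure below is hypothetical and simplified, not MiCADO's actual TOSCA schema, though tosca.nodes.Container.Application is a standard TOSCA node type:

```python
# Generic, orchestrator-agnostic description of the application container.
generic_template = {
    "wordpress": {
        "type": "tosca.nodes.Container.Application",
        "properties": {"image": "wordpress:latest", "ports": [80]},
    }
}

# Orchestrator-specific extension chosen at deployment time (hypothetical keys).
kubernetes_extension = {
    "wordpress": {
        "interfaces": {"Kubernetes": {"create": {"inputs": {"kind": "Deployment"}}}}
    }
}

def specialize(template: dict, extension: dict) -> dict:
    """Merge orchestrator-specific properties into the generic template."""
    merged = {name: dict(node) for name, node in template.items()}
    for name, extra in extension.items():
        merged.setdefault(name, {}).update(extra)
    return merged

print(specialize(generic_template, kubernetes_extension)["wordpress"])
```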

    IoT@run-time: a model-based approach to support deployment and self-adaptations in IoT systems

    Today, most Internet of Things (IoT) systems leverage edge and fog computing to meet increasingly restrictive requirements and improve quality of service (QoS). Although these multi-layer architectures can improve system performance, their design is challenging because the dynamic and changing IoT environment can impact the QoS and system operation. In this thesis, we propose a model-based approach that addresses the limitations of existing studies to support the design, deployment, and management of self-adaptive IoT systems. We have designed a domain-specific language (DSL) to specify the self-adaptive IoT system, a code generator that produces YAML manifests for the deployment of the IoT system, and a framework based on the MAPE-K loop to monitor and adapt the IoT system at runtime. Finally, we have conducted several experimental studies to validate the expressiveness and usability of the DSL and to evaluate the ability and performance of our framework to handle growing numbers of concurrent adaptations on an IoT system.
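
    A minimal sketch of the MAPE-K loop underlying the framework described above, with stubbed monitoring and execution steps (metric names, thresholds, and actions are illustrative, not the thesis's actual DSL):

```python
class MapeKLoop:
    """Skeleton of a MAPE-K loop; the knowledge base (K) holds the system
    model and thresholds that the other four phases consult."""

    def __init__(self, knowledge: dict):
        self.knowledge = knowledge

    def monitor(self) -> dict:
        return {"latency_ms": 220}  # stub: would query IoT/edge/fog probes

    def analyze(self, metrics: dict) -> bool:
        return metrics["latency_ms"] > self.knowledge["max_latency_ms"]

    def plan(self) -> dict:
        return {"action": "scale", "target": "edge-gateway", "replicas": 2}

    def execute(self, adaptation: dict) -> None:
        print(f"applying adaptation: {adaptation}")  # stub: would apply YAML

    def run_once(self) -> None:
        metrics = self.monitor()
        if self.analyze(metrics):
            self.execute(self.plan())

MapeKLoop({"max_latency_ms": 200}).run_once()
```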

    A highly-available and scalable microservice architecture for access management

    Access management is a key aspect of providing secure services and applications in information technology. Ensuring secure access is particularly challenging in a cloud environment wherein resources are scaled dynamically. In fact, keeping track of dynamic cloud instances and administering access to them requires careful coordination and mechanisms that ensure reliable operation. PrivX is a commercial offering from SSH Communications Security Oyj that automatically scans and keeps track of cloud instances and manages access to them. PrivX is currently built on the microservices approach, wherein the application is structured as a collection of loosely coupled services. However, PrivX requires external modules with specific capabilities to ensure high availability, and complex scripts are required to monitor the whole system. The goal of this thesis is to make PrivX highly available and scalable by using a container orchestration framework. To this end, we first conduct a detailed study of the most widely used container orchestration frameworks: Kubernetes, Docker Swarm, and Nomad. We then select Kubernetes based on a feature evaluation relevant to the considered scenario. We package the individual components of PrivX, including its database, into Docker containers and deploy them on a Kubernetes cluster. We also build a prototype system to demonstrate how microservices can be managed on a Kubernetes cluster, and create an auto-scaling tool to scale specific services based on predefined rules. Finally, we evaluate the service recovery time for each of the services in PrivX, in both the RPM deployment model and the prototype Kubernetes deployment model. We find no significant difference in service recovery time between the two models; however, Kubernetes ensured high availability of the services. We conclude that Kubernetes is the preferred mode for deploying PrivX, making it highly available and scalable.
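
    A rule-driven auto-scaling tool of the kind described can be built on the Deployment scale subresource exposed by the official Kubernetes Python client. This is a minimal sketch under assumed names (the service name, namespace, and load metric are placeholders, not PrivX's actual components):

```python
from kubernetes import client, config

def scale_service(name: str, namespace: str, replicas: int) -> None:
    """Set a Deployment's replica count through the scale subresource."""
    config.load_kube_config()
    client.AppsV1Api().patch_namespaced_deployment_scale(
        name, namespace, {"spec": {"replicas": replicas}}
    )

# Hypothetical rule: grow the service when an observed load metric is high.
current_load = 0.85  # stub; a real tool would query its monitoring backend
if current_load > 0.80:
    scale_service("privx-auth", namespace="privx", replicas=3)
```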