3,258 research outputs found

    Adaptive microservice scaling for elastic applications

    Get PDF

    DYVERSE: DYnamic VERtical Scaling in Multi-tenant Edge Environments

    Full text link
    Multi-tenancy in resource-constrained environments is a key challenge in Edge computing. In this paper, we develop 'DYVERSE: DYnamic VERtical Scaling in Edge' environments, which is the first light-weight and dynamic vertical scaling mechanism for managing resources allocated to applications for facilitating multi-tenancy in Edge environments. To enable dynamic vertical scaling, one static and three dynamic priority management approaches that are workload-aware, community-aware and system-aware, respectively are proposed. This research advocates that dynamic vertical scaling and priority management approaches reduce Service Level Objective (SLO) violation rates. An online-game and a face detection workload in a Cloud-Edge test-bed are used to validate the research. The merits of DYVERSE is that there is only a sub-second overhead per Edge server when 32 Edge servers are deployed on a single Edge node. When compared to executing applications on the Edge servers without dynamic vertical scaling, static priorities and dynamic priorities reduce SLO violation rates of requests by up to 4% and 12% for the online game, respectively, and in both cases 6% for the face detection workload. Moreover, for both workloads, the system-aware dynamic vertical scaling method effectively reduces the latency of non-violated requests, when compared to other methods

    Machine Learning Algorithms for Provisioning Cloud/Edge Applications

    Get PDF
    Mención Internacional en el título de doctorReinforcement Learning (RL), in which an agent is trained to make the most favourable decisions in the long run, is an established technique in artificial intelligence. Its popularity has increased in the recent past, largely due to the development of deep neural networks spawning deep reinforcement learning algorithms such as Deep Q-Learning. The latter have been used to solve previously insurmountable problems, such as playing the famed game of “Go” that previous algorithms could not. Many such problems suffer the curse of dimensionality, in which the sheer number of possible states is so overwhelming that it is impractical to explore every possible option. While these recent techniques have been successful, they may not be strictly necessary or practical for some applications such as cloud provisioning. In these situations, the action space is not as vast and workload data required to train such systems is not as widely shared, as it is considered commercialy sensitive by the Application Service Provider (ASP). Given that provisioning decisions evolve over time in sympathy to incident workloads, they fit into the sequential decision process problem that legacy RL was designed to solve. However because of the high correlation of time series data, states are not independent of each other and the legacy Markov Decision Processes (MDPs) have to be cleverly adapted to create robust provisioning algorithms. As the first contribution of this thesis, we exploit the knowledge of both the application and configuration to create an adaptive provisioning system leveraging stationary Markov distributions. We then develop algorithms that, with neither application nor configuration knowledge, solve the underlying Markov Decision Process (MDP) to create provisioning systems. Our Q-Learning algorithms factor in the correlation between states and the consequent transitions between them to create provisioning systems that do not only adapt to workloads, but can also exploit similarities between them, thereby reducing the retraining overhead. Our algorithms also exhibit convergence in fewer learning steps given that we restructure the state and action spaces to avoid the curse of dimensionality without the need for the function approximation approach taken by deep Q-Learning systems. A crucial use-case of future networks will be the support of low-latency applications involving highly mobile users. With these in mind, the European Telecommunications Standards Institute (ETSI) has proposed the Multi-access Edge Computing (MEC) architecture, in which computing capabilities can be located close to the network edge, where the data is generated. Provisioning for such applications therefore entails migrating them to the most suitable location on the network edge as the users move. In this thesis, we also tackle this type of provisioning by considering vehicle platooning or Cooperative Adaptive Cruise Control (CACC) on the edge. We show that our Q-Learning algorithm can be adapted to minimize the number of migrations required to effectively run such an application on MEC hosts, which may also be subject to traffic from other competing applications.This work has been supported by IMDEA Networks InstitutePrograma de Doctorado en Ingeniería Telemática por la Universidad Carlos III de MadridPresidente: Antonio Fernández Anta.- Secretario: Diego Perino.- Vocal: Ilenia Tinnirell

    A Deep Reinforcement Learning based Algorithm for Time and Cost Optimized Scaling of Serverless Applications

    Full text link
    Serverless computing has gained a strong traction in the cloud computing community in recent years. Among the many benefits of this novel computing model, the rapid auto-scaling capability of user applications takes prominence. However, the offer of adhoc scaling of user deployments at function level introduces many complications to serverless systems. The added delay and failures in function request executions caused by the time consumed for dynamically creating new resources to suit function workloads, known as the cold-start delay, is one such very prevalent shortcoming. Maintaining idle resource pools to alleviate this issue often results in wasted resources from the cloud provider perspective. Existing solutions to address this limitation mostly focus on predicting and understanding function load levels in order to proactively create required resources. Although these solutions improve function performance, the lack of understanding on the overall system characteristics in making these scaling decisions often leads to the sub-optimal usage of system resources. Further, the multi-tenant nature of serverless systems requires a scalable solution adaptable for multiple co-existing applications, a limitation seen in most current solutions. In this paper, we introduce a novel multi-agent Deep Reinforcement Learning based intelligent solution for both horizontal and vertical scaling of function resources, based on a comprehensive understanding on both function and system requirements. Our solution elevates function performance reducing cold starts, while also offering the flexibility for optimizing resource maintenance cost to the service providers. Experiments conducted considering varying workload scenarios show improvements of up to 23% and 34% in terms of application latency and request failures, while also saving up to 45% in infrastructure cost for the service providers.Comment: 15 pages, 22 figure

    CoScal: Multi-faceted Scaling of Microservices with Reinforcement Learning

    Get PDF
    The emerging trend towards moving from monolithic applications to microservices has raised new performance challenges in cloud computing environments. Compared with traditional monolithic applications, the microservices are lightweight, fine-grained, and must be executed in a shorter time. Efficient scaling approaches are required to ensure microservices’ system performance under diverse workloads with strict Quality of Service (QoS) requirements and optimize resource provisioning. To solve this problem, we investigate the trade-offs between the dominant scaling techniques, including horizontal scaling, vertical scaling, and brownout in terms of execution cost and response time. We first present a prediction algorithm based on gradient recurrent units to accurately predict workloads assisting in scaling to achieve efficient scaling. Further, we propose a multi-faceted scaling approach using reinforcement learning called CoScal to learn the scaling techniques efficiently. The proposed CoScal approach takes full advantage of data-driven decisions and improves the system performance in terms of high communication cost and delay. We validate our proposed solution by implementing a containerized microservice prototype system and evaluated with two microservice applications. The extensive experiments demonstrate that CoScal reduces response time by 19%-29% and decreases the connection time of services by 16% when compared with the state-of-the-art scaling techniques for Sock Shop application. CoScal can also improve the number of successful transactions with 6%-10% for Stan’s Robot Shop application

    Management and orchestration of virtual network functions via deep reinforcement learning

    Get PDF
    Management and orchestration (MANO) of re-sources by virtual network functions (VNFs) represents one of thekey challenges towards a fully virtualized network architectureas envisaged by 5G standards. Current threshold-based policiesinefficiently over-provision network resources and under-utilizeavailable hardware, incurring high cost for network operators,and consequently, the users. In this work, we present a MANOalgorithm for VNFs allowing a central unit (CU) to learnto autonomously re-configure resources (processing power andstorage), deploy new VNF instances, or offload them to the cloud,depending on the network conditions, available pool of resources,and the VNF requirements, with the goal of minimizing a costfunction that takes into account the economical cost as wellas latency and the quality-of-service (QoS) experienced by theusers. First, we formulate the stochastic resource optimizationproblem as a parameterized action Markov decision process(PAMDP). Then, we propose a solution based on deep reinforce-ment learning (DRL). More precisely, we present a novel RLapproach, called parameterized action twin (PAT) deterministicpolicy gradient, which leverages anactor-critic architecturetolearn to provision resources to the VNFs in an online manner.Finally, we present numerical performance results, and map themto 5G key performance indicators (KPIs). To the best of ourknowledge, this is the first work that considers DRL for MANOof VNFs’ physical resources

    Microservices-based IoT Applications Scheduling in Edge and Fog Computing: A Taxonomy and Future Directions

    Full text link
    Edge and Fog computing paradigms utilise distributed, heterogeneous and resource-constrained devices at the edge of the network for efficient deployment of latency-critical and bandwidth-hungry IoT application services. Moreover, MicroService Architecture (MSA) is increasingly adopted to keep up with the rapid development and deployment needs of the fast-evolving IoT applications. Due to the fine-grained modularity of the microservices along with their independently deployable and scalable nature, MSA exhibits great potential in harnessing both Fog and Cloud resources to meet diverse QoS requirements of the IoT application services, thus giving rise to novel paradigms like Osmotic computing. However, efficient and scalable scheduling algorithms are required to utilise the said characteristics of the MSA while overcoming novel challenges introduced by the architecture. To this end, we present a comprehensive taxonomy of recent literature on microservices-based IoT applications scheduling in Edge and Fog computing environments. Furthermore, we organise multiple taxonomies to capture the main aspects of the scheduling problem, analyse and classify related works, identify research gaps within each category, and discuss future research directions.Comment: 35 pages, 10 figures, submitted to ACM Computing Survey

    A multi-criteria decision making approach for scaling and placement of virtual network functions

    Get PDF
    This paper investigates the joint scaling and placement problem of network services made up of virtual network functions (VNFs) that can be provided inside a cluster managing multiple points of presence (PoPs). Aiming at increasing the VNF service satisfaction rates and minimizing the deployment cost, we use both transport and cloud-aware VNF scaling as well as multi-attribute decision making (MADM) algorithms for VNF placement inside the cluster. The original joint scaling and placement problem is known to be NP-hard and hence the problem is solved by separating scaling and placement problems and solving them individually. The experiments are done using a dataset containing the information of a deployed digital-twin network service. These experiments show that considering transport and cloud parameters during scaling and placement algorithms perform more efficiently than the only cloud based or transport based scaling followed by placement algorithms. One of the MADM algorithms, Total Order Preference by Similarity to the Ideal Solution (TOPSIS), has shown to yield the lowest deployment cost and highest VNF request satisfaction rates compared to only transport or cloud scaling and other investigated MADM algorithms. Our simulation results indicate that considering both transport and cloud parameters in various availability scenarios of cloud and transport resources has significant potential to provide increased request satisfaction rates when VNF scaling and placement using the TOPSIS scheme is performed.This work was partially funded by EC H2020 5GPPP 5Growth Project (Grant 856709), Spanish MINECO Grant TEC2017-88373-R (5G-REFINE), Generalitat de Catalunya Grant 2017 SGR 1195 and the National Program on Equipment and Scientifc and Technical Infrastructure, EQC2018-005257-P under the European Regional Development Fund (FEDER). We would also like to thank Milan Groshev, Carlos GuimarĂŁes for providing dataset for scaling of robot manipulator based digital twin service

    A survey and taxonomy of self-aware and self-adaptive cloud autoscaling systems

    Get PDF
    Autoscaling system can reconfigure cloud-based services and applications, through various configurations of cloud sofware and provisions of hardware resources, to adapt to the changing environment at runtime. Such a behavior offers the foundation for achieving elasticity in modern cloud computing paradigm. Given the dynamic and uncertain nature of the shared cloud infrastructure, cloud autoscaling system has been engineered as one of the most complex, sophisticated and intelligent artifacts created by human, aiming to achieve self-aware, self-adaptive and dependable runtime scaling. Yet, existing Self-aware and Self-adaptive Cloud Autoscaling System (SSCAS) is not mature to a state that it can be reliably exploited in the cloud. In this article, we survey the state-of-the-art research studies on SSCAS and provide a comprehensive taxonomy for this feld. We present detailed analysis of the results and provide insights on open challenges, as well as the promising directions that are worth investigated in the future work of this area of research. Our survey and taxonomy contribute to the fundamentals of engineering more intelligent autoscaling systems in the cloud
    • …
    corecore