3,258 research outputs found
DYVERSE: DYnamic VERtical Scaling in Multi-tenant Edge Environments
Multi-tenancy in resource-constrained environments is a key challenge in Edge
computing. In this paper, we develop 'DYVERSE: DYnamic VERtical Scaling in
Edge' environments, which is the first light-weight and dynamic vertical
scaling mechanism for managing resources allocated to applications for
facilitating multi-tenancy in Edge environments. To enable dynamic vertical
scaling, one static and three dynamic priority management approaches that are
workload-aware, community-aware and system-aware, respectively, are proposed.
This research advocates that dynamic vertical scaling and priority management
approaches reduce Service Level Objective (SLO) violation rates. An online-game
and a face detection workload in a Cloud-Edge test-bed are used to validate the
research. The merit of DYVERSE is that there is only a sub-second overhead per
Edge server when 32 Edge servers are deployed on a single Edge node. When
compared to executing applications on the Edge servers without dynamic vertical
scaling, static priorities and dynamic priorities reduce SLO violation rates of
requests by up to 4% and 12% for the online game, respectively, and in both
cases 6% for the face detection workload. Moreover, for both workloads, the
system-aware dynamic vertical scaling method effectively reduces the latency of
non-violated requests, when compared to other methods.
Machine Learning Algorithms for Provisioning Cloud/Edge Applications
International Mention in the doctoral degree.
Reinforcement Learning (RL), in which an agent is trained to make the most
favourable decisions in the long run, is an established technique in artificial intelligence. Its
popularity has increased in the recent past, largely due to the development of deep neural
networks spawning deep reinforcement learning algorithms such as Deep Q-Learning. The
latter have been used to solve previously insurmountable problems, such as playing the
famed game of “Go”, which earlier algorithms could not do. Many such problems suffer the
curse of dimensionality, in which the sheer number of possible states is so overwhelming
that it is impractical to explore every possible option.
While these recent techniques have been successful, they may not be strictly necessary
or practical for some applications such as cloud provisioning. In these situations, the
action space is not as vast and workload data required to train such systems is not
as widely shared, as it is considered commercially sensitive by the Application Service
Provider (ASP). Given that provisioning decisions evolve over time in sympathy with
incident workloads, they fit into the sequential decision process problem that legacy RL
was designed to solve. However, because of the high correlation of time series data, states
are not independent of each other and the legacy Markov Decision Processes (MDPs)
have to be cleverly adapted to create robust provisioning algorithms.
As the first contribution of this thesis, we exploit the knowledge of both the application
and configuration to create an adaptive provisioning system leveraging stationary Markov
distributions. We then develop algorithms that, with neither application nor configuration
knowledge, solve the underlying Markov Decision Process (MDP) to create provisioning
systems. Our Q-Learning algorithms factor in the correlation between states and the
consequent transitions between them to create provisioning systems that not only
adapt to workloads but can also exploit similarities between them, thereby reducing
the retraining overhead. Our algorithms also exhibit convergence in fewer learning steps
given that we restructure the state and action spaces to avoid the curse of dimensionality
without the need for the function approximation approach taken by deep Q-Learning
systems.
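For orientation, a minimal sketch of the standard tabular Q-learning update applied to a toy provisioning problem is given here. It is not the thesis's algorithm: it does not model the correlation between states that the abstract describes. The action set, the reward function and the `(load, servers)` state encoding are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set: remove a server, hold, add a server.
ACTIONS = (-1, 0, 1)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard tabular Q-learning update in place; return the new value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection over the small discrete action set."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def toy_reward(servers, load):
    """Assumed reward: unmet demand is penalised heavily, idle servers lightly."""
    shortfall = max(0, load - servers)
    idle = max(0, servers - load)
    return -(10 * shortfall + idle)


def train(workload, max_servers=5, episodes=200, seed=0):
    """Learn a Q-table over (load, servers) states from a discretised workload trace."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        servers = 1
        for t in range(len(workload) - 1):
            state = (workload[t], servers)
            action = choose_action(q, state, epsilon=0.1, rng=rng)
            servers = min(max_servers, max(1, servers + action))
            reward = toy_reward(servers, workload[t + 1])
            next_state = (workload[t + 1], servers)
            q_update(q, state, action, reward, next_state)
    return q
```

Because the state and action spaces here are small and discrete, a plain lookup table suffices and no function approximation is needed, which is the general property the abstract appeals to.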
A crucial use-case of future networks will be the support of low-latency applications
involving highly mobile users. With these in mind, the European Telecommunications Standards Institute (ETSI) has proposed the Multi-access Edge Computing (MEC)
architecture, in which computing capabilities can be located close to the network edge,
where the data is generated. Provisioning for such applications therefore entails migrating
them to the most suitable location on the network edge as the users move. In this thesis,
we also tackle this type of provisioning by considering vehicle platooning or Cooperative
Adaptive Cruise Control (CACC) on the edge. We show that our Q-Learning algorithm
can be adapted to minimize the number of migrations required to effectively run such
an application on MEC hosts, which may also be subject to traffic from other competing
applications.
This work has been supported by IMDEA Networks Institute.
Doctoral Programme in Telematic Engineering, Universidad Carlos III de Madrid.
President: Antonio Fernández Anta. Secretary: Diego Perino. Member: Ilenia Tinnirell
A Deep Reinforcement Learning based Algorithm for Time and Cost Optimized Scaling of Serverless Applications
Serverless computing has gained strong traction in the cloud computing
community in recent years. Among the many benefits of this novel computing
model, the rapid auto-scaling capability of user applications takes prominence.
However, the offer of ad hoc scaling of user deployments at function level
introduces many complications to serverless systems. The added delay and
failures in function request executions caused by the time consumed for
dynamically creating new resources to suit function workloads, known as the
cold-start delay, is one such very prevalent shortcoming. Maintaining idle
resource pools to alleviate this issue often results in wasted resources from
the cloud provider perspective. Existing solutions to address this limitation
mostly focus on predicting and understanding function load levels in order to
proactively create required resources. Although these solutions improve
function performance, the lack of understanding of the overall system
characteristics in making these scaling decisions often leads to the
sub-optimal usage of system resources. Further, the multi-tenant nature of
serverless systems requires a scalable solution adaptable for multiple
co-existing applications, a limitation seen in most current solutions. In this
paper, we introduce a novel multi-agent Deep Reinforcement Learning based
intelligent solution for both horizontal and vertical scaling of function
resources, based on a comprehensive understanding of both function and system
requirements. Our solution improves function performance by reducing cold starts,
while also offering the flexibility for optimizing resource maintenance cost to
the service providers. Experiments conducted considering varying workload
scenarios show improvements of up to 23% and 34% in terms of application
latency and request failures, while also saving up to 45% in infrastructure
cost for the service providers.
Comment: 15 pages, 22 figures
CoScal: Multi-faceted Scaling of Microservices with Reinforcement Learning
The emerging trend of moving from monolithic applications to microservices has raised new performance challenges in cloud computing environments. Compared with traditional monolithic applications, microservices are lightweight, fine-grained, and must be executed in a shorter time. Efficient scaling approaches are required to ensure microservices’ system performance under diverse workloads with strict Quality of Service (QoS) requirements and to optimize resource provisioning. To solve this problem, we investigate the trade-offs between the dominant scaling techniques, including horizontal scaling, vertical scaling, and brownout, in terms of execution cost and response time. We first present a prediction algorithm based on gradient recurrent units that accurately predicts workloads to support efficient scaling. Further, we propose a multi-faceted scaling approach using reinforcement learning, called CoScal, to learn the scaling techniques efficiently. The proposed CoScal approach takes full advantage of data-driven decisions and improves system performance under high communication cost and delay. We validate our proposed solution by implementing a containerized microservice prototype system and evaluating it with two microservice applications. Extensive experiments demonstrate that CoScal reduces response time by 19%-29% and decreases the connection time of services by 16% when compared with state-of-the-art scaling techniques for the Sock Shop application. CoScal also improves the number of successful transactions by 6%-10% for Stan’s Robot Shop application.
Management and orchestration of virtual network functions via deep reinforcement learning
Management and orchestration (MANO) of resources by virtual network functions (VNFs) represents one of the key challenges towards a fully virtualized network architecture as envisaged by 5G standards. Current threshold-based policies inefficiently over-provision network resources and under-utilize available hardware, incurring high cost for network operators, and consequently, the users. In this work, we present a MANO algorithm for VNFs allowing a central unit (CU) to learn to autonomously re-configure resources (processing power and storage), deploy new VNF instances, or offload them to the cloud, depending on the network conditions, available pool of resources, and the VNF requirements, with the goal of minimizing a cost function that takes into account the economical cost as well as latency and the quality-of-service (QoS) experienced by the users. First, we formulate the stochastic resource optimization problem as a parameterized action Markov decision process (PAMDP). Then, we propose a solution based on deep reinforcement learning (DRL). More precisely, we present a novel RL approach, called parameterized action twin (PAT) deterministic policy gradient, which leverages an actor-critic architecture to learn to provision resources to the VNFs in an online manner. Finally, we present numerical performance results, and map them to 5G key performance indicators (KPIs). To the best of our knowledge, this is the first work that considers DRL for MANO of VNFs’ physical resources.
Microservices-based IoT Applications Scheduling in Edge and Fog Computing: A Taxonomy and Future Directions
Edge and Fog computing paradigms utilise distributed, heterogeneous and
resource-constrained devices at the edge of the network for efficient
deployment of latency-critical and bandwidth-hungry IoT application services.
Moreover, MicroService Architecture (MSA) is increasingly adopted to keep up
with the rapid development and deployment needs of the fast-evolving IoT
applications. Due to the fine-grained modularity of the microservices along
with their independently deployable and scalable nature, MSA exhibits great
potential in harnessing both Fog and Cloud resources to meet diverse QoS
requirements of the IoT application services, thus giving rise to novel
paradigms like Osmotic computing. However, efficient and scalable scheduling
algorithms are required to utilise the said characteristics of the MSA while
overcoming novel challenges introduced by the architecture. To this end, we
present a comprehensive taxonomy of recent literature on microservices-based
IoT applications scheduling in Edge and Fog computing environments.
Furthermore, we organise multiple taxonomies to capture the main aspects of the
scheduling problem, analyse and classify related works, identify research gaps
within each category, and discuss future research directions.
Comment: 35 pages, 10 figures, submitted to ACM Computing Surveys
A multi-criteria decision making approach for scaling and placement of virtual network functions
This paper investigates the joint scaling and placement problem of network services made up of virtual network functions (VNFs) that can be provided inside a cluster managing multiple points of presence (PoPs). Aiming at increasing the VNF service satisfaction rates and minimizing the deployment cost, we use both transport and cloud-aware VNF scaling as well as multi-attribute decision making (MADM) algorithms for VNF placement inside the cluster. The original joint scaling and placement problem is known to be NP-hard, and hence the problem is solved by separating the scaling and placement problems and solving them individually. The experiments are done using a dataset containing the information of a deployed digital-twin network service. These experiments show that scaling and placement algorithms that consider both transport and cloud parameters perform more efficiently than cloud-only or transport-only scaling followed by placement. One of the MADM algorithms, Total Order Preference by Similarity to the Ideal Solution (TOPSIS), has been shown to yield the lowest deployment cost and highest VNF request satisfaction rates compared to only transport or cloud scaling and other investigated MADM algorithms. Our simulation results indicate that considering both transport and cloud parameters in various availability scenarios of cloud and transport resources has significant potential to provide increased request satisfaction rates when VNF scaling and placement using the TOPSIS scheme is performed.
This work was partially funded by EC H2020 5GPPP 5Growth Project (Grant 856709), Spanish MINECO Grant TEC2017-88373-R (5G-REFINE), Generalitat de Catalunya Grant 2017 SGR 1195 and the National Program on Equipment and Scientific and Technical Infrastructure, EQC2018-005257-P under the European Regional Development Fund (FEDER).
We would also like to thank Milan Groshev and Carlos Guimarães for providing the dataset for scaling of the robot manipulator based digital twin service.
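As a rough illustration of how a TOPSIS ranking works in general, the sketch below scores candidate placements by their relative closeness to an ideal solution. It is a generic textbook formulation, not the paper's implementation; the criteria (cost, latency, free CPU), the weights and the sample values are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns).

    benefit[j] is True when larger values of criterion j are better.
    Returns one closeness score in [0, 1] per row; higher is better.
    """
    n_crit = len(weights)
    # Vector-normalise each column (guarding against all-zero columns), then weight it.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    # Ideal and anti-ideal points take the best and worst value of each criterion.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        scores.append(d_neg / total if total else 0.5)
    return scores


# Hypothetical PoPs described by (deployment cost, latency in ms, free CPU cores).
pops = [
    [1.0, 10.0, 8.0],
    [2.0, 20.0, 5.0],
    [3.0, 30.0, 2.0],
]
scores = topsis(pops, weights=[1 / 3, 1 / 3, 1 / 3], benefit=[False, False, True])
best_pop = max(range(len(pops)), key=lambda i: scores[i])
```

In this toy input the first PoP is best on every criterion, so it coincides with the ideal point and is ranked first.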
A survey and taxonomy of self-aware and self-adaptive cloud autoscaling systems
An autoscaling system can reconfigure cloud-based services and applications, through various configurations of cloud software and provisions of hardware resources, to adapt to the changing environment at runtime. Such behavior offers the foundation for achieving elasticity in the modern cloud computing paradigm. Given the dynamic and uncertain nature of the shared cloud infrastructure, the cloud autoscaling system has been engineered as one of the most complex, sophisticated and intelligent artifacts created by humans, aiming to achieve self-aware, self-adaptive and dependable runtime scaling. Yet, the existing Self-aware and Self-adaptive Cloud Autoscaling System (SSCAS) is not mature enough to be reliably exploited in the cloud. In this article, we survey the state-of-the-art research studies on SSCAS and provide a comprehensive taxonomy for this field. We present a detailed analysis of the results and provide insights on open challenges, as well as the promising directions that are worth investigating in future work in this area of research. Our survey and taxonomy contribute to the fundamentals of engineering more intelligent autoscaling systems in the cloud.