Optimizing simultaneous autoscaling for serverless cloud computing
This paper explores resource allocation in serverless cloud computing
platforms and proposes an optimization approach for autoscaling systems.
Serverless computing relieves users of resource management tasks, enabling
focus on application functions. However, dynamic resource allocation and
function replication based on changing loads remain crucial. Typically,
autoscalers in these platforms utilize threshold-based mechanisms to adjust
function replicas independently. We model applications as interconnected graphs
of functions, where requests probabilistically traverse the graph, triggering
associated function execution. Our objective is to develop a control policy
that optimally allocates resources on servers, minimizing failed requests and
response time in reaction to load changes. Using a fluid approximation model
and Separated Continuous Linear Programming (SCLP), we derive an optimal
control policy that determines the number of resources per replica and the
required number of replicas over time. We evaluate our approach using a
simulation framework built with Python and simpy. Compared with threshold-based
autoscaling, our approach achieves significant improvements in average response
times and failed requests, ranging from 15% to over 300% in
most cases. We also explore the impact of system and workload parameters on
performance, providing insights into the behavior of our optimization approach
under different conditions. Overall, our study contributes to advancing
resource allocation strategies, enhancing efficiency and reliability in
serverless cloud computing platforms.
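The threshold-based baseline the paper compares against can be sketched in a few lines. This is a minimal illustration only: the utilization thresholds, hysteresis band, and replica bounds below are assumptions for the example, not values from the paper.

```python
# Hypothetical sketch of a threshold-based autoscaler for one serverless
# function, as in the baseline the paper compares against. All thresholds
# and bounds are illustrative assumptions.

def scale_replicas(replicas, cpu_utilization, upper=0.8, lower=0.3,
                   min_replicas=1, max_replicas=10):
    """Adjust a function's replica count from its average CPU utilization."""
    if cpu_utilization > upper and replicas < max_replicas:
        return replicas + 1          # scale out under high load
    if cpu_utilization < lower and replicas > min_replicas:
        return replicas - 1          # scale in when mostly idle
    return replicas                  # stay inside the hysteresis band

print(scale_replicas(3, 0.9))  # 4
print(scale_replicas(3, 0.1))  # 2
print(scale_replicas(3, 0.5))  # 3
```

Because each function scales independently on its own threshold, such a policy cannot coordinate resources across the application's function graph, which is the gap the paper's SCLP-based control policy targets.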
A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning
Automatic decision-making approaches, such as reinforcement learning (RL),
have been applied to (partially) solve the resource allocation problem
adaptively in the cloud computing system. However, a complete cloud resource
allocation framework exhibits high dimensions in state and action spaces, which
limit the applicability of traditional RL techniques. In addition, high power
consumption has become one of the critical concerns in design and control of
cloud computing systems, which degrades system reliability and increases
cooling cost. An effective dynamic power management (DPM) policy should
minimize power consumption while maintaining performance degradation within an
acceptable level. Thus, a joint virtual machine (VM) resource allocation and
power management framework is critical to the overall cloud computing system.
Moreover, a novel solution framework is necessary to address the even higher
dimensions in state and action spaces. In this paper, we propose a novel
hierarchical framework for solving the overall resource allocation and power
management problem in cloud computing systems. The proposed hierarchical
framework comprises a global tier for VM resource allocation to the servers and
a local tier for distributed power management of local servers. The emerging
deep reinforcement learning (DRL) technique, which can deal with complicated
control problems with large state space, is adopted to solve the global tier
problem. Furthermore, an autoencoder and a novel weight sharing structure are
adopted to handle the high-dimensional state space and accelerate the
convergence speed. On the other hand, the local tier of distributed server
power management comprises an LSTM-based workload predictor and a model-free
RL-based power manager, operating in a distributed manner.

Comment: accepted by the 37th IEEE International Conference on Distributed
Computing Systems (ICDCS 2017).
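The local tier's model-free RL power manager can be illustrated with a tabular sketch. This is a simplified assumption-laden example, not the authors' formulation: the discretized workload states, the two power modes, and the reward values are all invented for illustration.

```python
import random

# Illustrative tabular RL sketch of a per-server power manager, loosely in
# the spirit of the paper's local tier. States, actions, and rewards are
# simplified assumptions, not the authors' design.

STATES = ["idle", "low", "high"]     # discretized (predicted) workload level
ACTIONS = ["sleep", "active"]        # server power modes

def reward(state, action):
    """Penalize power use when idle and performance loss when loaded."""
    if action == "sleep":
        return 1.0 if state == "idle" else -2.0  # sleeping under load hurts latency
    return -0.5 if state == "idle" else 1.0      # staying active wastes idle power

def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(STATES)
        # epsilon-greedy action selection
        if rng.random() < epsilon:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: q[(s, x)])
        q[(s, a)] += alpha * (reward(s, a) - q[(s, a)])  # one-step update
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
print(policy)  # learned: sleep when idle, stay active under load
```

In the paper's setting, the workload state would come from the LSTM predictor and DRL replaces the table to cope with the high-dimensional state space.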
Cloud computing resource scheduling and a survey of its evolutionary approaches
A disruptive technology fundamentally transforming the way computing services are delivered, cloud computing offers information and communication technology users a new level of convenience by provisioning resources as services via the Internet. Because the cloud provides a finite pool of virtualized, on-demand resources, scheduling them optimally has become an essential and rewarding topic, where a trend of using Evolutionary Computation (EC) algorithms is emerging rapidly. Through analyzing the cloud computing architecture, this survey first presents a taxonomy at two levels of scheduling cloud resources. It then paints a landscape of the scheduling problem and its solutions. According to the taxonomy, a comprehensive survey of state-of-the-art approaches is presented systematically. Looking forward, challenges and potential future research directions are identified and discussed, including real-time scheduling, adaptive dynamic scheduling, large-scale scheduling, multiobjective scheduling, and distributed and parallel scheduling. At the dawn of Industry 4.0, cloud computing scheduling for cyber-physical integration in the presence of big data is also discussed. Research in this area is only in its infancy, but with the rapid fusion of information and data technology, more exciting and agenda-setting topics are likely to emerge on the horizon.
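The flavor of EC-based scheduling the survey covers can be sketched with a minimal genetic algorithm that assigns tasks to VMs to minimize makespan. The task lengths, VM speeds, and GA parameters below are illustrative assumptions, not taken from any surveyed work.

```python
import random

# Minimal genetic-algorithm sketch of evolutionary cloud scheduling:
# assign tasks to VMs to minimize makespan. All inputs and GA
# parameters are illustrative assumptions.

TASKS = [4, 8, 3, 7, 5, 2]      # task lengths (arbitrary units)
SPEEDS = [1.0, 2.0]             # VM processing speeds

def makespan(assign):
    """Completion time of the most loaded VM under an assignment."""
    loads = [0.0] * len(SPEEDS)
    for task, vm in zip(TASKS, assign):
        loads[vm] += task / SPEEDS[vm]
    return max(loads)

def evolve(pop_size=30, generations=100, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randrange(len(SPEEDS)) for _ in TASKS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=makespan)                 # fitness = shorter makespan
        survivors = pop[: pop_size // 2]       # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(TASKS))
            child = a[:cut] + b[cut:]          # one-point crossover
            if rng.random() < 0.2:             # mutation: reassign one task
                child[rng.randrange(len(TASKS))] = rng.randrange(len(SPEEDS))
            children.append(child)
        pop = survivors + children
    return min(pop, key=makespan)

best = evolve()
print(makespan(best))
```

Real schedulers in the survey's scope handle far larger instances, multiple objectives (cost, energy, deadlines), and dynamic arrivals, which is precisely why EC metaheuristics are attractive there.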