Allocation of Virtual Machines in Cloud Data Centers - A Survey of Problem Models and Optimization Algorithms
Data centers in public, private, and hybrid cloud settings make it possible to provision virtual machines
(VMs) with unprecedented flexibility. However, purchasing, operating, and maintaining the underlying physical
resources incurs significant monetary costs and also environmental impact. Therefore, cloud providers must
optimize the usage of physical resources by a careful allocation of VMs to hosts, continuously balancing between
the conflicting requirements on performance and operational costs. In recent years, several algorithms have been
proposed for this important optimization problem. Unfortunately, the proposed approaches are hardly comparable
because of subtle differences in the underlying problem models. This paper surveys the problem formulations and
optimization algorithms in use, highlighting their strengths and limitations and pointing out the areas that need
further research.
Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning
This paper addresses the important need for advanced techniques in
continuously allocating workloads on shared infrastructures in data centers, a
problem arising due to the growing popularity and scale of cloud computing. It
particularly emphasizes the scarcity of research ensuring guaranteed capacity
in capacity reservations during large-scale failures. To tackle these issues,
the paper presents scalable solutions for resource management. It builds on the
prior establishment of capacity reservation in cluster management systems and
the two-level resource allocation problem addressed by the Resource Allowance
System (RAS). Recognizing the limitations of Mixed Integer Linear Programming
(MILP) for server assignment in a dynamic environment, this paper proposes the
use of Deep Reinforcement Learning (DRL), which has been successful in
achieving long-term optimal results for time-varying systems. Because directly
applying DRL algorithms to large-scale instances with millions of decision
variables is impractical, a novel two-level design that utilizes a DRL-based
algorithm is introduced to solve the optimal server-to-reservation assignment,
taking into account fault tolerance, server movement minimization, and network
affinity requirements. The paper explores the interconnection of
these levels and the benefits of such an approach for achieving long-term
optimal results in the context of large-scale cloud systems. We further show in
the experiment section that our two-level DRL approach outperforms the MIP
solver and heuristic approaches and exhibits significantly reduced computation
time compared to the MIP solver. Specifically, our two-level DRL approach
performs 15% better than the MIP solver at minimizing the overall cost. It also
takes only 26 seconds to execute 30 rounds of decision making, while the MIP
solver needs nearly an hour.
Adaptive Dispatching of Tasks in the Cloud
The increasingly wide application of Cloud Computing enables the
consolidation of tens of thousands of applications in shared infrastructures.
Thus, meeting the quality of service requirements of so many diverse
applications in such shared resource environments has become a real challenge,
especially since the characteristics and workload of applications differ widely
and may change over time. This paper presents an experimental system that can
exploit a variety of online, quality-of-service-aware adaptive task allocation
schemes, and three such schemes are designed and compared. The first is a
measurement-driven algorithm that uses reinforcement learning; the second is a
"sensible" allocation algorithm that assigns jobs to sub-systems that are
observed to provide a lower response time; and the third is an algorithm that
splits the job arrival stream into sub-streams at rates computed from the hosts'
processing capabilities. All of these schemes are compared via measurements
among themselves and with a simple round-robin scheduler, on two experimental
test-beds with homogeneous and heterogeneous hosts having different processing
capacities.
Comment: 10 pages, 9 figures
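The two non-learning schemes and the round-robin baseline named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary inputs, and in particular the rule that the "sensible" scheme picks a host with probability inversely proportional to its observed response time are assumptions made only for illustration.

```python
import random


def split_weights(capacities):
    """Fraction of the arrival stream sent to each host.

    Assumption: the split is directly proportional to each host's
    processing capacity (hypothetical units, e.g. jobs per second).
    """
    total = sum(capacities.values())
    return {host: cap / total for host, cap in capacities.items()}


def sensible_choice(avg_response_time, rng=random):
    """Pick one host, favouring hosts with a lower observed response time.

    Assumption: selection probability is proportional to
    1 / observed response time; the abstract does not give the exact rule.
    """
    hosts = list(avg_response_time)
    weights = [1.0 / avg_response_time[h] for h in hosts]
    return rng.choices(hosts, weights=weights, k=1)[0]


def round_robin(hosts):
    """Baseline: cycle through the hosts in a fixed order, forever."""
    while True:
        for host in hosts:
            yield host
```

For example, `split_weights({"a": 1, "b": 3})` routes a quarter of the jobs to host `a` and three quarters to host `b`, while `sensible_choice` would be called once per arriving job with the latest measured response times.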