A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning
Automatic decision-making approaches, such as reinforcement learning (RL),
have been applied to (partially) solve the resource allocation problem
adaptively in the cloud computing system. However, a complete cloud resource
allocation framework exhibits high dimensions in state and action spaces, which
limit the usefulness of traditional RL techniques. In addition, high power
consumption has become one of the critical concerns in design and control of
cloud computing systems, which degrades system reliability and increases
cooling cost. An effective dynamic power management (DPM) policy should
minimize power consumption while maintaining performance degradation within an
acceptable level. Thus, a joint virtual machine (VM) resource allocation and
power management framework is critical to the overall cloud computing system.
Moreover, a novel solution framework is necessary to address the even higher
dimensions in state and action spaces. In this paper, we propose a novel
hierarchical framework for solving the overall resource allocation and power
management problem in cloud computing systems. The proposed hierarchical
framework comprises a global tier for VM resource allocation to the servers and
a local tier for distributed power management of local servers. The emerging
deep reinforcement learning (DRL) technique, which can deal with complicated
control problems with large state space, is adopted to solve the global tier
problem. Furthermore, an autoencoder and a novel weight sharing structure are
adopted to handle the high-dimensional state space and accelerate the
convergence speed. On the other hand, the local tier of distributed server
power management comprises an LSTM-based workload predictor and a model-free
RL-based power manager, operating in a distributed manner. Comment: accepted by 37th IEEE International Conference on Distributed
Computing (ICDCS 2017
Resource Management in Grid Computing: A Review
A Network Computing System is a virtual computer formed by a networked set of heterogeneous machines that agree to share their local resources with each other. A grid is a very large-scale network computing system that scales to Internet-size environments with machines distributed across multiple organizations and administrative domains. The resource management system is the central component of a grid computing system. Resources in the grid are distributed, heterogeneous, autonomous and unpredictable. A resource management system matches requests to resources, schedules the matched resources, and executes the requests using the scheduled resources. Scheduling in the grid environment depends upon the characteristics of the tasks, machines and network connectivity. The paper provides a brief overview of resource management in grid computing, considering important factors such as the types of resource management, resource management models, and a comparison of various scheduling algorithms.
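The match-then-schedule cycle described in this abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the `Resource` and `Request` classes, their fields (`cpus`, `speed`, `work`), and the greedy earliest-finish rule are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cpus: int             # capacity used for matching (hypothetical field)
    speed: float          # work units per second (hypothetical field)
    free_at: float = 0.0  # time at which the resource becomes idle


@dataclass
class Request:
    name: str
    min_cpus: int  # minimum capacity the request needs (hypothetical field)
    work: float    # amount of work to execute (hypothetical field)


def schedule(requests, resources):
    """Match each request to eligible resources, then pick the earliest finish.

    Returns a list of (request name, resource name, finish time) tuples;
    unmatched requests get (request name, None, None).
    """
    plan = []
    for req in requests:
        # Matching step: keep only resources that satisfy the requirement.
        eligible = [r for r in resources if r.cpus >= req.min_cpus]
        if not eligible:
            plan.append((req.name, None, None))
            continue
        # Scheduling step: greedy choice of the earliest completion time.
        best = min(eligible, key=lambda r: r.free_at + req.work / r.speed)
        finish = best.free_at + req.work / best.speed
        best.free_at = finish  # the resource is busy until the request ends
        plan.append((req.name, best.name, finish))
    return plan
```

Calling `schedule` with a list of requests and a list of resources returns one tuple per request. A real grid scheduler would also have to account for network connectivity and for resources that are autonomous and unpredictable, which this sketch ignores.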
A Case for Cooperative and Incentive-Based Coupling of Distributed Clusters
Research interest in Grid computing has grown significantly over the past
five years. Management of distributed resources is one of the key issues in
Grid computing. Central to management of resources is the effectiveness of
resource allocation as it determines the overall utility of the system. The
current approaches to superscheduling in a grid environment are non-coordinated
since application level schedulers or brokers make scheduling decisions
independently of the others in the system. Clearly, this can exacerbate the
load sharing and utilization problems of distributed resources due to
suboptimal schedules that are likely to occur. To overcome these limitations,
we propose a mechanism for coordinated sharing of distributed clusters based on
computational economy. The resulting environment, called
\emph{Grid-Federation}, allows the transparent use of resources from the
federation when local resources are insufficient to meet its users'
requirements. The use of computational economy methodology in coordinating
resource allocation not only facilitates the QoS based scheduling, but also
enhances utility delivered by resources. Comment: 22 pages, extended version of the conference paper published at IEEE
Cluster'05, Boston, M
Simulation of Tasks Distribution in Horizontally Scalable Management System
This paper presents a simulation model of the task distribution system for the components of a geographically distributed automated management system with a dynamically changing topology. Each resource of the distributed automated management system is represented by an agent, which makes it possible to specify the behavior of every resource in the best possible way and to ensure their interaction. The agent workload was simulated by generating service queries in a system dynamics style using a flow diagram. Queries were generated in an abstractly represented center and then sent to the drive to be distributed to the management system resources according to a ranking table.
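Distribution "according to a ranking table" can be pictured with the short Python sketch below. The abstract does not say how agents are ranked, so ranking by accumulated load is an assumption made here, and the `dispatch` function and its arguments are hypothetical names.

```python
import heapq


def dispatch(queries, agents):
    """Send each query to the currently top-ranked (least-loaded) agent.

    queries: iterable of (query_id, cost) pairs.
    agents:  iterable of agent names.
    Returns a dict mapping query_id to the chosen agent name.
    """
    # The ranking table is a min-heap of (accumulated load, agent name);
    # ranking by load is an assumption, not taken from the paper.
    table = [(0.0, name) for name in agents]
    heapq.heapify(table)
    assignment = {}
    for query_id, cost in queries:
        load, name = heapq.heappop(table)            # best-ranked agent
        assignment[query_id] = name
        heapq.heappush(table, (load + cost, name))   # update its rank
    return assignment
```

With a dynamically changing topology, agents would also need to be added to or removed from the table between queries. The sketch leaves that out.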
An SMDP-based Resource Management Scheme for Distributed Cloud Systems
In this paper, the resource management problem in geographically distributed
cloud systems is considered. The Follow Me Cloud concept which enables service
migration across federated data centers (DCs) is adopted. Therefore, there are
two types of service requests to the DC, i.e., new requests (NRs) initiated in
the local service area and migration requests (MRs) generated when mobile users
move across service areas. A novel resource management scheme is proposed to
help the resource manager decide whether to accept the service requests (NRs or
MRs) or not and determine how many resources should be allocated to each
service (if accepted). The optimization objective is to maximize the average
system reward and keep the rejection probability of service requests under a
certain threshold. Numerical results indicate that the proposed scheme can
significantly improve the overall system utility as well as the user experience
compared with other resource management schemes. Comment: 5 pages, 5 figures, conferenc
Tools for distributed application management
Distributed application management consists of monitoring and controlling an application as it executes in a distributed environment. It encompasses such activities as configuration, initialization, performance monitoring, resource scheduling, and failure response. The Meta system (a collection of tools for constructing distributed application management software) is described. Meta provides the mechanism, while the programmer specifies the policy for application management. The policy is manifested as a control program which is a soft real-time reactive program. The underlying application is instrumented with a variety of built-in and user-defined sensors and actuators. These define the interface between the control program and the application. The control program also has access to a database describing the structure of the application and the characteristics of its environment. Some of the more difficult problems for application management occur when preexisting, nondistributed programs are integrated into a distributed application for which they may not have been intended. Meta allows management functions to be retrofitted to such programs with a minimum of effort
Planning and Resource Management in an Intelligent Automated Power Management System
Power system management is the process of guiding a power system towards the objective of a continuous supply of electrical power to a set of loads. Spacecraft power system management requires planning and scheduling, since electrical power is a scarce resource in space. The automation of power system management for future spacecraft has been recognized as an important R&D goal. Several automation technologies have emerged, including the use of expert systems for automating human problem-solving capabilities, such as rule-based expert systems for fault diagnosis and load scheduling. It is questionable whether current-generation expert system technology is applicable to power system management in space. The objective of ADEPTS (ADvanced Electrical Power management Techniques for Space systems) is to study new techniques for power management automation. These techniques involve integrating current expert system technology with that of parallel and distributed computing, as well as a distributed, object-oriented approach to software design. The focus of the current study is the integration of new procedures for automatically planning and scheduling loads with procedures for performing fault diagnosis and control. The objective is the concurrent execution of both sets of tasks on separate transputer processors, thus adding parallelism to the overall management process.