Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency over all Internet-of-Things (IoT) users by jointly optimizing offloading decisions, transmission power, and resource allocation in a large-scale mobile-edge computing (MEC) system. Toward this end, a deep reinforcement learning (DRL)-based solution is proposed, which includes the following components. First, a related and regularized stacked autoencoder (2r-SAE) trained with unsupervised learning performs data compression and representation for high-dimensional channel quality information (CQI) data, which reduces the state space for DRL. Second, an adaptive simulated annealing approach (ASA) is presented as the action-search method of DRL, in which an adaptive h-mutation guides the search direction and an adaptive iteration scheme enhances search efficiency during the DRL process. Third, a preserved and prioritized experience replay (2p-ER) is introduced to help the DRL agent train the policy network and find the optimal offloading policy. Numerical results demonstrate that the proposed algorithm achieves near-optimal performance while significantly reducing computational time compared with existing benchmarks.
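As a rough illustration of the prioritized-replay building block this abstract refers to, the sketch below shows a minimal proportional-priority buffer. It is not the paper's 2p-ER (which additionally preserves selected experiences); class and method names are illustrative assumptions.

```python
import random


class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay: transitions with higher
    priority are sampled more often. Illustrative only; the 2p-ER variant
    described in the abstract adds a preservation mechanism on top."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []       # stored transitions
        self.priorities = []   # one priority per transition

    def push(self, transition, priority):
        # Evict the oldest transition once the buffer is full.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sample (with replacement) proportionally to priority.
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        k = min(batch_size, len(self.buffer))
        return random.choices(self.buffer, weights=weights, k=k)
```

In a DRL training loop, `priority` would typically be set from the transition's temporal-difference error, so the policy network trains more often on surprising experiences.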
Generic Online Learning for Partial Visible & Dynamic Environment with Delayed Feedback
Reinforcement learning (RL) has been applied to robotics and many other domains in which a system must learn in real time while interacting with a dynamic environment. In most studies, the state-action space, the key part of RL, is predefined. The integration of RL with deep learning has, however, taken a tremendous leap forward in solving novel, challenging problems, such as mastering the board game of Go. The environment surrounding the agent may not be fully visible, the environment can change over time, and the feedback the agent receives for its actions can arrive with a fluctuating delay. In this paper, we propose a Generic Online Learning (GOL) system for such environments. GOL is based on RL with a hierarchical structure that forms abstract features over time and adapts toward optimal solutions. The proposed method has been applied to load balancing in 5G cloud random access networks. Simulation results show that GOL successfully achieves the system objectives of reducing cache misses and communication load, while incurring only limited system overhead in terms of the number of high-level patterns needed. We believe the proposed GOL architecture is significant for future online learning in dynamic, partially visible environments and would be very useful for many autonomous control systems.
A Survey of Adaptive Resonance Theory Neural Network Models for Engineering Applications
This survey samples from the ever-growing family of adaptive resonance theory
(ART) neural network models used to perform the three primary machine learning
modalities, namely, unsupervised, supervised and reinforcement learning. It
comprises a representative list from classic to modern ART models, thereby
painting a general picture of the architectures developed by researchers over
the past 30 years. The learning dynamics of these ART models are briefly
described, and their distinctive characteristics such as code representation,
long-term memory and corresponding geometric interpretation are discussed.
Useful engineering properties of ART (speed, configurability, explainability,
parallelization and hardware implementation) are examined along with current
challenges. Finally, a compilation of online software libraries is provided. It
is expected that this overview will be helpful to new and seasoned ART
researchers.
Integrating Symbolic and Neural Processing in a Self-Organizing Architecture for Pattern Recognition and Prediction
British Petroleum (89A-1204); Defense Advanced Research Projects Agency (N00014-92-J-4015); National Science Foundation (IRI-90-00530); Office of Naval Research (N00014-91-J-4100); Air Force Office of Scientific Research (F49620-92-J-0225)
Artificial cognitive architecture with self-learning and self-optimization capabilities. Case studies in micromachining processes
Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defense: 22-09-201
Deep reinforcement learning approach for MPPT control of partially shaded PV systems in Smart Grids
Photovoltaic (PV) systems are of increasing importance in modern smart grids. Usually, a maximum power point tracking (MPPT) algorithm is used to maximize the energy output of the PV arrays. However, once deployed, weather conditions such as clouds can cast shade on the PV arrays, affecting the dynamics of each panel differently. These conditions directly reduce the available energy output of the arrays and in turn make the MPPT task extremely difficult. For these reasons, under partial shading conditions, algorithms are needed that can learn and adapt online to the changing state of the system. In this work we propose the use of deep reinforcement learning (DRL) techniques to address the MPPT problem of a PV array under partial shading conditions. We develop a model-free RL algorithm to maximize the efficiency of MPPT control. The agent's policy is parameterized by neural networks, which take the sensory information as input and directly output the control signal. Furthermore, a PV environment under shading conditions was developed on the open-source OpenAI Gym platform and is made available in an open repository. Several tests are performed, using the developed simulated environment, to assess the robustness of the proposed control strategies under different climate conditions. The obtained results show the feasibility of our proposal, with fast responses and stable behavior. In the best results for the presented methodology, the maximum operating power point achieved deviates by less than 1% from the theoretical maximum power point.
Fil: Avila, Luis Omar. Universidad Nacional de San Luis. Facultad de Ciencias Físico Matemáticas y Naturales. Departamento de Informática. Laboratorio Investigación y Desarrollo en Inteligencia Computacional; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.
Fil: de Paula, Mariano. Universidad Nacional del Centro de la Provincia de Buenos Aires. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. - Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas; Argentina.
Fil: Trimboli, Maximiliano Daniel. Universidad Nacional de San Luis. Facultad de Ingeniería y Ciencias Agropecuarias. Laboratorio de Control Automático; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.
Fil: Carlucho, Ignacio. State University of Louisiana; United States. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.
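To make the MPPT setting described in this abstract concrete, the toy sketch below pairs a hypothetical single-peak PV model with the classical perturb-and-observe tracker, which is the baseline that DRL approaches aim to outperform under partial shading. This is not the paper's released Gym environment; the class, the diode-style power curve, and all parameter values are illustrative assumptions.

```python
import math


class ToyPVEnv:
    """Hypothetical single-panel PV model with one power peak.
    Real partial-shading scenarios produce multiple local peaks,
    which is what makes plain perturb-and-observe insufficient."""

    def __init__(self, v_oc=40.0, i_sc=8.0):
        self.v_oc = v_oc  # open-circuit voltage (power is zero here)
        self.i_sc = i_sc  # short-circuit current (voltage is zero here)
        self.v = v_oc / 2  # start at half the open-circuit voltage

    def power(self, v):
        # Crude exponential diode model: current collapses near v_oc.
        i = self.i_sc * (1.0 - math.exp((v - self.v_oc) / 3.0))
        return max(v * i, 0.0)

    def step(self, dv):
        # Apply a voltage perturbation and observe the resulting power.
        self.v = min(max(self.v + dv, 0.0), self.v_oc)
        return self.v, self.power(self.v)


def perturb_and_observe(env, dv=0.5, steps=200):
    """Classical hill-climbing MPPT: keep perturbing in the same
    direction while power rises, reverse direction when it falls."""
    _, p_prev = env.step(0.0)
    for _ in range(steps):
        _, p = env.step(dv)
        if p < p_prev:
            dv = -dv  # power dropped: reverse the perturbation
        p_prev = p
    return p_prev
```

On this single-peak curve the tracker settles close to the true maximum; a DRL agent as described in the abstract would instead learn a policy mapping sensor readings directly to the control signal, so it can escape the local peaks that appear under shading.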