An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
DQN (Deep Q-Network) is a method for performing Q-learning in reinforcement
learning using deep neural networks. DQNs require a large buffer and batch
processing for experience replay, and they rely on backpropagation-based
iterative optimization, which makes them difficult to implement on
resource-limited edge devices. In this paper, we propose a lightweight
on-device reinforcement learning approach for low-cost FPGA devices. It
exploits a recently proposed neural-network-based on-device learning approach
that does not rely on backpropagation but instead uses an OS-ELM (Online
Sequential Extreme Learning Machine) based training algorithm. In addition, we
propose a combination of L2 regularization and spectral normalization for
on-device reinforcement learning so that the output values of the neural network
fit into a certain range and the reinforcement learning remains stable.
The proposed reinforcement learning approach is designed for the PYNQ-Z1 board as a
low-cost FPGA platform. Evaluation results using OpenAI Gym demonstrate
that the proposed algorithm and its FPGA implementation complete a CartPole-v0
task 29.77x and 89.40x faster, respectively, than a conventional DQN-based
approach when the number of hidden-layer nodes is 64.
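The core of the approach above is replacing backpropagation with an OS-ELM update, which is a recursive least-squares rule over a fixed random hidden layer. The following is a minimal sketch of that update in numpy; the network sizes, data, and ridge constant are illustrative stand-ins, not the paper's implementation (the spectral-normalization component is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 16, 2

# ELM convention: input weights are random and fixed; only beta is trained.
W = rng.normal(size=(n_in, n_hidden))
b = rng.normal(size=n_hidden)

def hidden(X):
    return np.tanh(X @ W + b)  # hidden-layer activations H

# Initial batch solves a ridge-regularized least squares (the small
# diagonal term plays the role of the L2 regularization mentioned above).
X0, T0 = rng.normal(size=(32, n_in)), rng.normal(size=(32, n_out))
H0 = hidden(X0)
P = np.linalg.inv(H0.T @ H0 + 1e-3 * np.eye(n_hidden))
beta = P @ H0.T @ T0

def oselm_update(P, beta, X, T):
    """One OS-ELM step: recursive least squares, no backpropagation."""
    H = hidden(X)
    K = np.linalg.inv(np.eye(len(X)) + H @ P @ H.T)
    P = P - P @ H.T @ K @ H @ P
    beta = beta + P @ H.T @ (T - H @ beta)
    return P, beta

# Stream small chunks of (state, target) data and update sequentially,
# as an on-device agent would, without storing a large replay buffer.
Xs, Ts = [X0], [T0]
for _ in range(10):
    X, T = rng.normal(size=(8, n_in)), rng.normal(size=(8, n_out))
    P, beta = oselm_update(P, beta, X, T)
    Xs.append(X); Ts.append(T)
```

The sequential updates are algebraically equivalent to solving the regularized least-squares problem over all accumulated data at once, which is what makes the method attractive for small FPGAs: each step is a few small matrix products rather than an iterative gradient loop.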
Virtual to Real Reinforcement Learning for Autonomous Driving
Reinforcement learning is considered a promising direction for driving-policy
learning. However, training an autonomous driving vehicle with
reinforcement learning in a real environment involves unaffordable
trial-and-error. It is more desirable to train first in a virtual environment
and then transfer to the real environment. In this paper, we propose a novel
realistic translation network that makes a model trained in a virtual
environment workable in the real world. The proposed network can convert
non-realistic virtual image input into a realistic one with a similar scene
structure. Given realistic frames as input, a driving policy trained by
reinforcement learning can adapt well to real-world driving. Experiments show
that our proposed virtual-to-real (VR) reinforcement learning (RL) approach
works well. To our knowledge, this is the first successful case of a driving
policy trained by reinforcement learning that can adapt to real-world driving data.
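Structurally, the idea above amounts to inserting a frame-translation stage between the simulator and the policy, so the policy only ever sees "realistic" observations. A minimal sketch of that wiring, with a toy environment and a placeholder translation function standing in for the paper's realistic translation network (which is not reproduced here):

```python
import numpy as np

class TranslationWrapper:
    """Gym-style wrapper that passes every observation through a
    translation function before the policy sees it."""
    def __init__(self, env, translate):
        self.env = env
        self.translate = translate

    def reset(self):
        return self.translate(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self.translate(obs), reward, done, info

# Toy stand-ins so the sketch runs without a simulator.
class ToyEnv:
    def reset(self):
        return np.zeros((4, 4))
    def step(self, action):
        return np.ones((4, 4)), 1.0, False, {}

# Placeholder "virtual-to-real" mapping; the real system would apply a
# learned image-to-image translation network here.
brighten = lambda frame: frame + 0.5
env = TranslationWrapper(ToyEnv(), brighten)
obs = env.reset()
```

Because the wrapper preserves the environment interface, the same training loop works unchanged whether the translation stage is an identity, a learned network, or the real camera feed.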
Classifying Options for Deep Reinforcement Learning
In this paper we combine one method for hierarchical reinforcement learning -
the options framework - with deep Q-networks (DQNs) through the use of
different "option heads" on the policy network, and a supervisory network for
choosing between the different options. We utilise our setup to investigate the
effects of architectural constraints in subtasks with positive and negative
transfer, across a range of network capacities. We empirically show that our
augmented DQN has lower sample complexity when simultaneously learning subtasks
with negative transfer, without degrading performance when learning subtasks
with positive transfer.
Comment: IJCAI 2016 Workshop on Deep Reinforcement Learning: Frontiers and Challenges
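The architecture described above can be pictured as a shared trunk feeding several per-option Q-value heads plus a supervisory head that selects among them. A minimal forward-pass sketch, with randomly initialized weights purely for illustration (the sizes and naming are assumptions, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_state, n_feat, n_actions, n_options = 8, 32, 4, 3

W_shared = rng.normal(size=(n_state, n_feat)) * 0.1          # shared trunk
W_heads = rng.normal(size=(n_options, n_feat, n_actions)) * 0.1  # option heads
W_sup = rng.normal(size=(n_feat, n_options)) * 0.1           # supervisory head

def act(state):
    """Pick an option via the supervisory head, then a greedy action
    from that option's Q-head."""
    feat = np.maximum(state @ W_shared, 0.0)  # shared features (ReLU)
    option = int(np.argmax(feat @ W_sup))     # supervisory network chooses a head
    q = feat @ W_heads[option]                # Q-values from the chosen head
    return option, int(np.argmax(q))

option, action = act(rng.normal(size=n_state))
```

The point of the shared trunk is exactly the transfer question the paper studies: subtasks with positive transfer benefit from shared features, while the separate heads insulate subtasks with negative transfer from each other.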
The Dreaming Variational Autoencoder for Reinforcement Learning Environments
Reinforcement learning has shown great potential in generalizing over raw
sensory data using only a single neural network for value optimization. There
are several challenges in the current state-of-the-art reinforcement learning
algorithms that prevent them from converging towards the global optima. It is
likely that the solution to these problems lies in short- and long-term
planning, exploration and memory management for reinforcement learning
algorithms. Games are often used to benchmark reinforcement learning algorithms,
as they provide a flexible, reproducible, and easy-to-control environment.
Nevertheless, few games feature a state-space in which results in exploration,
memory, and planning are easily perceived. This paper presents The Dreaming
Variational Autoencoder (DVAE), a neural-network-based generative modeling
architecture for exploration in environments with sparse feedback. We further
present Deep Maze, a novel and flexible maze engine that challenges DVAE in
partial and fully-observable state-spaces, long-horizon tasks, and
deterministic and stochastic problems. We show initial findings and encourage
further work in reinforcement learning driven by generative exploration.
Comment: Best Student Paper Award, Proceedings of the 38th SGAI International
Conference on Artificial Intelligence, Cambridge, UK, 2018, Artificial
Intelligence XXXV, 201
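At the heart of any VAE-style generative model such as DVAE are the reparameterization trick and a closed-form KL penalty against a standard normal prior. The sketch below shows only these two generic building blocks; the DVAE's actual architecture for predicting environment states is not reproduced, and the shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps so gradients can flow through mu and
    log_var (the reparameterization trick)."""
    eps = rng.normal(size=np.shape(mu))
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over dims."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var))

# Encoder outputs for a 3-dimensional latent (placeholder values).
mu, log_var = np.zeros(3), np.zeros(3)
z = reparameterize(mu, log_var)
```

In a model like DVAE, the decoder would then map `z` (together with an action) to a predicted next observation, so the agent can "dream" rollouts in environments with sparse feedback.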
Network Formation with Adaptive Agents
In this paper, a reinforcement learning version of the connections game first analysed by Jackson and Wolinsky is presented and compared with benchmark results for fully informed and rational players. Using an agent-based simulation approach, the main finding is that the pattern of the reinforcement learning process is similar but does not fully converge to the benchmark results. Before these optimal results can be discovered in the learning process, agents often get locked into a state of random switching or early lock-in.
Keywords: agent-based computational economics; strategic network formation; network games; reinforcement learning
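The kind of reinforcement learning typically used in such agent-based network-formation studies is a simple propensity-based (Roth-Erev-style) rule: each action accumulates propensity proportional to realized payoff, choice is probabilistic in the propensities, and a forgetting term allows both lock-in and escape from it. The sketch below is a toy single-decision version with made-up payoffs; it is not the Jackson-Wolinsky connections model itself.

```python
import random

random.seed(0)

actions = ["link", "no_link"]
propensity = {a: 1.0 for a in actions}  # equal initial propensities

def choose():
    """Pick an action with probability proportional to its propensity."""
    total = sum(propensity.values())
    r, acc = random.uniform(0, total), 0.0
    for a in actions:
        acc += propensity[a]
        if r <= acc:
            return a
    return actions[-1]

def reinforce(action, payoff, decay=0.05):
    """Roth-Erev update: decay all propensities (forgetting), then add
    the realized payoff to the chosen action's propensity."""
    for a in actions:
        propensity[a] *= (1 - decay)
    propensity[action] += payoff

# Toy payoffs: linking pays more, but early random draws can still lock
# the agent into the inferior action, mirroring the lock-in noted above.
for _ in range(200):
    a = choose()
    reinforce(a, 1.0 if a == "link" else 0.2)
```

The decay parameter governs how quickly early experience is forgotten; with little decay, an unlucky start can keep the agent locked in far from the benchmark outcome, which is the qualitative behaviour the paper reports.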