An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
DQN (Deep Q-Network) is a method for performing Q-learning in reinforcement
learning using deep neural networks. DQNs require a large buffer and batch
processing for experience replay and rely on backpropagation-based
iterative optimization, making them difficult to implement on
resource-limited edge devices. In this paper, we propose a lightweight
on-device reinforcement learning approach for low-cost FPGA devices. It
exploits a recently proposed neural-network-based on-device learning approach
that does not rely on backpropagation but instead uses an OS-ELM (Online
Sequential Extreme Learning Machine) based training algorithm. In addition, we
propose a combination of L2 regularization and spectral normalization for the
on-device reinforcement learning so that the output values of the neural network
fit within a bounded range and the reinforcement learning remains stable.
The proposed reinforcement learning approach is designed for the PYNQ-Z1 board as a
low-cost FPGA platform. The evaluation results using OpenAI Gym demonstrate
that the proposed algorithm and its FPGA implementation complete a CartPole-v0
task 29.77x and 89.40x faster, respectively, than a conventional DQN-based approach when the
number of hidden-layer nodes is 64.
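To make the sequential, backpropagation-free training concrete, the following is a minimal NumPy sketch of an OS-ELM-style update combined with L2-regularized initialization and spectral normalization of the input weights. The layer sizes, regularization constant, and placeholder data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, lam = 4, 64, 2, 1e-2      # illustrative sizes (e.g. CartPole-like)

W = rng.standard_normal((n_in, n_hidden))
W /= np.linalg.svd(W, compute_uv=False)[0]       # spectral normalization: divide by largest singular value
b = rng.standard_normal(n_hidden)

def hidden(x):
    return np.tanh(x @ W + b)                    # fixed random hidden layer

# L2-regularized initialization on an initial batch (placeholder data here)
X0 = rng.standard_normal((32, n_in))
T0 = rng.standard_normal((32, n_out))
H0 = hidden(X0)
P = np.linalg.inv(H0.T @ H0 + lam * np.eye(n_hidden))
beta = P @ H0.T @ T0

def os_elm_update(x, t):
    """Sequential update with a new mini-batch (x, t); no backpropagation."""
    global P, beta
    H = hidden(x)
    K = np.linalg.inv(np.eye(H.shape[0]) + H @ P @ H.T)
    P = P - P @ H.T @ K @ H @ P                  # recursive least-squares style update
    beta = beta + P @ H.T @ (t - H @ beta)       # only the output weights are trained

os_elm_update(rng.standard_normal((1, n_in)), rng.standard_normal((1, n_out)))
```

Because only the output weights are updated, each step reduces to a few small matrix products, which is what makes the method amenable to a low-cost FPGA.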
Sparse Bayesian Learning with Diagonal Quasi-Newton Method for Large Scale Classification
Sparse Bayesian Learning (SBL) constructs an extremely sparse probabilistic
model with very competitive generalization. However, SBL needs to invert a large
covariance matrix with complexity O(M^3) (M: feature size) to update the
regularization priors, making it difficult to use in practice. There are three
issues in SBL: 1) inverting the covariance matrix may yield singular solutions
in some cases, which prevents SBL from converging; 2) it scales poorly to
problems with a high-dimensional feature space or large data size; 3) it easily
suffers from memory overflow on large-scale data. This paper addresses these
issues with a newly proposed diagonal Quasi-Newton (DQN) method for SBL, called
DQN-SBL, in which the inversion of the large covariance matrix is avoided so that the
complexity and memory storage are reduced to O(M). DQN-SBL is thoroughly
evaluated on non-linear classifiers and linear feature selection using various
benchmark datasets of different sizes. Experimental results verify that DQN-SBL
achieves competitive generalization with a very sparse model and scales well to
large-scale problems.

Comment: 11 pages, 5 figures
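The O(M) scaling argument can be illustrated with a toy update in which the full M x M posterior covariance of a standard SBL/RVM model is replaced by its diagonal. This is an assumed stand-in for the paper's diagonal Quasi-Newton step, not the exact DQN-SBL algorithm; the update rule and names below are illustrative only.

```python
import numpy as np

def sbl_diag_update(Phi, t, alpha, noise_prec=1.0):
    """One evidence-style update using only a diagonal covariance approximation.

    Cost per call is O(N*M) and memory is O(M): no M x M matrix is formed or inverted.
    """
    col_sq = (Phi ** 2).sum(axis=0)                  # column energies, shape (M,)
    s_diag = 1.0 / (alpha + noise_prec * col_sq)     # approx. diagonal of posterior covariance
    mu = noise_prec * s_diag * (Phi.T @ t)           # approx. posterior mean
    gamma = 1.0 - alpha * s_diag                     # effective degrees of freedom per weight
    return mu, gamma / (mu ** 2 + 1e-12)             # MacKay-style prior precision update

# Toy usage: sparse linear regression with 3 informative features out of 50
rng = np.random.default_rng(0)
Phi = rng.standard_normal((200, 50))
t = Phi[:, :3] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(200)
alpha = np.ones(50)
for _ in range(50):
    mu, alpha = sbl_diag_update(Phi, t, alpha)
```

Large prior precisions (alpha) prune the corresponding weights, which is where the very sparse final model comes from.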
CopyCAT: Taking Control of Neural Policies with Constant Attacks
We propose a new perspective on adversarial attacks against deep
reinforcement learning agents. Our main contribution is CopyCAT, a targeted
attack able to consistently lure an agent into following an outsider's policy.
It is pre-computed and therefore fast to apply at inference time, which makes
it usable in real-time scenarios. We show its effectiveness on Atari 2600 games in the novel
read-only setting. In this setting, the adversary cannot directly modify the
agent's state -- its representation of the environment -- but can only attack
the agent's observation -- its perception of the environment. Directly
modifying the agent's state would require write access to the agent's inner
workings, and we argue that this assumption is too strong in realistic settings.

Comment: AAMAS 202
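To illustrate what "pre-computed" and "read-only" mean operationally, the sketch below applies a constant, pre-computed perturbation (one per target action) to the agent's observation at inference time, with no online optimization. The mask container, the outsider_policy callable, and the epsilon budget are hypothetical placeholders, not the authors' code or exact attack.

```python
import numpy as np

EPS = 8.0 / 255.0   # assumed L_inf budget for the constant perturbation

def apply_copycat(obs, masks, outsider_policy):
    """Read-only attack step: perturb the observation, never the agent's internal state.

    obs             -- the agent's current observation (e.g. a normalized Atari frame)
    masks           -- dict mapping a target action to a constant additive mask,
                       pre-computed offline, so nothing is optimized here
    outsider_policy -- callable returning the action the attacker wants the agent to take
    """
    target_action = outsider_policy(obs)
    delta = np.clip(masks[target_action], -EPS, EPS)
    return np.clip(obs + delta, 0.0, 1.0)            # keep pixels in the valid range
```

Because the masks are fixed, the per-step cost of the attack is a single array addition, which is what makes it viable in real time.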