An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
DQN (Deep Q-Network) is a method to perform Q-learning for reinforcement
learning using deep neural networks. DQNs require a large buffer and batch
processing for an experience replay and rely on a backpropagation based
iterative optimization, making them difficult to implement on
resource-limited edge devices. In this paper, we propose a lightweight
on-device reinforcement learning approach for low-cost FPGA devices. It
exploits a recently proposed neural-network based on-device learning approach
that does not rely on backpropagation but instead uses an OS-ELM (Online
Sequential Extreme Learning Machine) based training algorithm. In addition, we
propose a combination of L2 regularization and spectral normalization for the
on-device reinforcement learning so that the output values of the neural network
fit within a certain range and the reinforcement learning becomes stable.
The proposed reinforcement learning approach is designed for the PYNQ-Z1 board as a
low-cost FPGA platform. The evaluation results using OpenAI Gym demonstrate
that the proposed algorithm and its FPGA implementation complete a CartPole-v0
task 29.77x and 89.40x faster, respectively, than a conventional DQN-based
approach when the number of hidden-layer nodes is 64.
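As a rough illustration of backpropagation-free sequential training, here is a minimal sketch of an OS-ELM-style update with L2 regularization. It assumes a batch size of 1, so the usual matrix inversion reduces to a rank-1 update. The class name `OSELM`, the `tanh` activation, the toy regression target, and all hyperparameters are assumptions made for this example and are not taken from the paper. Spectral normalization and the Q-learning loop are omitted.

```python
import math
import random


class OSELM:
    """Single-hidden-layer network whose output weights are fit one sample at a time."""

    def __init__(self, n_in, n_hidden, n_out, l2=0.1, seed=0):
        rng = random.Random(seed)
        # Input weights and biases are random and never trained.
        self.W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # Output weights beta (n_hidden x n_out) are the only trained parameters.
        self.beta = [[0.0] * n_out for _ in range(n_hidden)]
        # P tracks (H^T H + l2 * I)^-1; before any data arrives it equals I / l2.
        self.P = [[1.0 / l2 if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(self.W, self.b)]

    def predict(self, x):
        h = self.hidden(x)
        n_out = len(self.beta[0])
        return [sum(h[j] * self.beta[j][k] for j in range(len(h)))
                for k in range(n_out)]

    def update(self, x, target):
        h = self.hidden(x)
        n = len(h)
        # Prediction with the current (old) output weights.
        y = [sum(h[j] * self.beta[j][k] for j in range(n))
             for k in range(len(target))]
        Ph = [sum(self.P[i][j] * h[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(hi * v for hi, v in zip(h, Ph))
        # Rank-1 (Sherman-Morrison) update of P; no matrix inversion is needed.
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Ph[i] * Ph[j] / denom
        # The updated P times h simplifies to Ph / denom.
        for j in range(n):
            gain = Ph[j] / denom
            for k in range(len(target)):
                self.beta[j][k] += gain * (target[k] - y[k])


def squared_error(model, samples):
    return sum((t[0] - model.predict(x)[0]) ** 2 for x, t in samples)


# Toy regression data (hypothetical): learn t = x0 + 2 * x1.
data_rng = random.Random(1)
samples = []
for _ in range(200):
    x = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
    samples.append((x, [x[0] + 2.0 * x[1]]))

model = OSELM(n_in=2, n_hidden=8, n_out=1)
error_before = squared_error(model, samples)
for x, t in samples:
    model.update(x, t)
error_after = squared_error(model, samples)
print("squared error before:", error_before, "after:", error_after)
```

Each update costs a few passes over the hidden-layer matrix `P` and needs no replay buffer or gradient computation, which is the property the abstract highlights for resource-limited devices.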
Gated Recurrent Neural Tensor Network
Recurrent Neural Networks (RNNs), which are a powerful scheme for modeling
temporal and sequential data, need to capture long-term dependencies in datasets
and represent them in hidden layers with a powerful model to capture more
information from inputs. For modeling long-term dependencies in a dataset, the
gating mechanism concept can help RNNs remember and forget previous
information. Representing the hidden layers of an RNN with more expressive
operations (i.e., tensor products) helps it learn a more complex relationship
between the current input and the previous hidden layer information. These
ideas can generally improve RNN performance. In this paper, we propose a
novel RNN architecture that combines the concepts of the gating mechanism and the
tensor product into a single model. By combining these two concepts into a
single RNN, our proposed models learn long-term dependencies by modeling with
gating units and obtain more expressive and direct interaction between input
and hidden layers using a tensor product on 3-dimensional array (tensor) weight
parameters. We use Long Short Term Memory (LSTM) RNN and Gated Recurrent Unit
(GRU) RNN and combine them with a tensor product inside their formulations. Our
proposed RNNs, which are called the Long-Short Term Memory Recurrent Neural
Tensor Network (LSTMRNTN) and Gated Recurrent Unit Recurrent Neural Tensor
Network (GRURNTN), are made by combining the LSTM and GRU RNN models with the
tensor product. We conducted experiments with our proposed models on word-level
and character-level language modeling tasks and found that our proposed
models significantly outperformed our baseline models.
Comment: Accepted at IJCNN 2016. URL:
http://ieeexplore.ieee.org/document/7727233
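To make the idea of combining gating with a tensor product more concrete, the following is a simplified sketch of one GRU-style step with an added bilinear term in the candidate state. The exact placement of the tensor term, the parameter names (`Wz`, `Uz`, `T`, and so on), and the helper `init_params` are assumptions chosen for illustration and may differ from the formulation in the paper.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def bilinear(T, x, h):
    """k-th output is x^T T[k] h, where T has shape (n_hidden, n_in, n_hidden)."""
    return [sum(x[i] * T_k[i][j] * h[j]
                for i in range(len(x)) for j in range(len(h)))
            for T_k in T]


def init_params(n_in, n_hidden, seed=0):
    """Hypothetical small random initialization for all weights."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for gate in ("z", "r", "h"):
        p["W" + gate] = mat(n_hidden, n_in)      # input-to-hidden weights
        p["U" + gate] = mat(n_hidden, n_hidden)  # hidden-to-hidden weights
        p["b" + gate] = [0.0] * n_hidden
    # 3-dimensional weight tensor: one (n_in x n_hidden) slice per hidden unit.
    p["T"] = [mat(n_in, n_hidden) for _ in range(n_hidden)]
    return p


def gru_tensor_step(x, h_prev, p):
    # Update gate z and reset gate r, as in a standard GRU.
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wz"], x), matvec(p["Uz"], h_prev), p["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wr"], x), matvec(p["Ur"], h_prev), p["br"])]
    rh = [ri * hi for ri, hi in zip(r, h_prev)]
    # Candidate state: linear terms plus a bilinear input/hidden interaction.
    cand = [math.tanh(a + b + c + d) for a, b, c, d in
            zip(matvec(p["Wh"], x), matvec(p["Uh"], rh),
                bilinear(p["T"], x, rh), p["bh"])]
    # Gated interpolation between the previous state and the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


params = init_params(n_in=4, n_hidden=3)
h = [0.0, 0.0, 0.0]
for x in ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]):
    h = gru_tensor_step(x, h, params)
print(h)
```

The bilinear term lets each hidden unit depend on products of input and hidden components rather than only on their weighted sums, at the cost of `n_hidden * n_in * n_hidden` extra parameters.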
Sequential Recurrent Neural Networks for Language Modeling
Feedforward Neural Network (FNN)-based language models estimate the
probability of the next word based on the history of the last N words, whereas
Recurrent Neural Networks (RNNs) perform the same task based only on the last
word and some context information that cycles in the network. This paper
presents a novel approach, which bridges the gap between these two categories
of networks. In particular, we propose an architecture which takes advantage of
the explicit, sequential enumeration of the word history in the FNN structure while
enhancing each word representation at the projection layer through recurrent
context information that evolves in the network. The context integration is
performed using an additional word-dependent weight matrix that is also learned
during training. Extensive experiments conducted on the Penn Treebank (PTB)
and the Large Text Compression Benchmark (LTCB) corpora showed a significant
reduction in perplexity when compared to state-of-the-art feedforward as
well as recurrent neural network architectures.
Comment: published (INTERSPEECH 2016), 5 pages, 3 figures, 4 tables
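For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-probability that a model assigns to each actual next word, so lower values are better. The sketch below computes it from a list of per-word probabilities. The function name `perplexity` and the example probabilities are hypothetical and are not results from the paper.

```python
import math


def perplexity(probs):
    """Perplexity given the model probability of each actual next word."""
    if not probs:
        raise ValueError("need at least one probability")
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


# A model that assigns higher probability to the observed words scores lower.
weaker_model = perplexity([0.25, 0.25, 0.25, 0.25])
stronger_model = perplexity([0.5, 0.5, 0.5, 0.5])
print(weaker_model, stronger_model)
```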