A Novel Predictive-Coding-Inspired Variational RNN Model for Online Prediction and Recognition
This study introduces PV-RNN, a novel variational RNN inspired by
predictive-coding ideas. The model learns to extract the probabilistic
structures hidden in fluctuating temporal patterns by dynamically changing the
stochasticity of its latent states. Its architecture attempts to address two
major concerns of variational Bayes RNNs: how can latent variables learn
meaningful representations and how can the inference model transfer future
observations to the latent variables. PV-RNN does both by introducing adaptive
vectors mirroring the training data, whose values can then be adapted
differently during evaluation. Moreover, prediction errors during
backpropagation, rather than external inputs during the forward computation,
are used to convey information to the network about the external data. For
testing, we introduce error regression, a procedure inspired by predictive
coding that leverages those mechanisms, to predict unseen sequences. The model
introduces a weighting parameter, the meta-prior, to balance the optimization
pressure placed on two terms of a lower bound on the marginal likelihood of the
sequential data. We test the model on two datasets with probabilistic
structures and show that with high values of the meta-prior the network
develops deterministic chaos through which the data's randomness is imitated.
For low values, the model behaves as a random process. The network performs
best on intermediate values, and is able to capture the latent probabilistic
structure with good generalization. Analyzing the meta-prior's impact on the
network allows us to study precisely the theoretical value and practical
benefits of incorporating stochastic dynamics in our model. We demonstrate better
prediction performance on a robot imitation task with our model using error
regression compared to a standard variational Bayes model lacking such a
procedure.
Comment: The paper is accepted in Neural Computation
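The weighting idea in the abstract above can be illustrated with a minimal sketch. The abstract only says that the meta-prior balances two terms of a lower bound; the sketch assumes those terms are a per-step reconstruction log-likelihood and a per-step KL divergence between posterior and prior, and that the meta-prior multiplies the KL term. The function names `gaussian_kl` and `weighted_lower_bound` are hypothetical and not taken from the paper.

```python
import math


def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += (
            math.log(sp / sq)
            + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2)
            - 0.5
        )
    return total


def weighted_lower_bound(recon_log_lik, kl_per_step, meta_prior):
    """Sum over time steps of (reconstruction term - meta_prior * KL term).

    Assumption: the meta-prior scales the KL term; the abstract does not
    state which of the two terms carries the weight.
    """
    return sum(r - meta_prior * k for r, k in zip(recon_log_lik, kl_per_step))


# Hypothetical numbers for a two-step sequence with a 1-D latent state.
kl_steps = [
    gaussian_kl([0.5], [1.0], [0.0], [1.0]),
    gaussian_kl([0.0], [1.0], [0.0], [1.0]),
]
bound = weighted_lower_bound([-1.0, -2.0], kl_steps, meta_prior=2.0)
```

Under this assumption, a larger `meta_prior` penalizes any gap between posterior and prior more strongly, while a smaller value lets the reconstruction term dominate.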
One-Shot Learning using Mixture of Variational Autoencoders: a Generalization Learning approach
Deep learning, even if it is very successful nowadays, traditionally needs
very large amounts of labeled data to perform well on classification
tasks. In an attempt to solve this problem, the one-shot learning paradigm,
which makes use of just one labeled sample per class and prior knowledge,
becomes increasingly important. In this paper, we propose a new one-shot
learning method, dubbed MoVAE (Mixture of Variational AutoEncoders), to perform
classification. Complementary to prior studies, MoVAE represents a shift of
paradigm in comparison with the usual one-shot learning methods, as it does not
use any prior knowledge. Instead, it starts from zero knowledge and one labeled
sample per class. Afterward, by using unlabeled data and the generalization
learning concept (in a way, more as humans do), it is capable of gradually
improving its performance by itself. Moreover, even if no unlabeled data are
available, MoVAE can still perform well in one-shot learning classification. We
demonstrate empirically the efficiency of our proposed approach on three
datasets, i.e. the handwritten digits (MNIST), fashion products
(Fashion-MNIST), and handwritten characters (Omniglot), showing that MoVAE
outperforms state-of-the-art one-shot learning algorithms.
Goal-Directed Planning for Habituated Agents by Active Inference Using a Variational Recurrent Neural Network
It is crucial to ask how agents can achieve goals by generating action plans
using only partial models of the world acquired through habituated
sensory-motor experiences. Although many existing robotics studies use a
forward model framework, there are generalization issues with high degrees of
freedom. The current study shows that the predictive coding (PC) and active
inference (AIF) frameworks, which employ a generative model, can develop better
generalization by learning a prior distribution in a low dimensional latent
state space representing probabilistic structures extracted from well
habituated sensory-motor trajectories. In our proposed model, learning is
carried out by inferring optimal latent variables as well as synaptic weights
for maximizing the evidence lower bound, while goal-directed planning is
accomplished by inferring latent variables for maximizing the estimated lower
bound. Our proposed model was evaluated with both simple and complex robotic
tasks in simulation, which demonstrated sufficient generalization in learning
with limited training data by setting an intermediate value for a
regularization coefficient. Furthermore, comparative simulation results show
that the proposed model outperforms a conventional forward model in
goal-directed planning, due to the learned prior confining the search of motor
plans within the range of habituated trajectories.
Comment: 30 pages, 19 figures
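The planning-by-inference idea in the abstract above can be sketched with a deliberately small toy problem. This is not the paper's model: it assumes a one-dimensional latent variable, a hypothetical linear decoder `a * z + b` standing in for the generative model, a standard Gaussian prior on `z`, and a point estimate of the latent instead of a full posterior. The name `plan_latent` and all parameters are illustrative.

```python
def plan_latent(goal, a, b, weight, steps=500, lr=0.05):
    """Infer a latent z by gradient ascent on a toy planning objective.

    Objective (an assumption for illustration):
        J(z) = -0.5 * (a * z + b - goal) ** 2 - 0.5 * weight * z ** 2
    The first term rewards reaching the goal; the second keeps z close to
    the mean of a standard Gaussian prior, scaled by `weight`.
    """
    z = 0.0
    for _ in range(steps):
        error = a * z + b - goal          # predicted outcome minus goal
        grad = -error * a - weight * z    # dJ/dz
        z += lr * grad                    # ascent step
    return z


# With the prior term active, the plan is pulled toward the prior mean (0).
z_regularized = plan_latent(goal=4.0, a=2.0, b=0.0, weight=1.0)
# With weight 0, the search ignores the prior and matches the goal exactly.
z_free = plan_latent(goal=4.0, a=2.0, b=0.0, weight=0.0)
```

In this toy setting the optimum has the closed form `a * (goal - b) / (a ** 2 + weight)`, which makes the trade-off visible: a larger `weight` confines the search closer to the prior, in the spirit of the learned prior restricting plans to habituated trajectories.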