Recurrent Neural Filters: Learning Independent Bayesian Filtering Steps for Time Series Prediction
Despite the recent popularity of deep generative state space models, few
comparisons have been made between network architectures and the inference
steps of the Bayesian filtering framework -- with most models simultaneously
approximating both state transition and update steps with a single recurrent
neural network (RNN). In this paper, we introduce the Recurrent Neural Filter
(RNF), a novel recurrent autoencoder architecture that learns distinct
representations for each Bayesian filtering step, captured by a series of
encoders and decoders. Testing this on three real-world time series datasets,
we demonstrate that the decoupled representations learnt not only improve the
accuracy of one-step-ahead forecasts while providing realistic uncertainty
estimates, but also facilitate multistep prediction through the separation of
encoder stages.
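The abstract gives only the high-level design, so the following is a minimal sketch of the underlying idea rather than the paper's actual RNF: the transition (propagation) and update (correction) steps of a Bayesian filter are learnt by separate modules instead of a single RNN, so that multistep forecasting can reuse the transition stage on its own. All module names, layer choices, and dimensions are illustrative assumptions.

```python
# Illustrative sketch (not the paper's exact RNF): separate learned modules for
# the transition (propagation) and update (correction) steps of a Bayesian
# filter, plus a decoder producing a Gaussian one-step-ahead forecast.
import torch
import torch.nn as nn

class DecoupledFilterCell(nn.Module):
    """One filtering step with distinct transition and update representations."""

    def __init__(self, state_dim: int, obs_dim: int, hidden_dim: int = 64):
        super().__init__()
        # State-transition step: propagate the belief state without the new observation.
        self.transition = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, state_dim),
        )
        # Update step: correct the propagated state using the new observation.
        self.update = nn.Sequential(
            nn.Linear(state_dim + obs_dim, hidden_dim), nn.ELU(),
            nn.Linear(hidden_dim, state_dim),
        )
        # Decoder: map the state to mean and variance of the next observation.
        self.decoder_mean = nn.Linear(state_dim, obs_dim)
        self.decoder_logvar = nn.Linear(state_dim, obs_dim)

    def forward(self, state, obs):
        prior_state = self.transition(state)                         # propagation only
        posterior_state = self.update(torch.cat([prior_state, obs], dim=-1))
        mean = self.decoder_mean(posterior_state)
        var = self.decoder_logvar(posterior_state).exp()
        return posterior_state, mean, var

# Multistep prediction can roll `transition` forward repeatedly without calling
# `update`, which is the practical benefit of keeping the two stages separate.
```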
Sparsity in Variational Autoencoders
When working in high-dimensional latent spaces, the internal encoding of data in Variational Autoencoders naturally becomes sparse. We discuss this known but controversial phenomenon, sometimes referred to as overpruning to emphasize the under-use of the model's capacity. In fact, it is an important form of self-regularization, with all the typical benefits associated with sparsity: it forces the model to focus on the really important features, greatly reducing the risk of overfitting. In particular, it provides a major methodological guide for the correct tuning of model capacity: progressively augmenting it to attain sparsity, or conversely reducing the dimension of the network by removing links to zeroed-out neurons. The degree of sparsity crucially depends on the network architecture: for instance, convolutional networks typically show less sparsity, likely due to the tighter relation of features to different spatial regions of the input.
Comment: An extended abstract of this survey will be presented at the 1st International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI' 2019), 20-22 March 2019, Barcelona, Spain.
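As a concrete handle on the phenomenon the abstract describes, below is a hedged sketch of the usual sparsity diagnostic for a Gaussian-posterior VAE: measure the per-dimension KL divergence to the prior on held-out data and count the dimensions that stay near zero, i.e. the "zeroed out" neurons whose links could be removed. The threshold, tensor shapes, and stand-in encoder outputs are illustrative assumptions, not taken from the survey.

```python
# Sketch of a sparsity diagnostic for a Gaussian VAE: count latent dimensions
# whose average KL to the N(0, I) prior is near zero, i.e. effectively pruned units.
import torch

def inactive_latent_dims(mu: torch.Tensor, logvar: torch.Tensor, tol: float = 1e-2):
    """mu, logvar: [batch, latent_dim] encoder outputs on a held-out set."""
    # KL(q(z|x) || N(0, I)) per latent dimension, averaged over the batch.
    kl_per_dim = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean(dim=0)
    inactive = kl_per_dim < tol
    return inactive, kl_per_dim

# Stand-in encoder outputs; in practice these come from the trained encoder.
mu, logvar = torch.randn(512, 64), torch.randn(512, 64)
inactive, kl = inactive_latent_dims(mu, logvar)
print(f"{int(inactive.sum())} of {mu.shape[1]} latent units are effectively unused")
```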
BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modeling
With the introduction of the variational autoencoder (VAE), probabilistic
latent variable models have received renewed attention as powerful generative
models. However, their performance in terms of test likelihood and quality of
generated samples has been surpassed by autoregressive models without
stochastic units. Furthermore, flow-based models have recently been shown to be
an attractive alternative that scales well to high-dimensional data. In this
paper we close the performance gap by constructing VAE models that can
effectively utilize a deep hierarchy of stochastic variables and model complex
covariance structures. We introduce the Bidirectional-Inference Variational
Autoencoder (BIVA), characterized by a skip-connected generative model and an
inference network formed by a bidirectional stochastic inference path. We show
that BIVA reaches state-of-the-art test likelihoods, generates sharp and
coherent natural images, and uses the hierarchy of latent variables to capture
different aspects of the data distribution. We observe that BIVA, in contrast
to recent results, can be used for anomaly detection. We attribute this to the
hierarchy of latent variables which is able to extract high-level semantic
features. Finally, we extend BIVA to semi-supervised classification tasks and
show that it performs comparably to state-of-the-art results by generative
adversarial networks.
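BIVA's exact skip-connected generative model and bidirectional stochastic inference path are not spelled out in the abstract; as a rough illustration of what a (much shallower) hierarchy of latent variables looks like, here is a toy two-level VAE with a bottom-up inference pass, a top-down prior over the lower latent, and a skip connection from the top latent into the decoder. Layer sizes and module names are assumptions for illustration only, not BIVA's architecture.

```python
# Toy two-level latent hierarchy: inference runs bottom-up x -> z1 -> z2,
# generation runs top-down z2 -> z1 -> x with a skip connection from z2 to x.
import torch
import torch.nn as nn

def reparameterize(mu, logvar):
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

class TwoLevelVAE(nn.Module):
    def __init__(self, x_dim=784, z1_dim=32, z2_dim=16, h=256):
        super().__init__()
        # Bottom-up inference path.
        self.enc1 = nn.Sequential(nn.Linear(x_dim, h), nn.ReLU(), nn.Linear(h, 2 * z1_dim))
        self.enc2 = nn.Sequential(nn.Linear(z1_dim, h), nn.ReLU(), nn.Linear(h, 2 * z2_dim))
        # Top-down generative path: prior over z1 given z2, then (z1, z2) -> x.
        self.dec1 = nn.Sequential(nn.Linear(z2_dim, h), nn.ReLU(), nn.Linear(h, 2 * z1_dim))
        self.dec_x = nn.Sequential(nn.Linear(z1_dim + z2_dim, h), nn.ReLU(), nn.Linear(h, x_dim))

    def forward(self, x):
        mu1, logvar1 = self.enc1(x).chunk(2, dim=-1)
        z1 = reparameterize(mu1, logvar1)
        mu2, logvar2 = self.enc2(z1).chunk(2, dim=-1)
        z2 = reparameterize(mu2, logvar2)
        # Conditional prior over z1 and skip connection from z2 into the decoder.
        prior_mu1, prior_logvar1 = self.dec1(z2).chunk(2, dim=-1)
        x_logits = self.dec_x(torch.cat([z1, z2], dim=-1))
        return x_logits, (mu1, logvar1, prior_mu1, prior_logvar1), (mu2, logvar2)

x_logits, z1_stats, z2_stats = TwoLevelVAE()(torch.rand(8, 784))
```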
A stable variational autoencoder for text modelling
Variational Autoencoder (VAE) is a powerful method for learning representations of high-dimensional data. However, VAEs can suffer from an issue known as latent variable collapse (or KL term vanishing), where the posterior collapses to the prior and the model ignores the latent codes in generative tasks. This issue is particularly prevalent when employing VAE-RNN architectures for text modelling (Bowman et al., 2016; Yang et al., 2017). In this paper, we present a new architecture called Full-Sampling-VAE-RNN, which can effectively avoid latent variable collapse. Compared to general VAE-RNN architectures, we show that our model achieves a much more stable training process and generates text of significantly better quality.
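The abstract does not describe how Full-Sampling-VAE-RNN itself works, so the sketch below only illustrates the failure mode it addresses and the classic mitigation cited alongside it (KL annealing from Bowman et al., 2016): monitor the KL term of the ELBO and ramp its weight up slowly so the decoder cannot simply ignore the latent codes. Function names, the annealing schedule, and the collapse threshold are illustrative assumptions.

```python
# Posterior collapse in a VAE-RNN shows up as the KL term going to ~0 while the
# reconstruction term dominates. This sketch monitors the KL and applies the
# standard annealing schedule; it is not the paper's Full-Sampling mechanism.
import torch

def kl_gaussian(mu, logvar):
    # KL(q(z|x) || N(0, I)), summed over latent dims, averaged over the batch.
    return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(dim=-1).mean()

def annealed_loss(recon_nll, mu, logvar, step, anneal_steps=10_000):
    kl = kl_gaussian(mu, logvar)
    beta = min(1.0, step / anneal_steps)      # ramp the KL weight up slowly
    if kl.item() < 0.1:                       # rough indicator of latent collapse
        print(f"warning: KL ~ {kl.item():.3f}, latent codes are likely being ignored")
    return recon_nll + beta * kl
```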