Multimodal Transitions for Generative Stochastic Networks
Generative Stochastic Networks (GSNs) have been recently introduced as an
alternative to traditional probabilistic modeling: instead of parametrizing the
data distribution directly, one parametrizes a transition operator for a Markov
chain whose stationary distribution is an estimator of the data generating
distribution. The result of training is therefore a machine that generates
samples through this Markov chain. However, the previously introduced GSN
consistency theorems suggest that in order to capture a wide class of
distributions, the transition operator in general should be multimodal,
something that has not been done before this paper. We introduce for the first
time multimodal transition distributions for GSNs, in particular using models
in the NADE family (Neural Autoregressive Density Estimator) as output
distributions of the transition operator. A NADE model is related to an RBM
(and can thus model multimodal distributions) but its likelihood (and
likelihood gradient) can be computed easily. The parameters of the NADE are
obtained as a learned function of the previous state of the learned Markov
chain. Experiments clearly illustrate the advantage of such multimodal
transition distributions over unimodal GSNs.
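For intuition, here is a minimal numpy sketch of the core idea, not the paper's NADE-based architecture: the transition operator's output distribution is a mixture whose parameters are a function of the previous state, so a single step can land in either of two modes. The mixture parametrization and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def transition_sample(x, noise=0.1):
    # One step of a toy GSN-style chain. The output distribution is a
    # two-component Gaussian mixture whose means are a (hypothetical)
    # function of the previous state x -- a stand-in for a NADE
    # conditional, which is likewise multimodal.
    means = np.array([0.8 * x - 1.0, 0.8 * x + 1.0])
    k = rng.choice(2)                            # equal mixture weights
    return means[k] + noise * rng.standard_normal()

# Run the chain; its empirical distribution estimates the stationary
# distribution, which is bimodal here -- exactly the case where a
# unimodal transition operator would struggle.
x, samples = 0.0, []
for _ in range(10_000):
    x = transition_sample(x)
    samples.append(x)
print("sample mean/std:", np.mean(samples), np.std(samples))
```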
Out-of-Sample Extension for Dimensionality Reduction of Noisy Time Series
This paper proposes an out-of-sample extension framework for a global
manifold learning algorithm (Isomap) that uses temporal information in
out-of-sample points in order to make the embedding more robust to noise and
artifacts. Given a set of noise-free training data and its embedding, the
proposed framework extends the embedding for a noisy time series. This is
achieved by adding a spatio-temporal compactness term to the optimization
objective of the embedding. To the best of our knowledge, this is the first
method for out-of-sample extension of manifold embeddings that leverages timing
information available for the extension set. Experimental results demonstrate
that our out-of-sample extension algorithm renders a more robust and accurate
embedding of sequentially ordered image data in the presence of various noise
and artifacts when compared to other timing-aware embeddings. Additionally, we
show that an out-of-sample extension framework based on the proposed algorithm
outperforms the state of the art in eye-gaze estimation.
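As a rough illustration of the kind of objective involved (the exact formulation in the paper differs; the objective, `lam`, and the per-frame anchor embeddings below are assumptions), a temporal compactness penalty can be added to a per-frame embedding fit and minimized by gradient descent:

```python
import numpy as np

def extend_embedding(y0, lam=1.0, n_iters=500, lr=0.05):
    # Toy objective with a temporal compactness term:
    #   J(Y) = sum_t ||y_t - y0_t||^2 + lam * sum_t ||y_t - y_{t-1}||^2
    # y0_t is a per-frame embedding of the noisy test frame (e.g. from
    # nearest neighbours in the trained Isomap); the second term pulls
    # temporally adjacent frames together.
    Y = y0.copy()
    for _ in range(n_iters):
        grad = 2.0 * (Y - y0)
        grad[1:] += 2.0 * lam * (Y[1:] - Y[:-1])
        grad[:-1] += 2.0 * lam * (Y[:-1] - Y[1:])
        Y -= lr * grad
    return Y

# Noisy per-frame embeddings of a smooth 1-D trajectory.
t = np.linspace(0.0, 1.0, 100)[:, None]
rng = np.random.default_rng(0)
y_noisy = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal((100, 1))
y_smooth = extend_embedding(y_noisy)
print("roughness before/after:",
      np.sum(np.diff(y_noisy, axis=0) ** 2),
      np.sum(np.diff(y_smooth, axis=0) ** 2))
```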
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation, and manifold learning.
How Auto-Encoders Could Provide Credit Assignment in Deep Networks via Target Propagation
We propose to exploit reconstruction as a layer-local training signal
for deep learning. Reconstructions can be propagated in a form of target
propagation playing a role similar to back-propagation but helping to reduce
the reliance on derivatives in order to perform credit assignment across many
levels of possibly strong non-linearities (which is difficult for
back-propagation). A regularized auto-encoder tends to produce a reconstruction
that is a more likely version of its input, i.e., a small move in the direction
of higher likelihood. By generalizing gradients, target propagation may also
make it possible to train deep networks with discrete hidden units. If the
auto-encoder takes both a representation of the input and of the target (or of
any side information) as input, then its reconstruction of the input
representation provides a target
towards a representation that is more likely, conditioned on all the side
information. A deep auto-encoder decoding path generalizes gradient propagation
in a learned way that could thus handle not just infinitesimal changes but
larger, discrete changes, hopefully allowing credit assignment through a long
chain of non-linear operations. In addition to each layer being a good
auto-encoder, the encoder also learns to please the upper layers by
transforming the data into a space that is easier for them to model,
flattening manifolds and disentangling factors. The motivations and theoretical
justifications for this approach are laid down in this paper, along with
conjectures that will have to be verified either mathematically or
experimentally, including a hypothesis stating that such auto-encoder-mediated
target propagation could play the role of credit assignment in brains, through
many non-linear, noisy and discrete transformations.
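To make the mechanism concrete, here is a toy numpy sketch in the style of difference target propagation, one instance of the reconstruction-as-target idea: a top-level target is mapped into a lower-layer target through a learned decoder rather than through derivatives. The network, the decoder, and the top-level target are illustrative stand-ins, and nothing is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer net h1 = f1(x), h2 = f2(h1). The decoder g2 is the
# (here untrained, random) reconstruction path of a layer-local
# auto-encoder; shapes and the linear correction are illustrative.
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(6, 8))
V2 = rng.normal(size=(8, 6))                    # decoder for layer 2

f1 = lambda x: np.tanh(W1 @ x)
f2 = lambda h: np.tanh(W2 @ h)
g2 = lambda h: np.tanh(V2 @ h)                  # approximate inverse of f2

x = rng.normal(size=4)
h1 = f1(x)
h2 = f2(h1)
h2_target = h2 - 0.1 * rng.normal(size=6)       # stand-in top-level target

# Propagate the target down through the decoder instead of through the
# derivatives of f2; the difference correction cancels g2's bias:
h1_target = g2(h2_target) + (h1 - g2(h2))
print("layer-1 target delta:", np.round(h1_target - h1, 3))
```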
Fast Compressive Sensing Recovery Using Generative Models with Structured Latent Variables
Deep learning models have significantly improved the visual quality and
accuracy on compressive sensing recovery. In this paper, we propose an
algorithm for signal reconstruction from compressed measurements with image
priors captured by a generative model. We search and constrain on latent
variable space to make the method stable when the number of compressed
measurements is extremely limited. We show that, by exploiting certain
structures of the latent variables, the proposed method produces improved
reconstruction accuracy and preserves realistic and non-smooth features in the
image. Our algorithm achieves high computation speed by projecting between the
original signal space and the latent variable space in an alternating fashion.
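A minimal sketch of that alternating scheme, under strong simplifications: a fixed random nonlinearity stands in for the trained generator, and the latent-space projection uses finite-difference gradient descent rather than backpropagation. Everything below (shapes, step sizes, iteration counts) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, m = 5, 64, 16

# A fixed random nonlinearity stands in for a trained generator G.
Wg = rng.normal(size=(n, k))
G = lambda z: np.tanh(Wg @ z)

A = rng.normal(size=(m, n)) / np.sqrt(m)        # measurement matrix
x_true = G(rng.normal(size=k))
y = A @ x_true

def project_to_range(x, z, steps=100, lr=0.01, eps=1e-4):
    # Refit z so that G(z) approximates x (finite-difference gradient
    # descent; a real implementation would backpropagate through G).
    for _ in range(steps):
        g = np.zeros(k)
        for i in range(k):
            d = np.zeros(k); d[i] = eps
            g[i] = (np.sum((G(z + d) - x) ** 2)
                    - np.sum((G(z - d) - x) ** 2)) / (2 * eps)
        z = z - lr * g
    return z

# Alternate between a data-fidelity step in signal space and a
# projection back onto the range of G via the latent variables.
z = rng.normal(size=k)
x = G(z)
for _ in range(30):
    x = x - 0.1 * A.T @ (A @ x - y)             # signal-space step
    z = project_to_range(x, z)                  # latent-space step
    x = G(z)
print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```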
CFSNet: Toward a Controllable Feature Space for Image Restoration
Deep learning methods have achieved great progress in image restoration as
measured by specific metrics (e.g., PSNR, SSIM). However, the perceptual
quality of
the restored image is relatively subjective, and it is necessary for users to
control the reconstruction result according to personal preferences or image
characteristics, which cannot be done using existing deterministic networks.
This motivates us to design a unified interactive framework for
general image restoration tasks. Under this framework, users can control
a continuous transition between different objectives, e.g., the
perception-distortion trade-off of image super-resolution or the trade-off
between noise reduction and
detail preservation. We achieve this goal by controlling the latent features of
the designed network. To be specific, our proposed framework, named
Controllable Feature Space Network (CFSNet), couples two branches trained with
different objectives. Our framework can adaptively learn the coupling
coefficients of different layers and channels, which provides finer control of
the restored image quality. Experiments on several typical image restoration
tasks fully validate the effectiveness of the proposed method. Code is
available at https://github.com/qibao77/CFSNet.
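The controllable coupling can be pictured as a per-channel convex combination of the two branches' feature maps. This numpy sketch is our reading of the idea, not the authors' released code; the shapes and the sigmoid-squashed coefficients are assumptions.

```python
import numpy as np

def couple(f_main, f_tune, alpha):
    # Per-channel convex combination of two branches' feature maps;
    # alpha in [0, 1] plays the role of the (normally learned)
    # coupling coefficients.
    alpha = alpha.reshape(1, -1, 1, 1)          # broadcast over N, H, W
    return alpha * f_main + (1.0 - alpha) * f_tune

rng = np.random.default_rng(0)
f_main = rng.normal(size=(1, 8, 16, 16))        # N, C, H, W features
f_tune = rng.normal(size=(1, 8, 16, 16))
alpha = 1.0 / (1.0 + np.exp(-rng.normal(size=8)))   # sigmoid-squashed

out = couple(f_main, f_tune, alpha)
print(out.shape)        # (1, 8, 16, 16); alpha -> 1 recovers f_main
```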
Invertible generative models for inverse problems: mitigating representation error and dataset bias
Trained generative models have shown remarkable performance as priors for
inverse problems in imaging -- for example, Generative Adversarial Network
priors permit recovery of test images from 5-10x fewer measurements than
sparsity priors. Unfortunately, these models may be unable to represent any
particular image because of architectural choices, mode collapse, and bias in
the training dataset. In this paper, we demonstrate that invertible neural
networks, which have zero representation error by design, can be effective
natural signal priors for inverse problems such as denoising, compressive
sensing, and inpainting. Given a trained generative model, we study the
empirical risk formulation of the desired inverse problem under a
regularization that promotes high likelihood images, either directly by
penalization or algorithmically by initialization. For compressive sensing,
invertible priors can yield higher accuracy than sparsity priors across almost
all undersampling ratios, and due to their lack of representation error,
invertible priors can yield better reconstructions than GAN priors for images
that have rare features of variation within the biased training set, including
out-of-distribution natural images. We additionally compare performance for
compressive sensing to unlearned methods, such as the deep decoder, and we
establish theoretical bounds on expected recovery error in the case of a linear
invertible model.
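For a toy invertible "generator" the penalized empirical risk from the abstract reduces to a ridge problem, which makes both claims concrete: every signal is representable, and a gamma * ||z||^2 term is exactly a -log p(z) penalty (up to constants) for a Gaussian latent. The single linear map below is an illustrative stand-in for a trained flow, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, gamma = 32, 8, 0.05

# Toy invertible "generator": one invertible linear map. A real flow
# stacks invertible nonlinear layers; the point is only that every x
# has a latent z = G^{-1}(x), i.e. zero representation error.
W = rng.normal(size=(n, n)) + 8.0 * np.eye(n)   # well-conditioned
G = lambda z: W @ z

A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = G(0.3 * rng.normal(size=n))            # a "high-likelihood" signal
y = A @ x_true

# Penalized empirical risk: min_z ||A G(z) - y||^2 + gamma * ||z||^2.
# For this linear toy it is a ridge problem with a closed form.
M = A @ W
z_hat = np.linalg.solve(M.T @ M + gamma * np.eye(n), M.T @ y)
x_hat = G(z_hat)
print("relative error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```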
Constructing Human Motion Manifold with Sequential Networks
This paper presents a novel recurrent neural network-based method to
construct a latent motion manifold that can represent a wide range of human
motions in a long sequence. We introduce several new components to increase the
spatial and temporal coverage in motion space while retaining the details of
motion capture data. These include new regularization terms for the motion
manifold, combination of two complementary decoders for predicting joint
rotations and joint velocities, and the addition of the forward kinematics
layer to consider both joint rotation and position errors. In addition, we
propose a set of loss terms that improve the overall quality of the motion
manifold from various aspects, such as the capability of reconstructing not
only the motion but also the latent manifold vector, and the naturalness of the
motion through adversarial loss. These components contribute to creating
a compact and versatile motion manifold that allows for creating new motions by
performing random sampling and algebraic operations, such as interpolation and
analogy, in the latent motion manifold.
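The role of the forward kinematics layer can be illustrated with a planar two-link chain: joint-rotation errors and FK position errors are complementary loss terms. The chain, link lengths, and loss weights below are illustrative assumptions, not the paper's skeleton or weighting.

```python
import numpy as np

def fk_2link(angles, lengths=(1.0, 1.0)):
    # Minimal forward-kinematics layer: joint angles of a planar
    # two-link chain -> joint positions, so errors made in rotation
    # space can also be penalized in position space.
    a1, a2 = angles
    l1, l2 = lengths
    p1 = np.array([l1 * np.cos(a1), l1 * np.sin(a1)])
    p2 = p1 + np.array([l2 * np.cos(a1 + a2), l2 * np.sin(a1 + a2)])
    return p1, p2

def motion_loss(pred, true, w_rot=1.0, w_pos=1.0):
    # Joint-rotation error plus FK position error -- the two
    # complementary terms described in the abstract.
    rot = np.sum((pred - true) ** 2)
    pos = sum(np.sum((p - q) ** 2)
              for p, q in zip(fk_2link(pred), fk_2link(true)))
    return w_rot * rot + w_pos * pos

print(motion_loss(np.array([0.10, 0.20]), np.array([0.15, 0.10])))
```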
Towards Biologically Plausible Deep Learning
Neuroscientists have long criticised deep learning algorithms as incompatible
with current knowledge of neurobiology. We explore more biologically plausible
versions of deep representation learning, focusing here mostly on unsupervised
learning but developing a learning mechanism that could account for supervised,
unsupervised and reinforcement learning. The starting point is that the basic
learning rule believed to govern synaptic weight updates
(Spike-Timing-Dependent Plasticity) arises out of a simple update rule that
makes a lot of sense from a machine learning point of view and can be
interpreted as gradient descent on some objective function so long as the
neuronal dynamics push firing rates towards better values of the objective
function (be it supervised, unsupervised, or reward-driven). The second main
idea is that this corresponds to a form of the variational EM algorithm, i.e.,
with approximate rather than exact posteriors, implemented by neural dynamics.
Another contribution of this paper is that the gradients required for updating
the hidden states in the above variational interpretation can be estimated
using an approximation that only requires propagating activations forward and
backward, with pairs of layers learning to form a denoising auto-encoder.
Finally, we extend the theory about the probabilistic interpretation of
auto-encoders to justify improved sampling schemes based on the generative
interpretation of denoising auto-encoders, and we validate all these ideas on
generative learning tasks.
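A heavily simplified sketch of one ingredient: a pair of layers learning as a denoising auto-encoder using only forward/backward activation propagation and a local, Hebbian-flavoured weight update. This uses rates rather than spikes, and the update rule is a sign-preserving approximation of the reconstruction gradient, not the paper's derivation; all sizes and rates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

n_v, n_h, lr, noise = 16, 8, 0.1, 0.3
W = 0.1 * rng.normal(size=(n_h, n_v))           # tied weights
patterns = (rng.random((4, n_v)) > 0.5).astype(float)

def dae_step(v):
    v_noisy = v + noise * rng.standard_normal(n_v)   # corrupt input
    h = sigmoid(W @ v_noisy)                         # forward pass
    r = sigmoid(W.T @ h)                             # backward pass
    e = v - r                                        # local error signal
    return np.outer(h, e), np.sum(e ** 2)

for step in range(2000):
    dW, err = dae_step(patterns[step % 4])
    # Local update: pre-synaptic activity times a reconstruction-driven
    # error -- no long-range error transport is required.
    W += lr * dW
print("final reconstruction error:", err)
```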
Probabilistic and Semantic Descriptions of Image Manifolds and Their Applications
This paper begins with a description of methods for estimating probability
density functions for images that reflects the observation that such data is
usually constrained to lie in restricted regions of the high-dimensional image
space - not every pattern of pixels is an image. It is common to say that
images lie on a lower-dimensional manifold in the high-dimensional space.
However, although images may lie on such lower-dimensional manifolds, it is not
the case that all points on the manifold have an equal probability of being
images. Images are unevenly distributed on the manifold, and our task is to
devise ways to model this distribution as a probability distribution. In
pursuing this goal, we consider generative models that are popular in the AI
and computer vision communities. For our purposes, generative/probabilistic models
should have the properties of 1) sample generation: it should be possible to
sample from this distribution according to the modelled density function, and
2) probability computation: given a previously unseen sample from the dataset
of interest, one should be able to compute the probability of the sample, at
least up to a normalising constant. To this end, we investigate the use of
methods such as normalising flows and diffusion models. We then show that such
probabilistic descriptions can be used to construct defences against
adversarial attacks. In addition to describing the manifold in terms of
density, we also consider how semantic interpretations can be used to describe
points on the manifold. To this end, we consider an emergent language framework
which makes use of variational encoders to produce a disentangled
representation of points that reside on a given manifold. Trajectories between
points on a manifold can then be described in terms of evolving semantic
descriptions.
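The probability-computation requirement is easiest to see for a normalising flow, where the change-of-variables formula gives exact log-densities. This toy linear "flow" sketches that computation and the density-based screening idea; the map, the perturbation, and the clean/perturbed comparison are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, d)) + 2.0 * np.eye(d)   # invertible toy flow x = W z

def log_prob(x):
    # Change of variables for an invertible map x = f(z), z ~ N(0, I):
    #   log p(x) = log N(f^{-1}(x); 0, I) - log |det J_f|
    # For this linear toy flow the Jacobian is simply W.
    z = np.linalg.solve(W, x)
    log_base = -0.5 * (z @ z) - 0.5 * d * np.log(2.0 * np.pi)
    _, logdet = np.linalg.slogdet(W)
    return log_base - logdet

# Density-based screening in the spirit of the paper's defence: an
# on-manifold sample typically scores a higher log-density than a
# crudely perturbed version of it.
x_clean = W @ rng.normal(size=d)
x_pert = x_clean + 3.0 * rng.standard_normal(d)
print("log p(clean):    ", round(log_prob(x_clean), 2))
print("log p(perturbed):", round(log_prob(x_pert), 2))
```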