Invariance and Stability of Deep Convolutional Representations
In this paper, we study deep signal representations that are near-invariant to groups of transformations and stable to the action of diffeomorphisms without losing signal information. This is achieved by generalizing the multilayer kernel introduced in the context of convolutional kernel networks and by studying the geometry of the corresponding reproducing kernel Hilbert space. We show that the signal representation is stable, and that models from this functional space, such as a large class of convolutional neural networks, may enjoy the same stability.
not-MIWAE: Deep Generative Modelling with Missing not at Random Data
When a missing process depends on the missing values themselves, it needs to
be explicitly modelled and taken into account while doing likelihood-based
inference. We present an approach for building and fitting deep latent variable
models (DLVMs) in cases where the missing process is dependent on the missing
data. Specifically, a deep neural network enables us to flexibly model the
conditional distribution of the missingness pattern given the data. This allows
for incorporating prior information about the type of missingness (e.g.
self-censoring) into the model. Our inference technique, based on
importance-weighted variational inference, involves maximising a lower bound of
the joint likelihood. Stochastic gradients of the bound are obtained by using
the reparameterisation trick both in latent space and data space. We show on
various kinds of data sets and missingness patterns that explicitly modelling
the missing process can be invaluable.

Comment: Camera-ready version for ICLR 202
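The importance-weighted bound described above can be illustrated with a minimal numerical sketch. Everything here is a simplifying assumption for illustration, not the paper's architecture: the model is one-dimensional, the missingness model is a fixed self-censoring sigmoid, and the prior is used as the proposal (the paper uses deep networks, a learned encoder, and the reparameterisation trick).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D setup (illustrative assumptions, not the paper's architecture):
#   p(z)     = N(0, 1)                  latent prior
#   p(x | z) = N(z, sigma^2)            decoder
#   p(s | x) = Bernoulli(sigmoid(-x))   missingness model: self-censoring,
#                                       large x is more likely to be missing
sigma = 0.5

def log_normal(x, mu, s):
    """Log-density of N(mu, s^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * s**2) - (x - mu) ** 2 / (2 * s**2)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def iw_bound(x_obs, K=1000):
    """K-sample importance-weighted lower bound on log p(x_obs, s=observed),
    using the prior as proposal for simplicity."""
    z = rng.standard_normal(K)                     # z_k ~ p(z) = proposal
    # Weights: log p(x|z_k) + log p(s|x); the log p(z) and log q(z) terms
    # cancel because the proposal equals the prior here.
    log_w = log_normal(x_obs, z, sigma) + np.log(sigmoid(-x_obs))
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))  # stable log-mean-exp

print(iw_bound(0.3))
```

Because this toy model is conjugate, the exact joint likelihood is available in closed form (integrating z out gives a N(0, 1 + sigma^2) marginal times the censoring probability), which makes it easy to check that the bound tightens as K grows.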
Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations
The success of deep convolutional architectures is often attributed in part
to their ability to learn multiscale and invariant representations of natural
signals. However, a precise study of these properties and how they affect
learning guarantees is still missing. In this paper, we consider deep
convolutional representations of signals; we study their invariance to
translations and to more general groups of transformations, their stability to
the action of diffeomorphisms, and their ability to preserve signal
information. This analysis is carried by introducing a multilayer kernel based
on convolutional kernel networks and by studying the geometry induced by the
kernel mapping. We then characterize the corresponding reproducing kernel
Hilbert space (RKHS), showing that it contains a large class of convolutional
neural networks with homogeneous activation functions. This analysis allows us
to separate data representation from learning, and to provide a canonical
measure of model complexity, the RKHS norm, which controls both stability and
generalization of any learned model. In addition to models in the constructed
RKHS, our stability analysis also applies to convolutional networks with
generic activations such as rectified linear units, and we discuss its
relationship with recent generalization bounds based on spectral norms.
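The restriction to homogeneous activation functions can be checked directly: ReLU is positively 1-homogeneous, which is the property the RKHS characterization relies on. A minimal numerical check (a sketch, not part of the paper):

```python
import numpy as np

def relu(x):
    # ReLU is positively 1-homogeneous: relu(a * x) = a * relu(x) for a >= 0,
    # because scaling by a non-negative factor never changes the sign of x.
    return np.maximum(x, 0.0)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
a = 2.7  # any non-negative scale factor

print(relu(a * x))
print(a * relu(x))
assert np.allclose(relu(a * x), a * relu(x))
```

The same identity fails for a non-homogeneous activation such as the sigmoid, which is one way to see why the RKHS construction singles out this class.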
Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation
We study the problem of choosing algorithm hyper-parameters in unsupervised
domain adaptation, i.e., with labeled data in a source domain and unlabeled
data in a target domain, drawn from a different input distribution. We follow
the strategy of computing several models using different hyper-parameters and
subsequently computing a linear aggregation of the models. While several
heuristics exist that follow this strategy, methods are still missing that rely
on thorough theories for bounding the target error. To this end, we propose a
method that extends weighted least squares to vector-valued functions, e.g.,
deep neural networks. We show that the target error of the proposed algorithm
is asymptotically not worse than twice the error of the unknown optimal
aggregation. We also perform a large scale empirical comparative study on
several datasets, including text, images, electroencephalogram, body sensor
signals and signals from mobile phones. Our method outperforms deep embedded
validation (DEV) and importance weighted validation (IWV) on all datasets,
setting a new state-of-the-art performance for solving parameter choice issues
in unsupervised domain adaptation with theoretical error guarantees. We further
study several competitive heuristics, all outperforming IWV and DEV on at least
five datasets. However, our method outperforms each heuristic on at least five
of seven datasets.

Comment: Oral talk (notable-top-5%) at International Conference On Learning Representations (ICLR), 202
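The linear-aggregation strategy can be sketched with ordinary least squares on labeled data. This is only an illustrative simplification: the paper's method extends weighted least squares to vector-valued functions and accounts for the source/target distribution shift, which the plain fit below does not.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch: linearly aggregate the predictions of several pretrained models.
n, d, k = 200, 3, 4                  # samples, output dimension, model count
y = rng.standard_normal((n, d))      # labeled targets
# Stacked predictions of k hypothetical models on the same inputs
# (here simulated as the truth plus independent noise):
preds = np.stack([y + 0.5 * rng.standard_normal((n, d)) for _ in range(k)])

# Solve  min_w || sum_i w_i * preds[i] - y ||^2  by ordinary least squares.
A = preds.reshape(k, -1).T           # (n*d, k) design matrix
w, *_ = np.linalg.lstsq(A, y.ravel(), rcond=None)

aggregate = np.tensordot(w, preds, axes=1)   # weighted combination, (n, d)
aggregate_mse = np.mean((aggregate - y) ** 2)
best_single_mse = min(np.mean((p - y) ** 2) for p in preds)
print("weights:", np.round(w, 3))
print("aggregate MSE:", aggregate_mse, "| best single MSE:", best_single_mse)
```

Since each individual model corresponds to a particular weight vector (a standard basis vector), the least-squares aggregate can never have a larger fitting error than the best single model, which is the basic appeal of aggregation over selection.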
Nonlinear Model Reduction of Stochastic Microdynamics
This thesis presents a nonlinear model reduction procedure for stochastic microdynamics models that possess mesoscale separation between fast and slow dynamics. Model reduction procedures typically reduce the dimension of deterministic dynamical systems through linear projection operators, which offer limited compression capabilities for nonlinear systems. Deep neural networks, on the other hand, provide a class of nonlinear transformations for regression that can approximate arbitrarily complex functions. The approach developed in this thesis attempts to carry out nonlinear model reduction of stochastic models using deep neural networks to approximate a transformation onto reduced coordinates, taken to be the parameters of the network. The stochasticity of the microdynamics is inherited by the reduced, mesoscale model by viewing the parameters as stochastic processes. Moderate time scale separation suggests that non-Gaussian behavior must be considered, in contrast with the convergence to Gaussian noise in the limit of infinite timescale separation provided by homogenization theory. This thesis considers several approaches for modeling the stochastic processes, concluding with an information-geometric strategy for estimating probability distribution functions. The procedure is applied to protein folding within molecular dynamics simulations, a widely used technique for modeling large collections of atoms that interact through nonlinear forces and are driven by a stochastic heat bath; protein folding occurs on a larger mesoscale with respect to the timescale of numerical integration.

Doctor of Philosophy
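The motivation for a nonlinear reduction map can be shown with a toy example that is not part of the thesis: data on a circle has one intrinsic degree of freedom, which a hand-picked nonlinear encoder recovers exactly, while the best one-dimensional linear projection (PCA) necessarily loses information.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy illustration (not the thesis's procedure): states on a circle in R^2
# have one intrinsic degree of freedom, but no 1-D linear projection can
# reconstruct them, whereas a nonlinear encoder can.
theta = rng.uniform(0, 2 * np.pi, 500)
X = np.column_stack([np.cos(theta), np.sin(theta)])   # microstates on S^1

# Nonlinear reduction: encode each point to its angle, decode back.
z = np.arctan2(X[:, 1], X[:, 0])                      # 1-D reduced coordinate
X_nl = np.column_stack([np.cos(z), np.sin(z)])
err_nonlinear = np.mean(np.sum((X - X_nl) ** 2, axis=1))

# Best rank-1 linear reduction: project onto the top principal direction.
Xc = X - X.mean(0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_lin = Xc @ Vt[:1].T @ Vt[:1] + X.mean(0)
err_linear = np.mean(np.sum((X - X_lin) ** 2, axis=1))

print("nonlinear reconstruction error:", err_nonlinear)  # essentially zero
print("linear reconstruction error:", err_linear)        # strictly positive
```

In the thesis the encoder is a trained deep network rather than a closed-form map, but the gap between the two errors above is exactly the compression advantage that motivates replacing linear projection operators with learned nonlinear transformations.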