10 research outputs found

    Invariance and Stability of Deep Convolutional Representations

    Get PDF
    In this paper, we study deep signal representations that are near-invariant to groups of transformations and stable to the action of diffeomorphisms without losing signal information. This is achieved by generalizing the multilayer kernel introduced in the context of convolutional kernel networks and by studying the geometry of the corresponding reproducing kernel Hilbert space. We show that the signal representation is stable, and that models from this functional space, such as a large class of convolutional neural networks, may enjoy the same stability.
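    As a rough guide to what such a stability statement looks like, deformation-stability bounds in this line of work (scattering transforms, convolutional kernel networks) typically take the generic form below; the constants C_1 and C_2 depend on the architecture (patch sizes, pooling scales), and this sketch shows only the shape of the result, not the paper's exact theorem.

```latex
% Generic shape of a deformation-stability bound for a representation \Phi,
% where (L_\tau x)(u) = x(u - \tau(u)) is the action of a C^1 diffeomorphism.
\[
  \| \Phi(L_\tau x) - \Phi(x) \|
  \;\le\;
  \bigl( C_1 \,\|\nabla \tau\|_\infty + C_2 \,\|\tau\|_\infty \bigr)\, \|x\| .
\]
% The \|\nabla\tau\|_\infty term quantifies stability to deformations, while
% the \|\tau\|_\infty term shrinks with the final pooling scale, which is what
% gives near-invariance to translations.
```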

    not-MIWAE: Deep Generative Modelling with Missing not at Random Data

    Full text link
    When a missing process depends on the missing values themselves, it needs to be explicitly modelled and taken into account while doing likelihood-based inference. We present an approach for building and fitting deep latent variable models (DLVMs) in cases where the missing process is dependent on the missing data. Specifically, a deep neural network enables us to flexibly model the conditional distribution of the missingness pattern given the data. This allows for incorporating prior information about the type of missingness (e.g. self-censoring) into the model. Our inference technique, based on importance-weighted variational inference, involves maximising a lower bound of the joint likelihood. Stochastic gradients of the bound are obtained by using the reparameterisation trick both in latent space and data space. We show on various kinds of data sets and missingness patterns that explicitly modelling the missing process can be invaluable. Comment: Camera-ready version for ICLR 202
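    As a concrete, hedged illustration of the recipe described above, the sketch below estimates an importance-weighted lower bound on the joint likelihood of the observed data and the missingness mask, with reparameterised sampling in both latent and data space. The `encoder`, `decoder` and `miss_model` modules, the Gaussian/Bernoulli choices and the shapes are illustrative assumptions, not the authors' released implementation.

```python
import math
import torch

# Sketch of an importance-weighted bound for a DLVM with a missing-not-at-random
# model. `encoder`, `decoder` and `miss_model` are hypothetical torch modules
# returning distribution parameters; the distributional choices are assumptions.
def notmiwae_bound(x, s, encoder, decoder, miss_model, K=20):
    """x: (B, D) data with missing entries zeroed, s: (B, D) mask, 1 = observed."""
    mu, log_sig = encoder(x * s)                          # q(z | x_obs)
    q_z = torch.distributions.Normal(mu, log_sig.exp())
    z = q_z.rsample((K,))                                 # (K, B, d), reparameterised

    mu_x, log_sig_x = decoder(z)                          # p(x | z)
    p_x = torch.distributions.Normal(mu_x, log_sig_x.exp())

    # Impute the missing entries with samples from p(x | z), so the missingness
    # model sees complete inputs and gradients also flow through data space.
    x_full = s * x + (1 - s) * p_x.rsample()
    p_s = torch.distributions.Bernoulli(logits=miss_model(x_full))   # p(s | x)

    prior = torch.distributions.Normal(torch.zeros_like(z), torch.ones_like(z))
    log_w = ((p_x.log_prob(x) * s).sum(-1)                # observed entries only
             + p_s.log_prob(s).sum(-1)                    # missingness model term
             + prior.log_prob(z).sum(-1)
             - q_z.log_prob(z).sum(-1))                   # shape (K, B)

    # log (1/K) sum_k w_k, averaged over the batch; this is the bound to maximise.
    return (torch.logsumexp(log_w, dim=0) - math.log(K)).mean()
```

    Maximising this bound jointly over the encoder, decoder and missingness network corresponds to fitting the data model and the missing process together.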

    Group Invariance, Stability to Deformations, and Complexity of Deep Convolutional Representations

    Get PDF
    The success of deep convolutional architectures is often attributed in part to their ability to learn multiscale and invariant representations of natural signals. However, a precise study of these properties and how they affect learning guarantees is still missing. In this paper, we consider deep convolutional representations of signals; we study their invariance to translations and to more general groups of transformations, their stability to the action of diffeomorphisms, and their ability to preserve signal information. This analysis is carried out by introducing a multilayer kernel based on convolutional kernel networks and by studying the geometry induced by the kernel mapping. We then characterize the corresponding reproducing kernel Hilbert space (RKHS), showing that it contains a large class of convolutional neural networks with homogeneous activation functions. This analysis allows us to separate data representation from learning, and to provide a canonical measure of model complexity, the RKHS norm, which controls both stability and generalization of any learned model. In addition to models in the constructed RKHS, our stability analysis also applies to convolutional networks with generic activations such as rectified linear units, and we discuss its relationship with recent generalization bounds based on spectral norms.
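    One step of this argument is elementary enough to state here: by the reproducing property and Cauchy-Schwarz, the stability of any predictor f in the RKHS is inherited from the stability of the representation, scaled by the RKHS norm, which is the sense in which that norm controls stability (the generalization side comes from standard margin bounds involving the same norm).

```latex
% For f in the RKHS \mathcal{H} with f(x) = \langle f, \Phi(x) \rangle_{\mathcal H}
% (reproducing property), Cauchy-Schwarz gives, for any signal x and
% diffeomorphism action L_\tau:
\[
  | f(L_\tau x) - f(x) |
  = | \langle f, \Phi(L_\tau x) - \Phi(x) \rangle_{\mathcal H} |
  \le \| f \|_{\mathcal H}\, \| \Phi(L_\tau x) - \Phi(x) \|_{\mathcal H} .
\]
```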

    Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation

    Full text link
    We study the problem of choosing algorithm hyper-parameters in unsupervised domain adaptation, i.e., with labeled data in a source domain and unlabeled data in a target domain, drawn from a different input distribution. We follow the strategy of computing several models using different hyper-parameters and subsequently computing a linear aggregation of the models. While several heuristics exist that follow this strategy, methods that rely on thorough theories for bounding the target error are still missing. To this end, we propose a method that extends weighted least squares to vector-valued functions, e.g., deep neural networks. We show that the target error of the proposed algorithm is asymptotically not worse than twice the error of the unknown optimal aggregation. We also perform a large scale empirical comparative study on several datasets, including text, images, electroencephalogram, body sensor signals and signals from mobile phones. Our method outperforms deep embedded validation (DEV) and importance weighted validation (IWV) on all datasets, setting a new state-of-the-art performance for solving parameter choice issues in unsupervised domain adaptation with theoretical error guarantees. We further study several competitive heuristics, all outperforming IWV and DEV on at least five datasets. However, our method outperforms each heuristic on at least five of seven datasets. Comment: Oral talk (notable-top-5%) at the International Conference on Learning Representations (ICLR), 202
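    The aggregation step itself can be pictured with a small sketch: given per-sample importance weights relating the source and target input distributions (assumed available here, e.g. from a density-ratio estimator), coefficients of the linear aggregation are obtained by weighted least squares over the models' vector-valued outputs. This is only the generic strategy named in the abstract, not the paper's exact estimator or its theoretical construction.

```python
import numpy as np

# Sketch of linear aggregation of pre-trained models by weighted least squares,
# assuming per-sample importance weights w(x) relating the source and target
# input distributions are available. Names and shapes are illustrative.
def aggregate(preds_src, y_src, weights, preds_tgt):
    """preds_src: (n, K, C) outputs of K models on labeled source data,
    y_src: (n, C) vector-valued targets, weights: (n,), preds_tgt: (m, K, C)."""
    n, K, C = preds_src.shape
    # One row per (sample, output component); one column per model.
    A = preds_src.transpose(0, 2, 1).reshape(n * C, K)
    b = y_src.reshape(n * C)
    w = np.repeat(weights, C)
    # Weighted least squares: minimise sum_i w_i * (A_i c - b_i)^2.
    c, *_ = np.linalg.lstsq(A * np.sqrt(w)[:, None], b * np.sqrt(w), rcond=None)
    # Aggregate target predictions with the learned coefficients.
    m = preds_tgt.shape[0]
    return (preds_tgt.transpose(0, 2, 1).reshape(m * C, K) @ c).reshape(m, C)
```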

    Nonlinear Model Reduction of Stochastic Microdynamics

    Get PDF
    This thesis presents a nonlinear model reduction procedure for stochastic microdynamics models that possess mesoscale separation between fast and slow dynamics. Model reduction procedures typically reduce the dimension of deterministic dynamical systems through linear projection operators, which offer limited compression capabilities for nonlinear systems. On the other hand, deep neural networks provide a class of nonlinear transformations for regression that can approximate arbitrarily complex functions. The approach developed in this thesis attempts to carry out nonlinear model reduction of stochastic models using deep neural networks to approximate a transformation onto reduced coordinates taken to be the parameters of the network. The stochasticity of the microdynamics is inherited by the reduced, mesoscale model by viewing the parameters as stochastic processes. Moderate time scale separation suggests that non-Gaussian behavior must be considered, in contrast with the convergence to Gaussian noise in the limit of infinite timescale separation provided by homogenization theory. This thesis considers several approaches for modeling the stochastic processes, concluding with an information-geometric strategy for estimating probability distribution functions. The procedure is applied to protein folding within molecular dynamics simulations, a widely used technique to model large collections of atoms which interact through nonlinear forces and are driven by a stochastic heat bath. Protein folding occurs on a larger, mesoscale timescale relative to the timescale of numerical integration.
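    Read literally, "reduced coordinates taken to be the parameters of the network" suggests a pipeline along the following lines: refit a small network to successive windows of microstate snapshots, warm-starting from the previous window so the parameter path stays coherent, and record the flattened parameter vector as the mesoscale trajectory, whose increments are then modelled as a non-Gaussian stochastic process. The architecture, window length and reconstruction loss below are purely illustrative assumptions, not the construction used in the thesis.

```python
import numpy as np
import torch
import torch.nn as nn

# Illustrative sketch only: a small autoencoder-style network is refit to each
# window of microstate snapshots, warm-started from the previous window, and
# its flattened parameter vector is recorded as the reduced (mesoscale)
# coordinate.
def reduced_trajectory(windows, hidden=8, steps=200, lr=1e-2):
    """windows: iterable of (T, d) arrays of microstate configurations."""
    net, thetas = None, []
    for snap in windows:
        x = torch.as_tensor(np.asarray(snap), dtype=torch.float32)
        if net is None:
            d = x.shape[1]
            net = nn.Sequential(nn.Linear(d, hidden), nn.Tanh(), nn.Linear(hidden, d))
        opt = torch.optim.Adam(net.parameters(), lr=lr)
        for _ in range(steps):                 # fit this window's snapshots
            opt.zero_grad()
            loss = ((net(x) - x) ** 2).mean()
            loss.backward()
            opt.step()
        thetas.append(torch.cat([p.detach().flatten() for p in net.parameters()]).numpy())
    # Increments of theta_t are then modelled as a (generally non-Gaussian)
    # stochastic process, e.g. by estimating their distribution from data.
    return np.stack(thetas)
```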
