
    Closing the gap: Exact maximum likelihood training of generative autoencoders using invertible layers

    In this work, we provide an exact-likelihood alternative to the variational training of generative autoencoders. We show that VAE-style autoencoders can be constructed using invertible layers, which offer a tractable exact likelihood without the need for any regularization terms. This is achieved while leaving complete freedom in the choice of encoder, decoder and prior architectures, making our approach a drop-in replacement for the training of existing VAEs and VAE-style models. We refer to the resulting models as Autoencoders within Flows (AEF), since the encoder, decoder and prior are defined as individual layers of an overall invertible architecture. We show that the approach results in strikingly higher performance than architecturally equivalent VAEs in terms of log-likelihood, sample quality and denoising performance. In a broad sense, the main ambition of this work is to close the gap between the normalizing flow and autoencoder literature under the common framework of invertibility and exact maximum likelihood.
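
    As a rough illustration of the general principle this abstract builds on (not the AEF architecture itself), the sketch below shows how stacking invertible affine-coupling layers yields an exact log-likelihood through the change-of-variable formula, with no variational bound or regularization term. The layer class, hidden width and standard-normal prior are illustrative assumptions, not details taken from the paper.

    # Minimal sketch: exact log-likelihood through invertible layers (illustrative, not the AEF code).
    import torch
    import torch.nn as nn

    class AffineCoupling(nn.Module):
        # Transform the second half of x conditioned on the first half;
        # the inverse exists in closed form, so the layer is invertible.
        def __init__(self, dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim // 2, hidden), nn.ReLU(),
                nn.Linear(hidden, dim),          # per-dimension log-scale and shift
            )

        def forward(self, x):
            x1, x2 = x.chunk(2, dim=-1)
            log_s, t = self.net(x1).chunk(2, dim=-1)
            y2 = x2 * torch.exp(log_s) + t
            # log|det Jacobian| of an affine coupling is the sum of the log-scales
            return torch.cat([x1, y2], dim=-1), log_s.sum(dim=-1)

    def exact_log_likelihood(x, layers, prior):
        # log p(x) = log p(z) + sum_k log|det J_k|  -- exact, no variational bound
        z, log_det = x, torch.zeros(x.shape[0])
        for layer in layers:
            z, ld = layer(z)
            log_det = log_det + ld
        return prior.log_prob(z).sum(dim=-1) + log_det

    dim = 4
    flow = [AffineCoupling(dim) for _ in range(3)]
    prior = torch.distributions.Normal(0.0, 1.0)     # simple base density (assumed)
    x = torch.randn(8, dim)
    print(exact_log_likelihood(x, flow, prior))      # one exact log-density per sample

    As the abstract states, the AEF construction defines the encoder, decoder and prior as individual layers of such an overall invertible architecture, so the same exact-likelihood objective applies directly.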

    Expanding the capabilities of normalizing flows in deep generative models and variational inference

    Deep generative models and variational Bayesian inference are two frameworks for reasoning about observed high-dimensional data, which may even be combined. A fundamental requirement of either approach is the parametrization of an expressive family of density models. Normalizing flows, sometimes also referred to as invertible neural networks, are one class of models providing this: they are formulated to be bijective and differentiable, and thus produce a tractable density model via the change-of-variable formula. Beyond deep generative modelling and variational inference, normalizing flows have shown promise as a plug-in density model in other settings such as approximate Bayesian computation and lossless compression. However, the bijectivity constraint can pose quite a restriction on the expressiveness of these approaches, and forces the learned distribution to have full support over the ambient space, which is not well aligned with the common assumption that low-dimensional manifold structure is embedded within high-dimensional data. In this thesis, we challenge this requirement of strict bijectivity over the space of interest and modify normalizing flow models accordingly. The first work focuses on the setting of variational inference, defining a normalizing flow based on discretized, time-inhomogeneous Hamiltonian dynamics over an extended position-momentum space. This enables the flow to be guided by the true posterior, unlike baseline flow-based models, and thus requires fewer parameters in the inference model to achieve comparable improvements in inference. The next chapter proposes a new deep generative model which relaxes the bijectivity requirement of normalizing flows by injecting learned noise at each layer, sacrificing easy evaluation of the density for expressiveness. We show, theoretically and empirically, the benefits of these models in density estimation over baseline flows. We then demonstrate in the next chapter that the benefits of this model class extend to the setting of variational inference, relying on auxiliary methods to train our models. Finally, the last paper in this thesis addresses the issue of full support in the ambient space and proposes injective flow models that directly embed low-dimensional structure into high dimensions. Our method is the first to optimize the injective change-of-variable term and produces promising results on out-of-distribution detection, which had previously eluded deep generative models. We conclude with some directions for future work and a broader perspective on the field.
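
    For reference, the change-of-variable identities this abstract leans on can be written in their standard forms (the notation below is generic, not taken from the thesis): a bijective, differentiable flow f gives a tractable exact density, while an injective embedding g of a d-dimensional latent into the D-dimensional ambient space replaces the Jacobian determinant with the Gram-determinant term that the "injective change-of-variable term" refers to.

    % bijective flow f: R^D -> R^D (exact density via change of variables)
    \log p_X(x) = \log p_Z\bigl(f(x)\bigr) + \log\left|\det \frac{\partial f(x)}{\partial x}\right|
    % injective map x = g(z), g: R^d -> R^D with d < D (density on the embedded manifold)
    \log p_X(x) = \log p_Z(z) - \tfrac{1}{2}\log\det\bigl(J_g(z)^{\top} J_g(z)\bigr)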