Advances in Deep Generative Modeling With Applications to Image Generation and Neuroscience

Author: Loaiza Ganem, Gabriel
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2019

Deep generative modeling is an increasingly popular area of machine learning that takes advantage of recent developments in neural networks to estimate the distribution of observed data. In this dissertation we introduce three advances in this area. The first, Maximum Entropy Flow Networks, enables maximum entropy modeling by combining normalizing flows with the augmented Lagrangian optimization method. The second is the continuous Bernoulli, a new [0,1]-supported distribution, which we introduce to fix the pervasive error in variational autoencoders of using a Bernoulli likelihood for non-binary data. The last, Deep Random Splines, is a novel distribution over functions, where samples are obtained by sampling Gaussian noise and transforming it through a neural network to obtain the parameters of a spline. We apply these to model texture images, natural images, and neural population data, respectively, and observe significant improvements over current state-of-the-art alternatives.
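
To make the continuous Bernoulli mentioned in the abstract concrete, here is a minimal sketch of its log-density. It assumes the density has the form p(x | lam) = C(lam) * lam^x * (1 - lam)^(1 - x) on [0, 1], where the normalizing constant C(lam) follows from integrating the unnormalized term in closed form. The function names and the `eps` threshold are illustrative and are not taken from the dissertation.

```python
import math


def cb_log_norm(lam, eps=1e-6):
    """Log normalizing constant of the continuous Bernoulli (assumed form).

    C(lam) = 2 * atanh(1 - 2*lam) / (1 - 2*lam) for lam != 0.5, and 2 at lam = 0.5.
    """
    if abs(lam - 0.5) < eps:
        # The general expression is 0/0 at lam = 0.5; its limit is 2.
        return math.log(2.0)
    return math.log(2.0 * math.atanh(1.0 - 2.0 * lam) / (1.0 - 2.0 * lam))


def cb_log_pdf(x, lam):
    """Log-density at x in [0, 1] for a parameter lam in (0, 1)."""
    return cb_log_norm(lam) + x * math.log(lam) + (1.0 - x) * math.log(1.0 - lam)


def integrate_pdf(lam, n=20000):
    """Midpoint-rule check that the density integrates to about 1."""
    h = 1.0 / n
    return sum(math.exp(cb_log_pdf((i + 0.5) * h, lam)) for i in range(n)) * h


if __name__ == "__main__":
    for lam in (0.1, 0.5, 0.8):
        print(lam, integrate_pdf(lam))
```

Without the `cb_log_norm` term, the expression reduces to the usual Bernoulli cross-entropy, which is not a normalized density over [0, 1]; that missing term is the error the abstract refers to.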

Columbia University Academic Commons

Invertible Gaussian Reparameterization: Revisiting the Gumbel-Softmax

Authors: Cunningham, John P.; Loaiza-Ganem, Gabriel; Potapczynski, Andres
Publication date: 11/06/2020

The Gumbel-Softmax is a continuous distribution over the simplex that is often used as a relaxation of discrete distributions. Because it can be readily interpreted and easily reparameterized, it enjoys widespread use. We propose a conceptually simpler and more flexible alternative family of reparameterizable distributions where Gaussian noise is transformed into a one-hot approximation through an invertible function. This invertible function is composed of a modified softmax and can incorporate diverse transformations that serve different specific purposes. For example, the stick-breaking procedure allows us to extend the reparameterization trick to distributions with countably infinite support, and normalizing flows let us increase the flexibility of the distribution. Our construction enjoys theoretical advantages over the Gumbel-Softmax, such as a closed-form KL divergence, and significantly outperforms it in a variety of experiments.
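
The two relaxations contrasted in the abstract can be sketched side by side. The first function is the standard Gumbel-Softmax sampler. The second is only an assumed illustration of the Gaussian alternative: it pushes K-1 Gaussian variables through a softmax with an extra constant `delta` in the denominator, so that the output is a point in the interior of the K-simplex. The exact modified softmax used by the authors is not given in this listing, so the form, the names, and the default `delta` are assumptions.

```python
import math
import random


def _uniform_open(rng):
    # Clamp away from 0 and 1 so both logarithms below stay finite.
    return min(max(rng.random(), 1e-12), 1.0 - 1e-12)


def gumbel_softmax_sample(logits, tau, rng):
    """Standard Gumbel-Softmax: softmax((logits + Gumbel noise) / tau)."""
    noisy = [(l - math.log(-math.log(_uniform_open(rng)))) / tau for l in logits]
    m = max(noisy)
    e = [math.exp(v - m) for v in noisy]
    total = sum(e)
    return [v / total for v in e]


def gaussian_softmax_sample(mu, sigma, tau, rng, delta=1.0):
    """Assumed Gaussian relaxation: K-1 Gaussians mapped to a K-simplex point.

    y_k = exp(z_k / tau) / (sum_j exp(z_j / tau) + delta), last entry = 1 - sum(y).
    """
    z = [(m + s * rng.gauss(0.0, 1.0)) / tau for m, s in zip(mu, sigma)]
    shift = max(max(z), 0.0)  # subtract a common shift for numerical stability
    e = [math.exp(v - shift) for v in z]
    d = delta * math.exp(-shift)
    total = sum(e) + d
    return [v / total for v in e] + [d / total]


if __name__ == "__main__":
    rng = random.Random(0)
    print(gumbel_softmax_sample([1.0, 0.0, -1.0], tau=0.5, rng=rng))
    print(gaussian_softmax_sample([0.5, -0.5], [1.0, 1.0], tau=0.5, rng=rng))
```

In both samplers, lowering `tau` pushes the output toward a one-hot vector, and the noise enters through a differentiable transformation, which is what makes the sample reparameterizable.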

arXiv.org e-Print Archive

Relating Regularization and Generalization through the Intrinsic Dimension of Activations

Authors: Brown, Bradley C. A.; Caterini, Anthony L.; Juravsky, Jordan; Loaiza-Ganem, Gabriel
Publication date: 23/11/2022

Given a pair of models with similar training set performance, it is natural to assume that the model that possesses simpler internal representations would exhibit better generalization. In this work, we provide empirical evidence for this intuition through an analysis of the intrinsic dimension (ID) of model activations, which can be thought of as the minimal number of factors of variation in the model's representation of the data. First, we show that common regularization techniques uniformly decrease the last-layer ID (LLID) of validation set activations for image classification models and show how this strongly affects generalization performance. We also investigate how excessive regularization decreases a model's ability to extract features from data in earlier layers, leading to a negative effect on validation accuracy even while LLID continues to decrease and training accuracy remains near-perfect. Finally, we examine the LLID over the course of training of models that exhibit grokking. We observe that well after training accuracy saturates, when models "grok" and validation accuracy suddenly improves from random to perfect, there is a co-occurring sudden drop in LLID, thus providing more insight into the dynamics of sudden generalization.

Comment: NeurIPS 2022 OPT and HITY workshop
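
As an illustration of what estimating an intrinsic dimension can look like, the sketch below uses a nearest-neighbour maximum-likelihood estimator in the style of Levina and Bickel. This is a swapped-in, generic technique: the listing does not say which estimator the authors use, and the function names, the choice `k=10`, and the synthetic data are assumptions. In the paper's setting, `points` would be last-layer activations rather than synthetic samples.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Average of local MLE dimension estimates from k nearest-neighbour distances."""
    estimates = []
    for i, p in enumerate(points):
        # Distances from p to its k nearest neighbours, in ascending order.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[:k]
        t_k = dists[-1]
        log_ratios = sum(math.log(t_k / t_j) for t_j in dists[:-1])
        # (k - 2) normalisation, the bias-corrected variant of the estimator.
        estimates.append((k - 2) / log_ratios)
    return sum(estimates) / len(estimates)


def sample_plane_in_5d(n, rng):
    """Synthetic data: a 2-dimensional plane linearly embedded in 5 dimensions."""
    pts = []
    for _ in range(n):
        a, b = rng.random(), rng.random()
        pts.append((a, b, a + b, a - b, 2.0 * a))
    return pts


if __name__ == "__main__":
    rng = random.Random(0)
    print(mle_intrinsic_dimension(sample_plane_in_5d(300, rng)))
```

Although each point has five coordinates, only two factors vary, so the estimate should land near 2 rather than 5.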

arXiv.org e-Print Archive

CaloMan: Fast generation of calorimeter showers with density estimation on learned manifolds

Authors: Caterini, Anthony L.; Cresswell, Jesse C.; Letizia, Marco; Loaiza-Ganem, Gabriel; Reyes-Gonzalez, Humberto; Ross, Brendan Leigh
Publication date: 01/01/2022

Precision measurements and new physics searches at the Large Hadron Collider require efficient simulations of particle propagation and interactions within the detectors. The most computationally expensive simulations involve calorimeter showers. Advances in deep generative modelling, particularly in the realm of high-dimensional data, have opened the possibility of generating realistic calorimeter showers orders of magnitude more quickly than physics-based simulation. However, the high-dimensional representation of showers belies the relative simplicity and structure of the underlying physical laws. This phenomenon is yet another example of the manifold hypothesis from machine learning, which states that high-dimensional data is supported on low-dimensional manifolds. We thus propose modelling calorimeter showers first by learning their manifold structure, and then estimating the density of data across this manifold. Learning manifold structure reduces the dimensionality of the data, which enables fast training and generation when compared with competing methods.

Comment: Accepted to the Machine Learning and the Physical Sciences Workshop at NeurIPS 202

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Genova

The Union of Manifolds Hypothesis and its Implications for Deep Generative Modelling

Authors: Brown, Bradley C. A.; Caterini, Anthony L.; Cresswell, Jesse C.; Loaiza-Ganem, Gabriel; Ross, Brendan Leigh
Publication date: 06/07/2022

Deep learning has had tremendous success at learning low-dimensional representations of high-dimensional data. This success would be impossible if there were no hidden low-dimensional structure in data of interest; this existence is posited by the manifold hypothesis, which states that the data lies on an unknown manifold of low intrinsic dimension. In this paper, we argue that this hypothesis does not properly capture the low-dimensional structure typically present in data. Assuming the data lies on a single manifold implies intrinsic dimension is identical across the entire data space, and does not allow for subregions of this space to have a different number of factors of variation. To address this deficiency, we put forth the union of manifolds hypothesis, which accommodates the existence of non-constant intrinsic dimensions. We empirically verify this hypothesis on commonly used image datasets, finding that intrinsic dimension should indeed be allowed to vary. We also show that classes with higher intrinsic dimensions are harder to classify, and how this insight can be used to improve classification accuracy. We then turn our attention to the impact of this hypothesis in the context of deep generative models (DGMs). Most current DGMs struggle to model datasets with several connected components and/or varying intrinsic dimensions. To tackle these shortcomings, we propose clustered DGMs, where we first cluster the data and then train a DGM on each cluster. We show that clustered DGMs can model multiple connected components with different intrinsic dimensions, and empirically outperform their non-clustered counterparts without increasing computational requirements.
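
The cluster-then-model recipe described in the abstract can be sketched on toy data. In this sketch a one-dimensional k-means step stands in for the clustering and a fitted Gaussian stands in for each per-cluster DGM. Both are deliberate simplifications chosen for brevity, and all names are hypothetical rather than taken from the paper.

```python
import random
import statistics


def kmeans_1d(xs, k, iters=20):
    """Plain 1-D k-means with quantile initialisation; returns the clusters."""
    s = sorted(xs)
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[nearest].append(x)
        centers = [statistics.fmean(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups


def fit_clustered_model(xs, k):
    """Fit one stand-in model (a Gaussian) per cluster, weighted by cluster size."""
    groups = [g for g in kmeans_1d(xs, k) if len(g) >= 2]
    n = sum(len(g) for g in groups)
    return [(len(g) / n, statistics.fmean(g), statistics.stdev(g)) for g in groups]


def sample_clustered_model(model, rng):
    """Pick a cluster in proportion to its weight, then sample from its model."""
    weights = [w for w, _, _ in model]
    _, mu, sd = rng.choices(model, weights=weights)[0]
    return rng.gauss(mu, sd)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(-5.0, 0.5) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
    model = fit_clustered_model(data, k=2)
    print(model)
    print([sample_clustered_model(model, rng) for _ in range(5)])
```

Because each cluster gets its own model, the two well-separated components are captured separately instead of being bridged by a single model, which mirrors the connected-components argument in the abstract.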

arXiv.org e-Print Archive