
    Globally Injective ReLU Networks

    Injectivity plays an important role in generative models, where it enables inference; in inverse problems and compressed sensing with generative priors, it is a precursor to well-posedness. We establish sharp characterizations of injectivity of fully-connected and convolutional ReLU layers and networks. First, through a layerwise analysis, we show that an expansivity factor of two is necessary and sufficient for injectivity by constructing appropriate weight matrices. We show that global injectivity with iid Gaussian matrices, a commonly used tractable model, requires larger expansivity, between 3.4 and 10.5. We also characterize the stability of inverting an injective network via worst-case Lipschitz constants of the inverse. We then use arguments from differential topology to study injectivity of deep networks and prove that any Lipschitz map can be approximated by an injective ReLU network. Finally, using an argument based on random projections, we show that an end-to-end, rather than layerwise, doubling of the dimension suffices for injectivity. Our results establish a theoretical basis for the study of nonlinear inverse and inference problems using neural networks.
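    As a quick illustration of the layerwise result, the sketch below builds a ReLU layer with expansivity factor two by stacking an invertible matrix with its negation, one sufficient construction for injectivity; the names B and W, the dimension n, and the random seed are illustrative choices, not taken from the paper.

```python
import numpy as np

# Expansivity-two sketch: stack an invertible matrix B with -B.
# Because relu(t) - relu(-t) = t for every scalar t, Bx (and hence x) can be
# recovered from relu(W x), so the layer x -> relu(W x) is globally injective.

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))   # generic Gaussian matrix, invertible almost surely
W = np.vstack([B, -B])            # 2n x n weight matrix: expansivity factor two

def relu(t):
    return np.maximum(t, 0.0)

def layer(x):
    return relu(W @ x)

def invert_layer(y):
    # y = [relu(Bx); relu(-Bx)]  =>  Bx = y_top - y_bottom
    y_top, y_bottom = y[:n], y[n:]
    return np.linalg.solve(B, y_top - y_bottom)

x = rng.standard_normal(n)
assert np.allclose(x, invert_layer(layer(x)))
```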

    Neural Injective Functions for Multisets, Measures and Graphs via a Finite Witness Theorem

    Injective multiset functions have a key role in the theoretical study of machine learning on multisets and graphs. Yet, there remains a gap between the provably injective multiset functions considered in theory, which typically rely on polynomial moments, and the multiset functions used in practice, which rely on neural moments, whose injectivity on multisets has not been studied to date. In this paper, we bridge this gap by showing that moments of neural networks do define injective multiset functions, provided that an analytic non-polynomial activation is used. The number of moments required by our theory is optimal essentially up to a multiplicative factor of two. To prove this result, we state and prove a finite witness theorem, which is of independent interest. As a corollary to our main theorem, we derive new approximation results for functions on multisets and measures, and new separation results for graph neural networks. We also provide two negative results: (1) moments of piecewise-linear neural networks cannot be injective multiset functions; and (2) even when moment-based multiset functions are injective, they can never be bi-Lipschitz.
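    A minimal sketch of the neural-moment idea, assuming the embedding averages an analytic non-polynomial activation (tanh here) over the multiset elements; the matrices W and b, the dimensions, and the number of moments are arbitrary illustrative values, and nothing below reproduces the paper's constructions or proofs.

```python
import numpy as np

# Neural moments of a multiset {x_1, ..., x_m} in R^d: average sigma(W x_i + b)
# over the elements. The map is permutation invariant by construction; the
# paper's result is that with an analytic non-polynomial sigma and enough
# moments, such maps can be injective on multisets.

rng = np.random.default_rng(1)
d, num_moments = 3, 16                    # input dim, number of neural moments

W = rng.standard_normal((num_moments, d))
b = rng.standard_normal(num_moments)

def neural_moments(X):
    """X: array of shape (m, d) holding the multiset elements."""
    return np.tanh(X @ W.T + b).mean(axis=0)

X = rng.standard_normal((5, d))
perm = rng.permutation(len(X))
assert np.allclose(neural_moments(X), neural_moments(X[perm]))  # order-free
```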

    Trumpets: Injective Flows for Inference and Inverse Problems

    We propose injective generative models called Trumpets that generalize invertible normalizing flows. The proposed generators progressively increase dimension from a low-dimensional latent space. We demonstrate that Trumpets can be trained orders of magnitude faster than standard flows while yielding samples of comparable or better quality. They retain many of the advantages of standard flows, such as training based on maximum likelihood and a fast, exact inverse of the generator. Since Trumpets are injective and have fast inverses, they can be effectively used for downstream Bayesian inference. To wit, we use Trumpet priors for maximum a posteriori estimation in the context of image reconstruction from compressive measurements, outperforming competitive baselines in terms of reconstruction quality and speed. We then propose an efficient method for posterior characterization and uncertainty quantification with Trumpets by taking advantage of the low-dimensional latent space.
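    A toy sketch of the dimension-raising idea, assuming only that an injective linear expansion admits a fast, exact left inverse via the pseudoinverse; it is not the Trumpet architecture, which interleaves such expansions with invertible flow blocks, and all shapes and names are illustrative.

```python
import numpy as np

# An injective linear "expansion" from R^k to R^n (n > k): a matrix with full
# column rank is injective, and the Moore-Penrose pseudoinverse gives an exact
# left inverse on its range. This stands in for the generator's expansion steps.

rng = np.random.default_rng(2)
k, n = 2, 8                               # latent dim, data dim

W = rng.standard_normal((n, k))           # full column rank almost surely
W_pinv = np.linalg.pinv(W)                # fast, exact left inverse on range(W)

def generator(z):
    return W @ z                          # stand-in for expansion + flow blocks

def inverse(x):
    return W_pinv @ x                     # exact for x in the generator's range

z = rng.standard_normal(k)
assert np.allclose(z, inverse(generator(z)))
```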

    Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction

    The paper uses a frame-theoretic setting to study the injectivity of a ReLU-layer on the closed ball of $\mathbb{R}^n$ and its non-negative part. In particular, the interplay between the radius of the ball and the bias vector is emphasized. Together with a perspective from convex geometry, this leads to a computationally feasible method of verifying the injectivity of a ReLU-layer under reasonable restrictions in terms of an upper bound on the bias vector. Explicit reconstruction formulas are provided, inspired by the duality concept from frame theory. All this makes it possible to quantify the invertibility of a ReLU-layer and yields a concrete reconstruction algorithm for any input vector on the ball.
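    A simplified sketch of reconstruction from a ReLU-layer output, assuming enough units are active for their rows to span the input space; it uses plain least squares as a stand-in for the paper's frame-duality formulas, and all shapes and values (W, b, the dimensions) are illustrative.

```python
import numpy as np

# Where y_i = relu(w_i . x + b_i) > 0 we know w_i . x = y_i - b_i exactly, so if
# the active rows of W span R^n the input on the ball can be recovered by
# solving that linear system (here via least squares).

rng = np.random.default_rng(3)
n, m = 3, 20                              # input dim, number of ReLU units

W = rng.standard_normal((m, n))
b = 0.1 * rng.standard_normal(m)          # small bias (injectivity hinges on a bias bound)

def relu_layer(x):
    return np.maximum(W @ x + b, 0.0)

def reconstruct(y):
    active = y > 0                        # units whose pre-activation is known
    x_hat, *_ = np.linalg.lstsq(W[active], y[active] - b[active], rcond=None)
    return x_hat

x = rng.standard_normal(n)
x = x / np.linalg.norm(x)                 # stay on the unit ball
print("reconstruction error:", np.linalg.norm(x - reconstruct(relu_layer(x))))
```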

    An algebraic theory to discriminate qualia in the brain

    The mind-brain problem is to bridge the relation between higher-level mental events and lower-level neural events. To address it, several mathematical models have been proposed to explain how the brain can represent the discriminative structure of qualia, but they remain unresolved for lack of validation methods. To understand the mechanism of qualia discrimination, we need to ask, following the constructive approach, how the brain autonomously develops such a mathematical structure. Here we show that a brain model that learns to satisfy algebraic independence between neural networks separates the metric spaces corresponding to qualia types. We formulate this algebraic independence so as to link it to the other-qualia-type invariant transformation, a familiar formulation of the permanence of perception. The learning of algebraic independence proposed here explains downward causation, i.e. the macro-level relationship has causal power over its components, because the algebra is a macro-level relationship irreducible to laws of individual neurons, and a self-evaluation of the algebra is used to control the neurons. Downward causation is required to explain the causal role of mental events on neural events, suggesting that learning algebraic structure between neural networks can contribute to the further development of a mathematical theory of consciousness.