Globally Injective ReLU Networks
Publisher Copyright: © 2022 Michael Puthawala, Konik Kothari, Matti Lassas, Ivan Dokmanic, Maarten de Hoop.
Injectivity plays an important role in generative models, where it enables inference; in inverse problems and compressed sensing with generative priors it is a precursor to well-posedness. We establish sharp characterizations of injectivity of fully-connected and convolutional ReLU layers and networks. First, through a layerwise analysis, we show that an expansivity factor of two is necessary and sufficient for injectivity by constructing appropriate weight matrices. We show that global injectivity with i.i.d. Gaussian matrices, a commonly used tractable model, requires larger expansivity, between 3.4 and 10.5. We also characterize the stability of inverting an injective network via worst-case Lipschitz constants of the inverse. We then use arguments from differential topology to study injectivity of deep networks and prove that any Lipschitz map can be approximated by an injective ReLU network. Finally, using an argument based on random projections, we show that an end-to-end, rather than layerwise, doubling of the dimension suffices for injectivity. Our results establish a theoretical basis for the study of nonlinear inverse and inference problems using neural networks.
Peer reviewed
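The layerwise expansivity-of-two construction from this abstract can be illustrated with a toy sketch (the matrix sizes and random weights here are illustrative assumptions, not the paper's actual construction): stacking an invertible matrix B on top of -B yields a ReLU layer that is injective, because ReLU(Bx) - ReLU(-Bx) = Bx recovers the pre-activation exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))  # invertible with probability one
W = np.vstack([B, -B])           # expansivity factor exactly two

def layer(x):
    return np.maximum(W @ x, 0.0)

def invert(y):
    # ReLU(Bx) - ReLU(-Bx) = Bx componentwise, so Bx is recoverable
    Bx = y[:n] - y[n:]
    return np.linalg.solve(B, Bx)

x = rng.standard_normal(n)
x_rec = invert(layer(x))
```

The same identity fails for a generic Gaussian weight matrix of width 2n, which is consistent with the abstract's larger expansivity requirement for i.i.d. Gaussian layers.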
Neural Injective Functions for Multisets, Measures and Graphs via a Finite Witness Theorem
Injective multiset functions have a key role in the theoretical study of
machine learning on multisets and graphs. Yet, there remains a gap between the
provably injective multiset functions considered in theory, which typically
rely on polynomial moments, and the multiset functions used in practice, which
rely on neural moments, whose injectivity on
multisets has not been studied to date.
In this paper, we bridge this gap by showing that moments of neural networks
do define injective multiset functions, provided that an analytic
non-polynomial activation is used. The number of moments required by our theory
is essentially optimal, up to a multiplicative factor of two. To prove this
result, we state and prove a finite witness theorem, which is of
independent interest.
As a corollary to our main theorem, we derive new approximation results for
functions on multisets and measures, and new separation results for graph
neural networks. We also provide two negative results: (1) moments of
piecewise-linear neural networks cannot be injective multiset functions; and
(2) even when moment-based multiset functions are injective, they can never be
bi-Lipschitz.
Comment: NeurIPS 2023 camera-ready
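The kind of neural moment map studied in this abstract can be sketched as follows (the dimensions, random weights, and the choice of tanh are illustrative assumptions; tanh is analytic and non-polynomial, as the theorem requires). Summing an activation applied to each element produces a permutation-invariant function of the multiset:

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 2, 8  # element dimension and number of moments (illustrative sizes)
A = rng.standard_normal((m, d))
b = rng.standard_normal(m)

def multiset_moments(X):
    # Sum of tanh(A x_i + b) over the rows x_i of X: a neural-moment embedding.
    # The sum discards ordering, so the map is well defined on multisets.
    return np.tanh(X @ A.T + b).sum(axis=0)

X = rng.standard_normal((5, d))               # a multiset of 5 elements
e1 = multiset_moments(X)
e2 = multiset_moments(X[rng.permutation(5)])  # same multiset, reordered
```

The abstract's negative result says that replacing tanh with a piecewise-linear activation such as ReLU would destroy injectivity of such moment maps.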
Trumpets: Injective Flows for Inference and Inverse Problems
We propose injective generative models called Trumpets that generalize
invertible normalizing flows. The proposed generators progressively increase
dimension from a low-dimensional latent space. We demonstrate that Trumpets can
be trained orders of magnitude faster than standard flows while yielding
samples of comparable or better quality. They retain many of the advantages of
the standard flows such as training based on maximum likelihood and a fast,
exact inverse of the generator. Since Trumpets are injective and have fast
inverses, they can be effectively used for downstream Bayesian inference. To
wit, we use Trumpet priors for maximum a posteriori estimation in the context
of image reconstruction from compressive measurements, outperforming
competitive baselines in terms of reconstruction quality and speed. We then
propose an efficient method for posterior characterization and uncertainty
quantification with Trumpets by taking advantage of the low-dimensional latent
space.
Comment: 16 pages
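The injective dimension-raising idea behind such generators can be caricatured with a purely linear toy model (an illustrative assumption; Trumpets themselves compose learned nonlinear injective layers): any full-column-rank map from a low-dimensional latent space is injective and admits an exact left inverse on its range, which is what makes fast inversion and downstream inference possible.

```python
import numpy as np

rng = np.random.default_rng(2)
k, n = 2, 6  # latent dimension < data dimension: an injective expansion

W = rng.standard_normal((n, k))  # full column rank with probability one

def expand(z):
    return W @ z  # toy stand-in for one dimension-raising generator step

def left_inverse(x):
    # exact left inverse on the range of W via the pseudo-inverse
    return np.linalg.pinv(W) @ x

z = rng.standard_normal(k)
z_rec = left_inverse(expand(z))
```

Because the inverse is exact on the generator's range, a maximum a posteriori search can be carried out over the low-dimensional latent variable z instead of the ambient space.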
Convex Geometry of ReLU-layers, Injectivity on the Ball and Local Reconstruction
The paper uses a frame-theoretic setting to study the injectivity of a
ReLU-layer on the closed ball of $\mathbb{R}^n$ and its non-negative part. In
particular, the interplay between the radius of the ball and the bias vector is
emphasized. Together with a perspective from convex geometry, this leads to a
computationally feasible method of verifying the injectivity of a ReLU-layer
under reasonable restrictions in terms of an upper bound of the bias vector.
Explicit reconstruction formulas are provided, inspired by the duality concept
from frame theory. All this gives rise to the possibility of quantifying the
invertibility of a ReLU-layer and a concrete reconstruction algorithm for any
input vector on the ball.
Comment: 10 pages main paper + 2 pages appendix, 4 figures, 2 algorithms; conference paper
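The reconstruction idea can be sketched with an active-set least-squares step (the weights, bias, sizes, and the least-squares solve are illustrative assumptions standing in for the paper's frame-theoretic duality formulas): for an input on the ball, the strictly positive coordinates of y = ReLU(Wx + b) each pin down one exact linear equation in x.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 30                      # illustrative sizes; many rows so enough stay active
W = rng.standard_normal((m, n))
b = -0.05 * np.ones(m)            # small bias, echoing the upper-bound restriction

def relu_layer(x):
    return np.maximum(W @ x + b, 0.0)

def reconstruct(y):
    I = y > 0                     # active rows satisfy W[I] @ x = y[I] - b[I] exactly
    return np.linalg.lstsq(W[I], y[I] - b[I], rcond=None)[0]

x = rng.standard_normal(n)
x /= np.linalg.norm(x)            # a point on the closed unit ball
x_rec = reconstruct(relu_layer(x))
```

Recovery is exact whenever at least n active rows span $\mathbb{R}^n$, which is the kind of condition the paper's injectivity verification checks on the ball.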
An algebraic theory to discriminate qualia in the brain
The mind-brain problem is to bridge relations between higher mental events
and lower neural events. To address this, some mathematical models have been
proposed to explain how the brain can represent the discriminative structure of
qualia, but they remain unresolved due to a lack of validation methods. To
understand the qualia discrimination mechanism, we need to ask how the brain
autonomously develops such a mathematical structure using the constructive
approach. Here we show that a brain model that learns to satisfy an algebraic
independence between neural networks separates metric spaces corresponding to
qualia types. We formulate the algebraic independence to link it to the
other-qualia-type invariant transformation, a familiar formulation of the
permanence of perception. The learning of algebraic independence proposed here
explains downward causation, i.e., that the macro-level relationship has causal
power over its components: the algebra is a macro-level relationship that
is irreducible to a law of neurons, and a self-evaluation of the algebra is used to
control neurons. The downward causation is required to explain a causal role of
mental events on neural events, suggesting that learning algebraic structure
between neural networks can contribute to the further development of a
mathematical theory of consciousness.