Quasi-Equivalence of Width and Depth of Neural Networks
While classic studies proved that wide networks allow universal
approximation, recent research and successes of deep learning demonstrate the
power of network depth. Based on symmetry considerations, we investigate
whether the design of artificial neural networks should have a directional
preference, and how the width and depth of a network interact. We address
this fundamental question by establishing a
quasi-equivalence between the width and depth of ReLU networks. Specifically,
we formulate a transformation from an arbitrary ReLU network to a wide network
and a deep network for either regression or classification so that essentially
the same capability as the original network can be implemented. That is, a deep
regression/classification ReLU network has a wide equivalent, and vice versa,
subject to an arbitrarily small error. Interestingly, the quasi-equivalence
between wide and deep classification ReLU networks is a data-driven version of
De Morgan's law.
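As a rough illustration of the width-depth trade-off (not the paper's constructive transformation), one can fit the same one-dimensional target with a wide shallow ReLU network and a deep narrow one and compare their errors; the architectures, target function, and training settings below are arbitrary choices.

```python
# Illustrative sketch: a wide-shallow vs. a deep-narrow ReLU network fitted to
# the same 1-D regression target. All sizes and hyperparameters are arbitrary.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 256).unsqueeze(1)
y = torch.sin(3.0 * x)  # arbitrary smooth target

def relu_mlp(widths):
    """Build an MLP with ReLU activations from a list of layer widths."""
    layers = []
    for w_in, w_out in zip(widths[:-1], widths[1:]):
        layers += [nn.Linear(w_in, w_out), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the final ReLU

wide = relu_mlp([1, 256, 1])             # one wide hidden layer
deep = relu_mlp([1, 8, 8, 8, 8, 8, 1])   # several narrow hidden layers

def fit(net, steps=2000, lr=1e-2):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()
    return loss.item()

print("wide MSE:", fit(wide))
print("deep MSE:", fit(deep))
```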
Interpretable Polynomial Neural Ordinary Differential Equations
Neural networks have the ability to serve as universal function
approximators, but they are not interpretable and do not generalize well outside
of their training region. Both of these issues are problematic when trying to
apply standard neural ordinary differential equations (neural ODEs) to
dynamical systems. We introduce the polynomial neural ODE, which is a deep
polynomial neural network inside of the neural ODE framework. We demonstrate
the capability of polynomial neural ODEs to predict outside of the training
region, as well as to perform direct symbolic regression without additional
tools such as SINDy.
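A minimal sketch of the idea, not the authors' implementation: the ODE right-hand side is a learnable polynomial in the state variables, integrated here with a hand-rolled fixed-step RK4 solver; in practice a differentiable ODE solver (e.g., torchdiffeq) would be used to fit the coefficients to trajectories. The dimension, degree, and step size below are illustrative.

```python
# Sketch of a polynomial right-hand side inside an ODE integration loop.
import itertools
import torch
import torch.nn as nn

class PolynomialRHS(nn.Module):
    """dy/dt = C * monomials(y), with monomials up to a given total degree."""
    def __init__(self, dim, degree=2):
        super().__init__()
        # all exponent tuples with total degree <= degree
        self.exponents = [e for e in itertools.product(range(degree + 1), repeat=dim)
                          if sum(e) <= degree]
        self.coeffs = nn.Parameter(0.01 * torch.randn(dim, len(self.exponents)))

    def forward(self, t, y):
        feats = torch.stack([torch.prod(y ** torch.tensor(e, dtype=y.dtype))
                             for e in self.exponents])
        return self.coeffs @ feats

def rk4_step(f, t, y, dt):
    """One classical Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, y + dt * k1 / 2)
    k3 = f(t + dt / 2, y + dt * k2 / 2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6

rhs = PolynomialRHS(dim=2, degree=2)
y = torch.tensor([1.0, 0.0])
for step in range(100):
    y = rk4_step(rhs, 0.0, y, dt=0.01)
```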
Quadratic neural networks for solving inverse problems
In this paper we investigate the solution of inverse problems with neural
network ansatz functions with generalized decision functions. The relevant
observation for this work is that such functions can approximate typical test
cases, such as the Shepp-Logan phantom, better than standard neural networks.
Moreover, we show that the convergence analysis of numerical methods for
solving inverse problems with shallow generalized neural network functions
leads to more intuitive convergence conditions than for deep affine linear
neural networks.
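For illustration, one simple choice of generalized (quadratic) decision function replaces the affine pre-activation w^T x + b by a full quadratic form x^T A x + w^T x + b. The layer below is a hedged sketch of that idea, not the specific ansatz used in the paper; all sizes are arbitrary.

```python
# Sketch of a "quadratic" neuron layer: each unit computes x^T A x + w^T x + b
# before the activation, instead of the purely affine w^T x + b.
import torch
import torch.nn as nn

class QuadraticLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.A = nn.Parameter(0.01 * torch.randn(out_dim, in_dim, in_dim))
        self.w = nn.Parameter(0.01 * torch.randn(out_dim, in_dim))
        self.b = nn.Parameter(torch.zeros(out_dim))

    def forward(self, x):  # x: (batch, in_dim)
        quad = torch.einsum('bi,oij,bj->bo', x, self.A, x)
        return quad + x @ self.w.T + self.b

# Shallow ansatz: one quadratic layer, an activation, and a linear readout.
net = nn.Sequential(QuadraticLayer(2, 64), nn.Tanh(), nn.Linear(64, 1))
out = net(torch.randn(8, 2))
```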
StEik: Stabilizing the Optimization of Neural Signed Distance Functions and Finer Shape Representation
We present new insights and a novel paradigm (StEik) for learning implicit
neural representations (INR) of shapes. In particular, we shed light on the
popular eikonal loss used for imposing a signed distance function constraint in
INR. We show analytically that as the representation power of the network
increases, the optimization approaches a partial differential equation (PDE) in
the continuum limit that is unstable. We show that this instability can
manifest in existing network optimization, leading to irregularities in the
reconstructed surface and/or convergence to sub-optimal local minima, thus
failing to capture fine geometric and topological structure. We show analytically
how other terms added to the loss, currently used in the literature for other
purposes, can actually eliminate these instabilities. However, such terms can
over-regularize the surface, preventing the representation of fine shape
detail. Based on a similar PDE theory for the continuum limit, we introduce a
new regularization term that still counteracts the eikonal instability but
without over-regularizing. Furthermore, since stability is now guaranteed in
the continuum limit, this stabilization also allows for considering new network
structures that are able to represent finer shape detail. We introduce such a
structure based on quadratic layers. Experiments on multiple benchmark data
sets show that our new regularization and network are able to capture more
precise shape details and more accurate topology than existing
state-of-the-art methods.
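The eikonal term itself can be written down directly with automatic differentiation, as in the sketch below. The second-order penalty included there is only an illustrative stand-in for a stabilizing regularizer and is not claimed to be the exact term introduced in the paper; network sizes and weights are arbitrary.

```python
# Sketch: eikonal loss (|grad f| - 1)^2 on an implicit network f(x), plus an
# illustrative second-order (squared Laplacian) penalty as a generic stabilizer.
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(3, 128), nn.Softplus(beta=100),
                  nn.Linear(128, 128), nn.Softplus(beta=100),
                  nn.Linear(128, 1))

x = torch.empty(1024, 3).uniform_(-1, 1).requires_grad_(True)  # points in [-1, 1]^3
y = f(x)
grad = torch.autograd.grad(y.sum(), x, create_graph=True)[0]   # nabla f, shape (N, 3)
eikonal = ((grad.norm(dim=-1) - 1.0) ** 2).mean()

# Laplacian via a second autograd pass (sum of second partial derivatives).
lap = sum(torch.autograd.grad(grad[:, i].sum(), x, create_graph=True)[0][:, i]
          for i in range(3))
second_order = (lap ** 2).mean()

loss = eikonal + 0.1 * second_order   # weighting is illustrative
```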
Shaping dynamics with multiple populations in low-rank recurrent networks
An emerging paradigm proposes that neural computations can be understood at
the level of dynamical systems that govern low-dimensional trajectories of
collective neural activity. How the connectivity structure of a network
determines the emergent dynamical system, however, remains to be clarified. Here
we consider a novel class of models, Gaussian-mixture low-rank recurrent
networks, in which the rank of the connectivity matrix and the number of
statistically-defined populations are independent hyper-parameters. We show
that the resulting collective dynamics form a dynamical system, where the rank
sets the dimensionality and the population structure shapes the dynamics. In
particular, the collective dynamics can be described in terms of a simplified
effective circuit of interacting latent variables. While having a single,
global population strongly restricts the possible dynamics, we demonstrate that
if the number of populations is large enough, a rank-R network can approximate
any R-dimensional dynamical system.
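A minimal sketch of a rank-R recurrent network of the kind described above (a single Gaussian population for brevity, illustrative parameters): the connectivity is J = (1/N) sum_r m_r n_r^T, the rate dynamics are dx/dt = -x + J phi(x), and low-dimensional latent variables are read out by projecting the activity onto the vectors m_r.

```python
# Structural sketch of a rank-R recurrent network with random Gaussian vectors.
import numpy as np

rng = np.random.default_rng(0)
N, R = 1000, 2                       # neurons, rank
m = rng.standard_normal((R, N))      # left connectivity vectors m_r
n = rng.standard_normal((R, N))      # right connectivity vectors n_r
J = (m.T @ n) / N                    # rank-R connectivity matrix, shape (N, N)

phi = np.tanh
x = 0.1 * rng.standard_normal(N)
dt = 0.05
for _ in range(400):                 # Euler integration of dx/dt = -x + J phi(x)
    x = x + dt * (-x + J @ phi(x))

# Collective (latent) variables: projections of activity onto the m vectors.
kappa = m @ x / N
print(kappa)
```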