Neural Circuit Architectural Priors for Embodied Control
Artificial neural networks for motor control usually adopt generic
architectures like fully connected MLPs. While general, these tabula rasa
architectures rely on large amounts of experience to learn, are not easily
transferable to new bodies, and have internal dynamics that are difficult to
interpret. In nature, animals are born with highly structured connectivity in
their nervous systems shaped by evolution; this innate circuitry acts
synergistically with learning mechanisms to provide inductive biases that
enable most animals to function well soon after birth and learn efficiently.
Convolutional networks inspired by visual circuitry have encoded useful biases
for vision. However, the extent to which ANN architectures inspired by neural
circuitry can yield useful biases for other AI domains remains unknown. In
this work, we ask what advantages biologically inspired ANN architecture can
provide in the domain of motor control. Specifically, we translate C. elegans
locomotion circuits into an ANN model controlling a simulated Swimmer agent. On
a locomotion task, our architecture achieves good initial performance and
asymptotic performance comparable with MLPs, while dramatically improving data
efficiency and requiring orders of magnitude fewer parameters. Our architecture
is interpretable and transfers to new body designs. An ablation analysis shows
that constrained excitation/inhibition is crucial for learning, while weight
initialization contributes to good initial performance. Our work demonstrates
several advantages of biologically inspired ANN architecture and encourages
future work in more complex embodied control.
Comment: NeurIPS 202
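The ablation's finding that constrained excitation/inhibition is crucial can be illustrated with a sign-constrained linear layer. This is a minimal sketch, not the paper's implementation: the layer sizes, the fixed sign mask, and the `ei_linear` name are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sign mask: each presynaptic unit is either excitatory (+1)
# or inhibitory (-1), and the assignment is fixed "at birth", mirroring
# the innate structure of biological circuits.
n_in, n_out = 4, 3
sign = np.array([+1, +1, -1, -1])  # per-input-neuron sign, never learned

# Free parameters are unconstrained; the effective weight is |w| * sign,
# so learning can change magnitudes but can never flip a connection from
# excitatory to inhibitory or vice versa.
w_free = rng.normal(size=(n_in, n_out))

def ei_linear(x, w_free, sign):
    """Linear layer whose effective weights respect a fixed E/I mask."""
    w_eff = np.abs(w_free) * sign[:, None]
    return x @ w_eff

x = rng.normal(size=(2, n_in))
y = ei_linear(x, w_free, sign)
```

The design choice here is that the constraint is enforced by reparameterization (taking the absolute value) rather than by projection after each gradient step, so any gradient-based optimizer can be used unchanged.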
PMLR
We propose a neural information processing system obtained by re-purposing the function of a biological neural circuit model to govern simulated and real-world control tasks. Inspired by the structure of the nervous system of the soil worm C. elegans, we introduce ordinary neural circuits (ONCs), defined as models of biological neural circuits reparameterized for the control of alternative tasks. We first demonstrate that ONCs realize networks with higher maximum flow compared to arbitrarily wired networks. We then learn instances of ONCs to control a series of robotic tasks, including the autonomous parking of a real-world rover robot. To reconfigure the purpose of the neural circuit, we adopt a search-based optimization algorithm. Ordinary neural circuits perform on par with, and in some cases significantly surpass, contemporary deep learning models. ONC networks are compact, 77% sparser than their counterpart neural controllers, and their neural dynamics are fully interpretable at the cell level.
Lagrangian Reachtubes: The Next Generation
We introduce LRT-NG, a set of techniques and an associated toolset that
computes a reachtube (an over-approximation of the set of reachable states over
a given time horizon) of a nonlinear dynamical system. LRT-NG significantly
advances the state-of-the-art Lagrangian Reachability and its associated tool
LRT. From a theoretical perspective, LRT-NG is superior to LRT in three ways.
First, it is the first to use an analytically computed metric for the
propagated ball, one proven to minimize the ball's volume. We emphasize
that the metric computation is the centerpiece of all bloating-based
techniques. Secondly, it computes the next reachset as the intersection of two
balls: one based on the Cartesian metric and the other on the new metric. While
the two metrics were previously considered opposing approaches, their joint use
considerably tightens the reachtubes. Thirdly, it avoids the "wrapping effect"
associated with the validated integration of the center of the reachset, by
optimally absorbing the interval approximation in the radius of the next ball.
From a tool-development perspective, LRT-NG is superior to LRT in two ways.
First, it is a standalone tool that no longer relies on CAPD. This required the
implementation of the Lohner method and a Runge-Kutta time-propagation method.
Secondly, it has an improved interface, allowing the input model and initial
conditions to be provided as external input files. Our experiments on a
comprehensive set of benchmarks, including two Neural ODEs, demonstrate
LRT-NG's superior performance compared to LRT, CAPD, and Flow*.
Comment: 12 pages, 14 figures
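The core idea behind bloating-based reachability, which LRT-NG refines, can be sketched in a few lines: propagate the center of a ball numerically and inflate its radius by a Gronwall-style factor exp(L·dt), where L bounds the Lipschitz constant of the dynamics. This toy sketch uses the generic Euclidean metric and a hand-picked L; it does not reproduce LRT-NG's analytically minimized metric or its intersection of two balls.

```python
import numpy as np

def f(x):
    # Example dynamics x' = f(x); contractive, chosen for illustration.
    return -x + 0.1 * np.sin(x)

L = 1.1  # crude global Lipschitz bound: |f'(x)| = |-1 + 0.1*cos(x)| <= 1.1

def reach_step(center, radius, dt):
    """One reachset step: RK4 for the center, exponential bloating for the radius."""
    k1 = f(center)
    k2 = f(center + 0.5 * dt * k1)
    k3 = f(center + 0.5 * dt * k2)
    k4 = f(center + dt * k3)
    new_center = center + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    # Gronwall bound: trajectories starting within `radius` of the center
    # stay within radius * exp(L * dt) of the propagated center.
    new_radius = radius * np.exp(L * dt)
    return new_center, new_radius

c, r = np.array([1.0]), 0.1
for _ in range(10):
    c, r = reach_step(c, r, 0.05)
```

The wrapping effect the abstract mentions shows up here as the radius growing at every step even for contractive dynamics; LRT-NG's tighter metrics exist precisely to curb this conservatism.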
PMLR
Robustness to variations in lighting conditions is a key objective for any deep vision system. To this end, our paper extends the receptive field of convolutional neural networks with two residual components, ubiquitous in the visual processing system of vertebrates: On-center and Off-center pathways, with an excitatory center and inhibitory surround; OOCS for short. The On-center pathway is excited by the presence of a light stimulus in its center, but not in its surround, whereas the Off-center pathway is excited by the absence of a light stimulus in its center, but not in its surround. We design OOCS pathways via a difference of Gaussians, with their variance computed analytically from the size of the receptive fields. The OOCS pathways complement each other in their response to light stimuli, thereby ensuring a strong edge-detection capability and, as a result, accurate and robust inference under challenging lighting conditions. We provide extensive empirical evidence showing that networks supplied with OOCS pathways gain accuracy and illumination robustness from the novel edge representation, compared to other baselines.
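The difference-of-Gaussians construction can be made concrete with a short sketch. The kernel size and the two sigmas below are illustrative choices; in the paper the variances are computed analytically from the receptive field size.

```python
import numpy as np

def gaussian2d(size, sigma):
    """Normalized 2-D Gaussian kernel of shape (size, size)."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return g / g.sum()

def oocs_kernels(size=7, sigma_center=1.0, sigma_surround=2.0):
    """On-center and Off-center difference-of-Gaussians kernels."""
    center = gaussian2d(size, sigma_center)
    surround = gaussian2d(size, sigma_surround)
    on = center - surround    # positive response to light in the center
    off = surround - center   # mirror: responds to absence of light in the center
    return on, off

on, off = oocs_kernels()
# Both Gaussians are normalized to sum to 1, so each kernel sums to ~0:
# a uniform change in illumination produces (near) zero response, which is
# one source of the illumination robustness the abstract describes.
```

Convolving an image with these two kernels in parallel yields the complementary On/Off pathways; in the paper they are added as residual components to a standard convolutional backbone.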
Closed-form Continuous-Depth Models
Continuous-depth neural models, where the derivative of the model's hidden
state is defined by a neural network, have enabled strong sequential data
processing capabilities. However, these models rely on advanced numerical
differential equation (DE) solvers resulting in a significant overhead both in
terms of computational cost and model complexity. In this paper, we present a
new family of models, termed Closed-form Continuous-depth (CfC) networks, that
are simple to describe and at least one order of magnitude faster while
exhibiting equally strong modeling abilities compared to their ODE-based
counterparts. The models are hereby derived from the analytical closed-form
solution of an expressive subset of time-continuous models, thus alleviating
the need for complex DE solvers altogether. In our experimental evaluations,
we demonstrate that CfC networks outperform advanced, recurrent models over a
diverse set of time-series prediction tasks, including those with long-term
dependencies and irregularly sampled data. We believe our findings open new
opportunities to train and deploy rich, continuous neural models in
resource-constrained settings, which demand both performance and efficiency.
Comment: 17 pages
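The "closed form" can be sketched as a time-gated interpolation between two learned heads, in the style of the CfC formulation: a sigmoid gate that depends explicitly on the elapsed time t replaces the numerical ODE solver. The tiny tanh heads, their sizes, and the parameter setup below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp(params, x):
    # Minimal one-layer head standing in for a learned backbone.
    w, b = params
    return np.tanh(x @ w + b)

def cfc_cell(params_f, params_g, params_h, x, t):
    """CfC-style closed-form state at elapsed time t: no ODE solver.

    gate = sigmoid(-f(x) * t) interpolates between heads g and h;
    because t appears explicitly, irregularly sampled time steps are
    handled by simply plugging in the observed gap.
    """
    f = mlp(params_f, x)
    g = mlp(params_g, x)
    h = mlp(params_h, x)
    gate = sigmoid(-f * t)
    return gate * g + (1.0 - gate) * h

rng = np.random.default_rng(1)
d = 4

def make_params():
    return rng.normal(size=(d, d)) * 0.1, np.zeros(d)

pf, pg, ph = make_params(), make_params(), make_params()
x = rng.normal(size=(1, d))
y_short = cfc_cell(pf, pg, ph, x, t=0.1)
y_long = cfc_cell(pf, pg, ph, x, t=5.0)
```

Evaluating the state at any t is a single forward pass, which is where the one-to-five orders-of-magnitude speedup over solver-based continuous models comes from.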
Closed-form continuous-time neural networks
Continuous-time neural networks are a class of machine learning systems that can tackle representation learning on spatiotemporal decision-making tasks. These models are typically represented by continuous differential equations. However, their expressive power when they are deployed on computers is bottlenecked by numerical differential equation solvers. This limitation has notably slowed down the scaling and understanding of numerous natural physical phenomena such as the dynamics of nervous systems. Ideally, we would circumvent this bottleneck by solving the given dynamical system in closed form. This is known to be intractable in general. Here, we show that it is possible to closely approximate the interaction between neurons and synapses, the building blocks of natural and artificial neural networks, constructed by liquid time-constant networks efficiently in closed form. To this end, we compute a tightly bounded approximation of the solution of an integral appearing in liquid time-constant dynamics that has had no known closed-form solution so far. This closed-form solution impacts the design of continuous-time and continuous-depth neural models. For instance, since time appears explicitly in closed form, the formulation relaxes the need for complex numerical solvers. Consequently, we obtain models that are between one and five orders of magnitude faster in training and inference compared with differential equation-based counterparts. More importantly, in contrast to ordinary differential equation-based continuous networks, closed-form networks can scale remarkably well compared with other deep learning instances. Lastly, as these models are derived from liquid networks, they show good performance in time-series modelling compared with advanced recurrent neural network models.
Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap
There is an ever-growing zoo of modern neural network models that can
efficiently learn end-to-end control from visual observations. These advanced
deep models, ranging from convolutional to patch-based networks, have been
extensively tested on offline image classification and regression tasks. In
this paper, we study these vision architectures with respect to the open-loop
to closed-loop causality gap, i.e., offline training followed by an online
closed-loop deployment. This causality gap typically emerges in robotics
applications such as autonomous driving, where a network is trained to imitate
the control commands of a human. In this setting, two situations arise: 1)
Closed-loop testing in-distribution, where the test environment shares
properties with those of offline training data. 2) Closed-loop testing under
distribution shifts and out-of-distribution. Contrary to recently reported
results, we show that under proper training guidelines, all vision models
perform indistinguishably well on in-distribution deployment, resolving the
causality gap. In situation 2, we observe that the causality gap disrupts
performance regardless of the choice of model architecture. Our results
imply that the causality gap can be closed in situation 1 by our proposed
training guideline with any modern network architecture, whereas achieving
out-of-distribution generalization (situation 2) requires further
investigation, for instance into data diversity rather than model
architecture.