587 research outputs found
Computational Capabilities of Analog and Evolving Neural Networks over Infinite Input Streams
Analog and evolving recurrent neural networks are super-Turing powerful. Here, we consider analog and evolving neural nets over infinite input streams. We then characterize the topological complexity of their ω-languages as a function of the specific analog or evolving weights that they employ. As a consequence, two infinite hierarchies of classes of analog and evolving neural networks based on the complexity of their underlying weights can be derived. These results constitute an optimal refinement of the super-Turing expressive power of analog and evolving neural networks. They show that analog and evolving neural nets represent natural models for oracle-based infinite computation.
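For readers new to the model, the networks studied here are usually first-order recurrent nets with saturated-linear activation in the style of Siegelmann and Sontag; the display below is a sketch of that standard formulation under assumed notation, not an equation taken from this abstract. Analog nets fix the weights to arbitrary real values, while evolving nets allow the weights to change from one time step to the next.

```latex
% Sketch of the standard sigma-cell recurrent network (notation assumed):
% analog nets take static real weights a_{ij}, b_{ij}, c_i; evolving nets
% let these weights vary over time.
x_i(t+1) = \sigma\!\Big(\sum_{j=1}^{N} a_{ij}\, x_j(t) + \sum_{j=1}^{M} b_{ij}\, u_j(t) + c_i\Big),
\qquad
\sigma(y) = \begin{cases} 0 & \text{if } y < 0,\\ y & \text{if } 0 \le y \le 1,\\ 1 & \text{if } y > 1. \end{cases}
```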
An Attractor-Based Complexity Measurement for Boolean Recurrent Neural Networks
We provide a novel refined attractor-based complexity measurement for Boolean recurrent neural networks that represents an assessment of their computational power in terms of the significance of their attractor dynamics. This complexity measurement is achieved by first proving a computational equivalence between Boolean recurrent neural networks and some specific class of ω-automata, and then translating the most refined classification of ω-automata to the Boolean neural network context. As a result, a hierarchical classification of Boolean neural networks based on their attractive dynamics is obtained, thus providing a novel refined attractor-based complexity measurement for Boolean recurrent neural networks. These results provide new theoretical insights into the computational and dynamical capabilities of neural networks according to their attractive potentialities. An application of our findings is illustrated by the analysis of the dynamics of a simplified model of the basal ganglia-thalamocortical network simulated by a Boolean recurrent neural network. This example shows the significance of measuring network complexity, and how our results bring new foundational elements for the understanding of the complexity of real brain circuits.
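To make the attractor-based viewpoint concrete, here is a minimal sketch, assuming synchronous threshold units, of how one can simulate a Boolean recurrent network and read off the attractor it falls into: since the state space is finite, every trajectory eventually enters a cycle, and that cycle is the attractor. The update rule, names, and the random example network are illustrative and are not the construction used in the paper.

```python
import numpy as np

def boolean_rnn_step(state, W, theta):
    """One synchronous update of a Boolean (threshold-unit) recurrent network."""
    return (W @ state > theta).astype(int)

def find_attractor(x0, W, theta, max_steps=1000):
    """Iterate from x0 and return the cycle (attractor) the trajectory falls into.

    The state space has only 2**n states, so the trajectory must revisit a state;
    the segment between the two visits is the attractor.
    """
    seen = {}
    state = x0.copy()
    trajectory = []
    for t in range(max_steps):
        key = tuple(state)
        if key in seen:                      # cycle detected
            return trajectory[seen[key]:]    # states forming the attractor
        seen[key] = t
        trajectory.append(state.copy())
        state = boolean_rnn_step(state, W, theta)
    return None  # no cycle within max_steps (cannot happen for small n)

# Example: a random 6-unit Boolean network
rng = np.random.default_rng(0)
n = 6
W = rng.choice([-1, 0, 1], size=(n, n))
theta = np.zeros(n)
x0 = rng.integers(0, 2, size=n)
attractor = find_attractor(x0, W, theta)
print(f"attractor length: {len(attractor)}")
```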
Shaping dynamical neural computations using spatiotemporal constraints
Dynamics play a critical role in computation. The principled evolution of states over time enables both biological and artificial networks to represent and integrate information to make decisions. In the past few decades, significant multidisciplinary progress has been made in bridging the gap between how we understand biological versus artificial computation, including how insights gained from one can translate to the other. Research has revealed that neurobiology is a key determinant of brain network architecture, which gives rise to spatiotemporally constrained patterns of activity that underlie computation. Here, we discuss how neural systems use dynamics for computation, and claim that the biological constraints that shape brain networks may be leveraged to improve the implementation of artificial neural networks. To formalize this discussion, we consider a natural artificial analog of the brain that has been used extensively to model neural computation: the recurrent neural network (RNN). In both the brain and the RNN, we emphasize the common computational substrate atop which dynamics occur -- the connectivity between neurons -- and we explore the unique computational advantages offered by biophysical constraints such as resource efficiency, spatial embedding, and neurodevelopment.
Comment: 7 figures, 18 pages
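As one concrete, hypothetical illustration of such a constraint, the sketch below adds a distance-weighted wiring cost to a vanilla RNN training loss, so that long-range recurrent connections are penalized more heavily than local ones; the unit coordinates, penalty form, and coefficient are assumptions made for illustration rather than a method proposed in the paper.

```python
import torch

n_hidden = 64
# Assign each hidden unit a random 2D position (illustrative "spatial embedding").
coords = torch.rand(n_hidden, 2)
# Pairwise Euclidean distances between units.
dist = torch.cdist(coords, coords)           # (n_hidden, n_hidden)

rnn = torch.nn.RNN(input_size=8, hidden_size=n_hidden, batch_first=True)

def wiring_cost(rnn, dist, coef=1e-3):
    """Distance-weighted L1 penalty on recurrent weights: long connections cost more."""
    W_hh = rnn.weight_hh_l0                   # recurrent weight matrix
    return coef * (dist * W_hh.abs()).sum()

# Toy usage: one gradient step on a sequence-regression loss plus the wiring cost.
x = torch.randn(4, 10, 8)                     # (batch, time, features)
y = torch.randn(4, 10, n_hidden)
out, _ = rnn(x)
loss = torch.nn.functional.mse_loss(out, y) + wiring_cost(rnn, dist)
loss.backward()
```

The same pattern extends to other constraints mentioned in the abstract; resource efficiency, for instance, can be approximated by adding activity or sparsity penalties to the same loss.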
Long-term Forecasting using Tensor-Train RNNs
We present Tensor-Train RNN (TT-RNN), a novel family of neural sequence architectures for multivariate forecasting in environments with nonlinear dynamics. Long-term forecasting in such systems is highly challenging, since there exist long-term temporal dependencies, higher-order correlations and sensitivity to error propagation. Our proposed tensor recurrent architecture addresses these issues by learning the nonlinear dynamics directly using higher-order moments and high-order state transition functions. Furthermore, we decompose the higher-order structure using the tensor-train (TT) decomposition to reduce the number of parameters while preserving the model performance. We theoretically establish the approximation properties of Tensor-Train RNNs for general sequence inputs, guarantees that are not available for standard RNNs. We also demonstrate significant long-term prediction improvements over general RNN and LSTM architectures on a range of simulated environments with nonlinear dynamics, as well as on real-world climate and traffic data.
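As a rough illustration of the core idea, the numpy sketch below parameterizes a higher-order hidden-state transition with tensor-train cores instead of the full transition tensor; the ranks, shapes, and tanh nonlinearity are assumptions, and the actual TT-RNN additionally conditions on the input and differs in architectural details.

```python
import numpy as np

def tt_transition(h, cores):
    """Higher-order hidden-state transition parameterized in tensor-train form.

    Computes sum_{i1..iP} W[i1,...,iP,:] * s[i1] * ... * s[iP], where the
    order-(P+1) tensor W is never materialized: it is represented by TT cores.
    """
    s = np.concatenate(([1.0], h))           # degree-lifting: include a bias entry
    v = np.ones(1)                            # leading TT rank r_0 = 1
    for G in cores:                           # G has shape (r_prev, len(s), r_next)
        v = np.einsum('a,abc,b->c', v, G, s)
    return np.tanh(v)                         # final rank equals the hidden size

hidden, order, rank = 32, 3, 4
rng = np.random.default_rng(0)
dims = [1] + [rank] * (order - 1) + [hidden]  # TT ranks r_0 .. r_P
cores = [rng.normal(scale=0.1, size=(dims[p], hidden + 1, dims[p + 1]))
         for p in range(order)]

h = rng.normal(size=hidden)
h_next = tt_transition(h, cores)
print(h_next.shape)  # (32,)

# Parameter count is roughly order * r * (hidden+1) * r, versus
# (hidden+1)**order * hidden for the full tensor; the TT form scales
# linearly in the transition order P.
```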
Generalized Teacher Forcing for Learning Chaotic Dynamics
Chaotic dynamical systems (DS) are ubiquitous in nature and society. Often we are interested in reconstructing such systems from observed time series for prediction or mechanistic insight, where by reconstruction we mean learning geometrical and invariant temporal properties of the system in question (like attractors). However, training reconstruction algorithms like recurrent neural networks (RNNs) on such systems by gradient-descent based techniques faces severe challenges. This is mainly due to exploding gradients caused by the exponential divergence of trajectories in chaotic systems. Moreover, for (scientific) interpretability we wish to have reconstructions of as low a dimensionality as possible, preferably in a model which is mathematically tractable. Here we report that a surprisingly simple modification of teacher forcing leads to provably strictly all-time bounded gradients in training on chaotic systems, and, when paired with a simple architectural rearrangement of a tractable RNN design, piecewise-linear RNNs (PLRNNs), allows for faithful reconstruction in spaces of at most the dimensionality of the observed system. We show on several DS that with these amendments we can reconstruct DS better than current SOTA algorithms, in much lower dimensions. Performance differences were particularly compelling on real-world data with which most other methods severely struggled. This work thus leads to a simple yet powerful DS reconstruction algorithm which is highly interpretable at the same time.
Comment: Published in the Proceedings of the 40th International Conference on Machine Learning (ICML 2023).
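A minimal sketch of the idea behind generalized teacher forcing, assuming a bare-bones PLRNN and a fixed interpolation weight alpha: at every step the state carried forward is a convex combination of the freely generated latent state and a teacher signal derived from the data. The paper derives the appropriate alpha and the full architecture; everything below is illustrative.

```python
import numpy as np

def plrnn_step(z, A, W, h):
    """Piecewise-linear RNN latent update: linear part plus ReLU-coupled part."""
    return A @ z + W @ np.maximum(z, 0.0) + h

def gtf_rollout(Z_teacher, A, W, h, alpha=0.2):
    """Roll the PLRNN over a training sequence with generalized teacher forcing.

    alpha = 1 recovers classical teacher forcing (state fully replaced by the
    teacher signal); alpha = 0 is free-running generation. Intermediate values
    interpolate, which is what keeps gradients bounded on chaotic data.
    """
    T, d = Z_teacher.shape
    z = Z_teacher[0].copy()
    preds = []
    for t in range(1, T):
        z = plrnn_step(z, A, W, h)                    # freely generated state
        preds.append(z.copy())
        z = alpha * Z_teacher[t] + (1 - alpha) * z    # GTF interpolation
    return np.array(preds)

# Toy usage with random parameters and a random "teacher" latent sequence.
rng = np.random.default_rng(0)
d = 5
A = 0.9 * np.eye(d)
W = 0.1 * rng.normal(size=(d, d))
h = np.zeros(d)
Z_teacher = rng.normal(size=(50, d))
preds = gtf_rollout(Z_teacher, A, W, h)
print(preds.shape)  # (49, 5)
```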
Markov Neural Operators for Learning Chaotic Systems
Chaotic systems are notoriously challenging to predict because of their instability. Small errors accumulate in the simulation of each time step, resulting in completely different trajectories. However, the trajectories of many prominent chaotic systems live in a low-dimensional subspace (attractor). If the system is Markovian, the attractor is uniquely determined by the Markov operator that maps the evolution of infinitesimal time steps. This makes it possible to predict the behavior of the chaotic system by learning the Markov operator even if we cannot predict the exact trajectory. Recently, a new framework for learning resolution-invariant solution operators for PDEs was proposed, known as neural operators. In this work, we train a Markov neural operator (MNO) with only the local one-step evolution information. We then compose the learned operator to obtain the global attractor and invariant measure. Such a Markov neural operator forms a discrete semigroup, and we empirically observe that it does not collapse or blow up. Experiments show neural operators are more accurate and stable compared to previous methods on chaotic systems such as the Kuramoto-Sivashinsky and Navier-Stokes equations.