Liquid Time-constant Networks
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities,
we construct networks of linear first-order dynamical systems modulated via
nonlinear interlinked gates. The resulting models represent dynamical systems
with varying (i.e., liquid) time-constants coupled to their hidden state, with
outputs being computed by numerical differential equation solvers. These neural
networks exhibit stable and bounded behavior, yield superior expressivity
within the family of neural ordinary differential equations, and give rise to
improved performance on time-series prediction tasks. To demonstrate these
properties, we first take a theoretical approach to find bounds over their
dynamics and compute their expressive power by the trajectory length measure in
latent trajectory space. We then conduct a series of time-series prediction
experiments to manifest the approximation capability of Liquid Time-Constant
Networks (LTCs) compared to classical and modern RNNs. Code and data are
available at https://github.com/raminmh/liquid_time_constant_networks
Comment: Accepted to the Thirty-Fifth AAAI Conference on Artificial
Intelligence (AAAI-21
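The abstract above describes linear first-order systems whose time-constants are modulated by nonlinear gates and integrated with a numerical solver. A minimal sketch of that idea follows, assuming state dynamics of the form dx/dt = -(1/tau + f) x + f A with a sigmoid gate f, advanced by a semi-implicit (fused) Euler step. The function names, the random weights, and the step size are illustrative assumptions and are not taken from the linked repository.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def ltc_step(x, u, W_rec, W_in, b, tau, A, dt):
    """One semi-implicit Euler step of dx/dt = -(1/tau + f) * x + f * A.

    f is a sigmoid gate of the state x and input u, so the effective
    time-constant 1 / (1/tau + f) varies with the state (hence "liquid").
    """
    f = sigmoid(W_rec @ x + W_in @ u + b)
    # Treat the decay term implicitly; this keeps the update stable.
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))


def run_ltc(inputs, n_hidden=4, dt=0.1, seed=0):
    """Roll a randomly initialised (untrained) cell over an input sequence."""
    rng = np.random.default_rng(seed)
    n_in = inputs.shape[1]
    W_rec = rng.normal(size=(n_hidden, n_hidden))  # hypothetical weights
    W_in = rng.normal(size=(n_hidden, n_in))
    b = np.zeros(n_hidden)
    tau = np.ones(n_hidden)
    A = rng.uniform(-1.0, 1.0, size=n_hidden)      # per-neuron bias level
    x = np.zeros(n_hidden)
    states = []
    for u in inputs:
        x = ltc_step(x, u, W_rec, W_in, b, tau, A, dt)
        states.append(x)
    return np.array(states), A


if __name__ == "__main__":
    seq = np.sin(np.linspace(0.0, 10.0, 200))[:, None]
    states, A = run_ltc(seq)
    print(states.shape)
```

Because the gate is non-negative, each update in this sketch is a weighted average of the previous state, A, and zero, so every neuron stays between min(0, A) and max(0, A). This illustrates the bounded behavior the abstract mentions, under the stated assumptions.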
Closed-form continuous-time neural networks
Continuous-time neural networks are a class of machine learning systems that can tackle representation learning on spatiotemporal decision-making tasks. These models are typically represented by continuous differential equations. However, their expressive power when they are deployed on computers is bottlenecked by numerical differential equation solvers. This limitation has notably slowed down the scaling and understanding of numerous natural physical phenomena such as the dynamics of nervous systems. Ideally, we would circumvent this bottleneck by solving the given dynamical system in closed form. This is known to be intractable in general. Here, we show that it is possible to closely approximate the interaction between neurons and synapses—the building blocks of natural and artificial neural networks—constructed by liquid time-constant networks efficiently in closed form. To this end, we compute a tightly bounded approximation of the solution of an integral appearing in liquid time-constant dynamics that has had no known closed-form solution so far. This closed-form solution impacts the design of continuous-time and continuous-depth neural models. For instance, since time appears explicitly in closed form, the formulation relaxes the need for complex numerical solvers. Consequently, we obtain models that are between one and five orders of magnitude faster in training and inference compared with differential equation-based counterparts. More importantly, in contrast to ordinary differential equation-based continuous networks, closed-form networks can scale remarkably well compared with other deep learning instances. Lastly, as these models are derived from liquid networks, they show good performance in time-series modelling compared with advanced recurrent neural network models
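To illustrate what "time appears explicitly in closed form" can mean in practice, here is a minimal sketch of a time-gated update in which the elapsed time t is an argument of the state function, so no solver loop is needed. The gating form sigmoid(-f t) * g + (1 - sigmoid(-f t)) * h, the heads f, g, h, and all weights are assumptions chosen for illustration; the abstract itself does not specify them.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_params(n_hidden, n_in, seed=0):
    """Random (untrained) weights for three hypothetical heads f, g, h."""
    rng = np.random.default_rng(seed)
    n = n_hidden + n_in
    return {
        name: rng.normal(scale=0.5, size=shape)
        for name, shape in [
            ("Wf", (n_hidden, n)), ("bf", (n_hidden,)),
            ("Wg", (n_hidden, n)), ("bg", (n_hidden,)),
            ("Wh", (n_hidden, n)), ("bh", (n_hidden,)),
        ]
    }


def closed_form_state(x, u, t, params):
    """Evaluate the hidden state at elapsed time t in a single expression.

    The gate depends on t explicitly, so irregularly sampled inputs only
    change the value of t passed in, not the number of computation steps.
    """
    z = np.concatenate([x, u])
    f = params["Wf"] @ z + params["bf"]           # time-constant head
    g = np.tanh(params["Wg"] @ z + params["bg"])  # first state candidate
    h = np.tanh(params["Wh"] @ z + params["bh"])  # second state candidate
    gate = sigmoid(-f * t)
    return gate * g + (1.0 - gate) * h


if __name__ == "__main__":
    params = make_params(n_hidden=3, n_in=2)
    x = np.zeros(3)
    u = np.array([0.5, -0.2])
    for t in (0.0, 0.1, 7.3):
        print(t, closed_form_state(x, u, t, params))
```

Each call costs one forward pass regardless of t, whereas a solver-based model would take a number of integration steps that grows with the time span and the required accuracy.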
Universal approximation of input-output maps and dynamical systems by neural network architectures
It is well known that feedforward neural networks can approximate any continuous function supported on a finite-dimensional compact set to arbitrary accuracy. However, many engineering applications require modeling infinite-dimensional functions, such as sequence-to-sequence transformations or input-output characteristics of systems of differential equations. For discrete-time input-output maps having limited long-term memory, we prove universal approximation guarantees for temporal convolutional nets constructed using only a finite number of computation units which hold on an infinite-time horizon. We also provide quantitative estimates for the width and depth of the network sufficient to achieve any fixed error tolerance. Furthermore, we show that discrete-time input-output maps given by state-space realizations satisfying certain stability criteria admit such convolutional net approximations which are accurate on an infinite-time scale. For continuous-time input-output maps induced by dynamical systems that are stable in a similar sense, we prove that continuous-time recurrent neural nets are capable of reproducing the original trajectories to within arbitrarily small error tolerance over an infinite-time horizon. For a subset of these stable systems, we provide quantitative estimates on the number of neurons sufficient to guarantee the desired error bound.
Photonic Delay Systems as Machine Learning Implementations
Nonlinear photonic delay systems present interesting implementation platforms
for machine learning models. They can be extremely fast, offer great degrees of
parallelism and potentially consume far less power than digital processors. So
far they have been successfully employed for signal processing using the
Reservoir Computing paradigm. In this paper we show that their range of
applicability can be greatly extended if we use gradient descent with
backpropagation through time on a model of the system to optimize the input
encoding of such systems. We perform physical experiments that demonstrate that
the obtained input encodings work well in reality, and we show that optimized
systems perform significantly better than the common Reservoir Computing
approach. The results presented here demonstrate that common gradient descent
techniques from machine learning may well be applicable on physical
neuro-inspired analog computers
Differential Dynamic Programming for time-delayed systems
Trajectory optimization considers the problem of deciding how to control a
dynamical system to move along a trajectory which minimizes some cost function.
Differential Dynamic Programming (DDP) is an optimal control method which
utilizes a second-order approximation of the problem to find the control. It is
fast enough to allow real-time control and has been shown to work well for
trajectory optimization in robotic systems. Here we extend classic DDP to
systems with multiple time-delays in the state. Being able to find optimal
trajectories for time-delayed systems with DDP opens up the possibility to use
richer models for system identification and control, including recurrent neural
networks with multiple timesteps in the state. We demonstrate the algorithm on
a two-tank continuous stirred tank reactor. We also demonstrate the algorithm
on a recurrent neural network trained to model an inverted pendulum with
position information only.
Comment: 7 pages, 6 figures, conference, Decision and Control (CDC), 2016 IEEE
55th Conference o
Random Recurrent Neural Networks Dynamics
This paper is a review dealing with the study of large size random recurrent
neural networks. The connection weights are selected according to a probability
law and it is possible to predict the network dynamics at a macroscopic scale
using an averaging principle. After an introductory section, section 1
reviews the various models from the points of view of the single neuron
dynamics and of the global network dynamics. A summary of notations is
presented, which is quite helpful for the sequel. In section 2, mean-field
dynamics is developed. The probability distribution characterizing global
dynamics is computed. In
section 3, some applications of mean-field theory to the prediction of chaotic
regime for Analog Formal Random Recurrent Neural Networks (AFRRNN) are
displayed. The case of AFRRNN with a homogeneous population of neurons is
studied in section 4. Then, a two-population model is studied in section 5. The
occurrence of a cyclo-stationary chaos is displayed using the results of
\cite{Dauce01}. In section 6, insight into the application of mean-field
theory to IF networks is given using the results of \cite{BrunelHakim99}.
Comment: Review paper, 36 pages, 5 figure
Relative entropy minimizing noisy non-linear neural network to approximate stochastic processes
A method is provided for designing and training noise-driven recurrent neural
networks as models of stochastic processes. The method unifies and generalizes
two known separate modeling approaches, Echo State Networks (ESN) and Linear
Inverse Modeling (LIM), under the common principle of relative entropy
minimization. The power of the new method is demonstrated on a stochastic
approximation of the El Niño phenomenon studied in climate research.