
    Liquid Time-constant Networks

    We introduce a new class of time-continuous recurrent neural network models. Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates. The resulting models represent dynamical systems with varying (i.e., liquid) time-constants coupled to their hidden state, with outputs being computed by numerical differential equation solvers. These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations, and give rise to improved performance on time-series prediction tasks. To demonstrate these properties, we first take a theoretical approach to find bounds over their dynamics and compute their expressive power by the trajectory length measure in latent trajectory space. We then conduct a series of time-series prediction experiments to manifest the approximation capability of Liquid Time-Constant Networks (LTCs) compared to classical and modern RNNs. Code and data are available at https://github.com/raminmh/liquid_time_constant_networks.
    Comment: Accepted to the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21).
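
    A minimal sketch of the dynamics described above, in the form LTCs are usually stated in (symbols here are illustrative rather than quoted from the abstract: x(t) is the hidden state, I(t) the input, f a learned gating network with parameters \theta, \tau a fixed time constant, and A a bias vector):

        \frac{dx(t)}{dt} = -\left[\frac{1}{\tau} + f\big(x(t), I(t), t, \theta\big)\right] \odot x(t) + f\big(x(t), I(t), t, \theta\big) \odot A

    The effective time constant \tau_{\mathrm{sys}} = \tau / \big(1 + \tau f(x(t), I(t), t, \theta)\big) therefore varies with both the input and the hidden state, which is the "liquid" behaviour the abstract refers to; the state is obtained by running a numerical ODE solver over this equation.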

    Closed-form continuous-time neural networks

    Continuous-time neural networks are a class of machine learning systems that can tackle representation learning on spatiotemporal decision-making tasks. These models are typically represented by continuous differential equations. However, their expressive power when they are deployed on computers is bottlenecked by numerical differential equation solvers. This limitation has notably slowed down the scaling and understanding of numerous natural physical phenomena such as the dynamics of nervous systems. Ideally, we would circumvent this bottleneck by solving the given dynamical system in closed form. This is known to be intractable in general. Here, we show that it is possible to closely approximate the interaction between neurons and synapses (the building blocks of natural and artificial neural networks) constructed by liquid time-constant networks efficiently in closed form. To this end, we compute a tightly bounded approximation of the solution of an integral appearing in liquid time-constant dynamics that has had no known closed-form solution so far. This closed-form solution impacts the design of continuous-time and continuous-depth neural models. For instance, since time appears explicitly in closed form, the formulation relaxes the need for complex numerical solvers. Consequently, we obtain models that are between one and five orders of magnitude faster in training and inference compared with differential equation-based counterparts. More importantly, in contrast to ordinary differential equation-based continuous networks, closed-form networks can scale remarkably well compared with other deep learning instances. Lastly, as these models are derived from liquid networks, they show good performance in time-series modelling compared with advanced recurrent neural network models.
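
    The closed-form approximation referred to above has, schematically and in illustrative notation (x(0) the initial state, I the input, f a learned nonlinearity with parameters \theta, w_\tau and A learned scalars/vectors), roughly the shape

        x(t) \approx \big(x(0) - A\big)\, e^{-\left(w_\tau + f(x, I;\, \theta)\right)\, t}\, f(-x, -I;\, \theta) + A

    Because time t appears explicitly on the right-hand side, the state at any t can be evaluated directly instead of being rolled out step by step with a numerical ODE solver, which is where the reported training and inference speed-ups come from.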

    Universal approximation of input-output maps and dynamical systems by neural network architectures

    It is well known that feedforward neural networks can approximate any continuous function supported on a finite-dimensional compact set to arbitrary accuracy. However, many engineering applications require modeling infinite-dimensional functions, such as sequence-to-sequence transformations or input-output characteristics of systems of differential equations. For discrete-time input-output maps having limited long-term memory, we prove universal approximation guarantees, valid on an infinite-time horizon, for temporal convolutional nets constructed using only a finite number of computation units. We also provide quantitative estimates for the width and depth of the network sufficient to achieve any fixed error tolerance. Furthermore, we show that discrete-time input-output maps given by state-space realizations satisfying certain stability criteria admit such convolutional net approximations which are accurate on an infinite-time scale. For continuous-time input-output maps induced by dynamical systems that are stable in a similar sense, we prove that continuous-time recurrent neural nets are capable of reproducing the original trajectories to within arbitrarily small error tolerance over an infinite-time horizon. For a subset of these stable systems, we provide quantitative estimates on the number of neurons sufficient to guarantee the desired error bound.
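
    The guarantees described above have, roughly, the following uniform-in-time flavour (a paraphrase in illustrative notation, not a verbatim statement of the theorems): for a causal input-output map F with limited long-term memory and any tolerance \varepsilon > 0, there exists a finite network \hat{F} (temporal convolutional in the discrete-time case, recurrent in the continuous-time case) such that

        \sup_{t}\, \big\| F(u)(t) - \hat{F}(u)(t) \big\| \le \varepsilon \quad \text{for all admissible input signals } u,

    with the width and depth required to reach a given \varepsilon bounded quantitatively in the convolutional case.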

    Photonic Delay Systems as Machine Learning Implementations

    Nonlinear photonic delay systems present interesting implementation platforms for machine learning models. They can be extremely fast, offer a high degree of parallelism, and potentially consume far less power than digital processors. So far they have been successfully employed for signal processing using the Reservoir Computing paradigm. In this paper we show that their range of applicability can be greatly extended if we use gradient descent with backpropagation through time on a model of the system to optimize the input encoding of such systems. We perform physical experiments demonstrating that the obtained input encodings work well in reality, and we show that the optimized systems perform significantly better than the common Reservoir Computing approach. The results presented here demonstrate that common gradient descent techniques from machine learning may well be applicable to physical neuro-inspired analog computers.
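
    As a point of reference for the Reservoir Computing baseline mentioned above, the sketch below shows the usual echo-state setup in plain NumPy: a fixed random reservoir and input encoding, with only a linear readout trained in closed form by ridge regression. All names, sizes, and the toy task are illustrative, and this is the digital baseline rather than the photonic hardware; the paper's contribution is to additionally optimise the input encoding (W_in here) by gradient descent with backpropagation through time on a differentiable model of the physical delay system, instead of leaving it random.

        import numpy as np

        rng = np.random.default_rng(0)
        n_in, n_res, n_steps = 1, 200, 1000

        # Fixed random input encoding and reservoir weights (untrained in classic Reservoir Computing).
        W_in = rng.normal(scale=0.5, size=(n_res, n_in))
        W = rng.normal(scale=1.0 / np.sqrt(n_res), size=(n_res, n_res))

        u = rng.normal(size=(n_steps, n_in))          # toy input signal
        y_target = np.roll(u[:, 0], 5)                # toy task: recall the input from 5 steps ago

        # Drive the reservoir with the input and collect its states.
        x = np.zeros(n_res)
        states = np.empty((n_steps, n_res))
        for t in range(n_steps):
            x = np.tanh(W @ x + W_in @ u[t])
            states[t] = x

        # Train only the linear readout, in closed form via ridge regression.
        lam = 1e-3
        W_out = np.linalg.solve(states.T @ states + lam * np.eye(n_res), states.T @ y_target)
        print("train MSE:", np.mean((states @ W_out - y_target) ** 2))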

    Differential Dynamic Programming for time-delayed systems

    Trajectory optimization considers the problem of deciding how to control a dynamical system so that it moves along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which uses a second-order approximation of the problem to find the control. It is fast enough to allow real-time control and has been shown to work well for trajectory optimization in robotic systems. Here we extend classic DDP to systems with multiple time-delays in the state. Being able to find optimal trajectories for time-delayed systems with DDP opens up the possibility of using richer models for system identification and control, including recurrent neural networks with multiple timesteps in the state. We demonstrate the algorithm on a two-tank continuous stirred tank reactor. We also demonstrate the algorithm on a recurrent neural network trained to model an inverted pendulum with position information only.
    Comment: 7 pages, 6 figures; 2016 IEEE 55th Conference on Decision and Control (CDC).
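
    For orientation, the delay-free DDP backward pass expands the action-value function around a nominal trajectory to second order (conventional notation, not quoted from the paper: l is the stage cost, f the dynamics, V' the value function at the next step; second-order dynamics terms are omitted here):

        Q_x = l_x + f_x^\top V'_x, \qquad Q_u = l_u + f_u^\top V'_x,
        Q_{xx} = l_{xx} + f_x^\top V'_{xx} f_x, \qquad Q_{ux} = l_{ux} + f_u^\top V'_{xx} f_x, \qquad Q_{uu} = l_{uu} + f_u^\top V'_{xx} f_u,
        \delta u^* = -Q_{uu}^{-1}\big(Q_u + Q_{ux}\,\delta x\big).

    Roughly, the extension described above carries these recursions through dynamics of the form x_{t+1} = f(x_t, x_{t-\tau_1}, \dots, x_{t-\tau_k}, u_t), so that delayed copies of the state enter the derivatives as additional arguments.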

    Random Recurrent Neural Networks Dynamics

    This paper is a review of the study of large-size random recurrent neural networks, in which the connection weights are drawn according to a probability law and the network dynamics can be predicted at a macroscopic scale using an averaging principle. After an introductory section, Section 1 reviews the various models from the point of view of single-neuron dynamics and of global network dynamics, and gives a summary of notations used in the sequel. In Section 2, the mean-field dynamics is developed and the probability distribution characterizing the global dynamics is computed. In Section 3, applications of mean-field theory to the prediction of the chaotic regime of Analog Formal Random Recurrent Neural Networks (AFRRNN) are presented. The case of AFRRNN with a homogeneous population of neurons is studied in Section 4, and a two-population model is studied in Section 5, where the occurrence of cyclo-stationary chaos is shown using the results of [Dauce01]. In Section 6, an insight into the application of mean-field theory to integrate-and-fire (IF) networks is given using the results of [BrunelHakim99].
    Comment: Review paper, 36 pages, 5 figures.
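
    Schematically, the averaging principle mentioned above says that as the network size N grows, with weights J_{ij} drawn i.i.d. with mean \bar{J}/N and variance \sigma_J^2/N, the local field h_i(t) = \sum_j J_{ij}\,\phi\big(x_j(t)\big) driving each neuron behaves like a Gaussian process whose statistics are fixed self-consistently by the population (generic notation, not the paper's):

        m(t) = \mathbb{E}\big[\phi(x(t))\big], \qquad C(t, s) = \mathbb{E}\big[\phi(x(t))\,\phi(x(s))\big],
        \mathbb{E}\big[h(t)\big] = \bar{J}\, m(t), \qquad \mathrm{Cov}\big(h(t), h(s)\big) = \sigma_J^2\, C(t, s).

    The macroscopic network dynamics are then captured by the evolution of the low-dimensional order parameters m and C rather than by the individual units.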

    Relative entropy minimizing noisy non-linear neural network to approximate stochastic processes

    A method is provided for designing and training noise-driven recurrent neural networks as models of stochastic processes. The method unifies and generalizes two previously separate modeling approaches, Echo State Networks (ESN) and Linear Inverse Modeling (LIM), under the common principle of relative entropy minimization. The power of the new method is demonstrated on a stochastic approximation of the El Niño phenomenon studied in climate research.
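
    One concrete way to read "relative entropy minimization" here (an interpretation in generic notation, not a statement of the paper's specific construction): fit the parameters \theta of a noise-driven recurrent model, e.g. dx = F_\theta(x)\,dt + \Sigma\, dW, by minimizing the Kullback-Leibler divergence between the trajectory distribution of the data-generating process and that of the model,

        \theta^* = \arg\min_\theta\; D_{\mathrm{KL}}\big(P_{\mathrm{data}} \,\|\, P_\theta\big),

    which, when only samples from P_{\mathrm{data}} are available, amounts to maximum-likelihood fitting of the model to observed trajectories; the abstract's claim is that the ESN and LIM training procedures arise as special cases of such a principle.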