STEER: Simple Temporal Regularization For Neural ODEs
Training Neural Ordinary Differential Equations (ODEs) is often
computationally expensive. Indeed, computing the forward pass of such models
involves solving an ODE which can become arbitrarily complex during training.
Recent works have shown that regularizing the dynamics of the ODE can partially
alleviate this. In this paper we propose a new regularization technique:
randomly sampling the end time of the ODE during training. The proposed
regularization is simple to implement, has negligible overhead and is effective
across a wide variety of tasks. Further, the technique is orthogonal to several
other methods proposed to regularize the dynamics of ODEs and as such can be
used in conjunction with them. We show through experiments on normalizing
flows, time series models and image recognition that the proposed
regularization can significantly decrease training time and even improve
performance over baseline models.
Comment: NeurIPS 202
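The idea can be sketched in a few lines: instead of always integrating to a fixed end time t1, each training step samples the end time uniformly from an interval around t1. The fixed-step Euler integrator and toy dynamics below are illustrative stand-ins for the paper's training setup, not its actual implementation.

```python
import random

def integrate_euler(f, x0, t1, steps=20):
    """Fixed-step Euler integration of dx/dt = f(x) from t = 0 to t = t1."""
    x, dt = x0, t1 / steps
    for _ in range(steps):
        x = x + dt * f(x)
    return x

def steer_end_time(t1, b):
    """STEER-style regularization: sample the ODE end time uniformly
    from [t1 - b, t1 + b] (with b < t1) instead of using t1 directly."""
    return random.uniform(t1 - b, t1 + b)

# Toy dynamics dx/dt = -x, whose exact solution is x0 * exp(-t).
f = lambda x: -x

# During training, each forward pass would integrate to a perturbed end time:
random.seed(0)
t_end = steer_end_time(1.0, 0.5)
x_final = integrate_euler(f, 2.0, t_end)
```

Because only the integration endpoint changes, the technique adds essentially no overhead and composes with other dynamics regularizers.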
Closed-form continuous-time neural networks
Continuous-time neural networks are a class of machine learning systems that can tackle representation learning on spatiotemporal decision-making tasks. These models are typically represented by continuous differential equations. However, their expressive power when they are deployed on computers is bottlenecked by numerical differential equation solvers. This limitation has notably slowed down the scaling and understanding of numerous natural physical phenomena such as the dynamics of nervous systems. Ideally, we would circumvent this bottleneck by solving the given dynamical system in closed form. This is known to be intractable in general. Here, we show that it is possible to closely approximate the interaction between neurons and synapses—the building blocks of natural and artificial neural networks—constructed by liquid time-constant networks efficiently in closed form. To this end, we compute a tightly bounded approximation of the solution of an integral appearing in liquid time-constant dynamics that has had no known closed-form solution so far. This closed-form solution impacts the design of continuous-time and continuous-depth neural models. For instance, since time appears explicitly in closed form, the formulation relaxes the need for complex numerical solvers. Consequently, we obtain models that are between one and five orders of magnitude faster in training and inference compared with differential equation-based counterparts. More importantly, in contrast to ordinary differential equation-based continuous networks, closed-form networks can scale remarkably well compared with other deep learning instances. Lastly, as these models are derived from liquid networks, they show good performance in time-series modelling compared with advanced recurrent neural network models.
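A toy linear ODE illustrates why a closed-form solution removes the solver bottleneck (this is an illustration of the general principle, not the liquid time-constant derivation itself): the numerical route must loop over many small steps, while the closed form evaluates a single expression at any time t.

```python
import math

def euler_solve(x0, a, A, t, steps=1000):
    """Numerically integrate dx/dt = -a * (x - A) with fixed-step Euler:
    cost grows with the number of steps."""
    x, dt = x0, t / steps
    for _ in range(steps):
        x = x + dt * (-a * (x - A))
    return x

def closed_form(x0, a, A, t):
    """Exact solution x(t) = A + (x0 - A) * exp(-a * t):
    one expression, no solver loop, valid for any t directly."""
    return A + (x0 - A) * math.exp(-a * t)

x_num = euler_solve(1.0, 2.0, 0.5, 3.0)
x_exact = closed_form(1.0, 2.0, 0.5, 3.0)
```

Since time appears explicitly in the closed form, evaluating the state at an arbitrary t is a constant-time operation, which is the source of the reported speedups.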
Closed-form Continuous-Depth Models
Continuous-depth neural models, where the derivative of the model's hidden
state is defined by a neural network, have enabled strong sequential data
processing capabilities. However, these models rely on advanced numerical
differential equation (DE) solvers resulting in a significant overhead both in
terms of computational cost and model complexity. In this paper, we present a
new family of models, termed Closed-form Continuous-depth (CfC) networks, that
are simple to describe and at least one order of magnitude faster while
exhibiting equally strong modeling abilities compared to their ODE-based
counterparts. The models are derived from the analytical closed-form solution
of an expressive subset of time-continuous models, thus alleviating the need
for complex DE solvers altogether. In our experimental evaluations,
we demonstrate that CfC networks outperform advanced, recurrent models over a
diverse set of time-series prediction tasks, including those with long-term
dependencies and irregularly sampled data. We believe our findings open new
opportunities to train and deploy rich, continuous neural models in
resource-constrained settings, which demand both performance and efficiency.
Comment: 17 pages
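As a rough sketch of the gating structure such a closed-form cell takes, the hidden state at time t is computed by blending two network heads with a time-dependent sigmoid gate, so no ODE solver is invoked. The single-layer tanh heads below are hypothetical placeholders, not the paper's architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cfc_cell(x, t, f_head, g_head, h_head):
    """One CfC-style update: a time-dependent sigmoid gate blends two heads,
    giving the state at time t in closed form.
    Roughly: x(t) = sigma(-f(x) * t) * g(x) + (1 - sigma(-f(x) * t)) * h(x)."""
    gate = sigmoid(-f_head(x) * t)
    return gate * g_head(x) + (1.0 - gate) * h_head(x)

rng = np.random.default_rng(0)
W_f, W_g, W_h = (rng.standard_normal((4, 4)) * 0.1 for _ in range(3))
f = lambda x: np.tanh(x @ W_f)   # hypothetical one-layer heads
g = lambda x: np.tanh(x @ W_g)
h = lambda x: np.tanh(x @ W_h)

x0 = rng.standard_normal(4)
x_t = cfc_cell(x0, t=0.5, f_head=f, g_head=g, h_head=h)
```

Note that at t = 0 the gate is exactly 0.5, so the state is the average of the two heads; as t grows, the gate interpolates between them depending on the sign of f's output.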
Deep Latent State Space Models for Time-Series Generation
Methods based on ordinary differential equations (ODEs) are widely used to
build generative models of time-series. In addition to high computational
overhead due to explicitly computing hidden states recurrence, existing
ODE-based models fall short in learning sequence data with sharp transitions -
common in many real-world systems - due to numerical challenges during
optimization. In this work, we propose LS4, a generative model for sequences
with latent variables evolving according to a state space ODE to increase
modeling capacity. Inspired by recent deep state space models (S4), we achieve
speedups by leveraging a convolutional representation of LS4 which bypasses the
explicit evaluation of hidden states. We show that LS4 significantly
outperforms previous continuous-time generative models in terms of marginal
distribution, classification, and prediction scores on real-world datasets in
the Monash Forecasting Repository, and is capable of modeling highly stochastic
data with sharp temporal transitions. LS4 sets the state of the art for
continuous-time latent generative models, with significant improvements in mean
squared error and tighter variational lower bounds on irregularly-sampled
datasets, while also being 100x faster than other baselines on long sequences.
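The convolutional trick inherited from S4 can be sketched on a plain discrete state space model: unrolling the recurrence x_k = A x_{k-1} + B u_k, y_k = C x_k shows that the output is a 1-D convolution of the input with the kernel (CB, CAB, CA^2 B, ...), so hidden states never need to be materialized step by step. A minimal numpy sketch of this equivalence (not the LS4 parameterization):

```python
import numpy as np

def ssm_kernel(A, B, C, L):
    """Unrolled SSM kernel K = (CB, CAB, ..., C A^{L-1} B): applying the
    recurrence to an input sequence equals a 1-D convolution with K."""
    K, M = [], B
    for _ in range(L):
        K.append(float(C @ M))
        M = A @ M
    return np.array(K)

def ssm_recurrent(A, B, C, u):
    """Reference path: explicitly step the hidden-state recurrence."""
    x, ys = np.zeros(A.shape[0]), []
    for u_k in u:
        x = A @ x + B * u_k
        ys.append(float(C @ x))
    return np.array(ys)

rng = np.random.default_rng(0)
n, L = 3, 8
A = 0.5 * rng.standard_normal((n, n))
B = rng.standard_normal(n)
C = rng.standard_normal(n)
u = rng.standard_normal(L)

y_rec = ssm_recurrent(A, B, C, u)                     # O(L) sequential steps
y_conv = np.convolve(u, ssm_kernel(A, B, C, L))[:L]   # one convolution
```

Both paths produce the same outputs; the convolutional one is what enables the speedups on long sequences, since the convolution can be computed in parallel (e.g. via FFT).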
Vid-ODE: Continuous-Time Video Generation with Neural Ordinary Differential Equation
Video generation models often operate under the assumption of fixed frame
rates, which leads to suboptimal performance when it comes to handling flexible
frame rates (e.g., increasing the frame rate of the more dynamic portion of the
video as well as handling missing video frames). To resolve the restricted
nature of existing video generation models' ability to handle arbitrary
timesteps, we propose continuous-time video generation by combining neural ODE
(Vid-ODE) with pixel-level video processing techniques. Using ODE-ConvGRU, a
convolutional version of the recently proposed neural ODE that enables
learning continuous-time dynamics, as its encoder, Vid-ODE can learn the
spatio-temporal dynamics of input videos at flexible frame rates. The decoder
integrates the learned dynamics function to synthesize video frames at any
given timestep, where a pixel-level composition technique is used to
maintain the sharpness of individual frames. With extensive experiments on four
real-world video datasets, we verify that the proposed Vid-ODE outperforms
state-of-the-art approaches under various video generation settings, both
within the trained time range (interpolation) and beyond the range
(extrapolation). To the best of our knowledge, Vid-ODE is the first work
successfully performing continuous-time video generation using real-world
videos.
Comment: Accepted to AAAI 2021, 22 pages
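The property that matters here, decoding frames at arbitrary continuous timesteps by integrating a learned latent dynamics, can be sketched as follows. The dynamics function and decoder below are hypothetical stand-ins for the learned ODE-ConvGRU and pixel-level components, not the paper's architecture.

```python
import numpy as np

def decode_at_times(z0, dynamics, decode, times, dt=0.01):
    """Integrate a latent state forward with Euler steps and decode a frame
    at each requested (possibly irregular) timestep -- the key property that
    lets continuous-time models handle arbitrary frame rates."""
    frames, z, t = [], z0.copy(), 0.0
    for t_target in sorted(times):
        while t < t_target:
            z = z + dt * dynamics(z)
            t += dt
        frames.append(decode(z))
    return frames

# Hypothetical stand-ins for the learned components:
dynamics = lambda z: -0.1 * z              # latent drift
decode = lambda z: np.clip(z, 0.0, 1.0)    # maps latent to a "frame"

z0 = np.full((4, 4), 0.8)                  # 4x4 toy frame latent
frames = decode_at_times(z0, dynamics, decode, times=[0.3, 0.5, 1.2])
```

Because the requested times need not lie on a fixed grid, the same machinery covers interpolation (inside the observed range) and extrapolation (beyond it).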
Learning Continuous Network Emerging Dynamics from Scarce Observations via Data-Adaptive Stochastic Processes
Learning network dynamics from the empirical structure and spatio-temporal
observation data is crucial to revealing the interaction mechanisms of complex
networks in a wide range of domains. However, most existing methods learn only
the dynamic behaviors generated by a specific ordinary differential equation
instance, making them ineffective for new ones, and generally require dense
observations. Such observations, especially of emerging network dynamics, are
usually difficult to obtain, which hinders model learning. Therefore, how to
learn accurate network dynamics
with sparse, irregularly-sampled, partial, and noisy observations remains a
fundamental challenge. We introduce Neural ODE Processes for Network Dynamics
(NDP4ND), a new class of stochastic processes governed by stochastic
data-adaptive network dynamics, to overcome the challenge and learn continuous
network dynamics from scarce observations. Extensive experiments conducted on
various network dynamics in ecological population evolution, phototaxis
movement, brain activity, epidemic spreading, and real-world empirical systems,
demonstrate that the proposed method has excellent data adaptability and
computational efficiency, and can adapt to unseen network emerging dynamics,
producing accurate interpolation and extrapolation while reducing the ratio of
required observation data to only about 6% and improving the learning speed
for new dynamics by three orders of magnitude.
Comment: preprint
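A generic neural-process-style pattern (not the NDP4ND architecture) hints at how such models adapt to sparse, irregular observations: a permutation-invariant aggregation summarizes the context set, and predictions at new times condition on that summary. The feature map and decoder below are hypothetical toys chosen only to make the pattern concrete.

```python
import numpy as np

def aggregate_context(ts, ys):
    """Permutation-invariant context encoding (mean aggregation) -- the
    neural-process ingredient that lets one model adapt to sparse,
    irregularly sampled observations of varying size."""
    feats = np.stack([np.array([t, y]) for t, y in zip(ts, ys)])
    return feats.mean(axis=0)

def predict(t, r):
    """Hypothetical decoder: condition a simple exponential-decay model on
    the aggregated context r (here r just sets the scale and decay rate)."""
    scale, rate = abs(r[1]) + 1e-6, abs(r[0]) + 1e-6
    return scale * np.exp(-rate * t)

# Sparse, irregular context observations of a decaying process:
ts = [0.1, 0.7, 1.5]
ys = [0.9, 0.5, 0.2]
r = aggregate_context(ts, ys)
y_hat = predict(2.0, r)    # query at an unobserved time
```

In NDP4ND-style models the decoder would instead be a (stochastic, data-adaptive) network dynamics model, but the adapt-from-a-small-context workflow is the same.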