2,820 research outputs found
Augmented Neural ODEs
We show that Neural Ordinary Differential Equations (ODEs) learn
representations that preserve the topology of the input space and prove that
this implies the existence of functions Neural ODEs cannot represent. To
address these limitations, we introduce Augmented Neural ODEs which, in
addition to being more expressive models, are empirically more stable,
generalize better, and have a lower computational cost than Neural ODEs.
Comment: NeurIPS camera-ready; additional experiments, additional datasets, and discussion of the relation to other models.
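The augmentation idea can be sketched in a few lines (a hypothetical NumPy-only toy, not the paper's implementation): the input is zero-padded with extra dimensions before the ODE flow is applied, and only the original dimensions are read out at the end. The vector field here is a fixed random linear map standing in for a learned network.

```python
import numpy as np

def f(z, W):
    """Vector field dz/dt = W @ z (a stand-in for a learned network)."""
    return W @ z

def anode_flow(x, p=2, steps=100, t1=1.0, seed=0):
    """Augment x with p zero dimensions, integrate the flow, read out x-dims."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    z = np.concatenate([x, np.zeros(p)])            # augmented state a(0) = 0
    W = 0.1 * rng.standard_normal((d + p, d + p))   # fixed "parameters"
    h = t1 / steps
    for _ in range(steps):                          # forward Euler integration
        z = z + h * f(z, W)
    return z[:d]                                    # project back to input dims

out = anode_flow(np.array([1.0, -0.5]))
```

Because trajectories evolve in the higher-dimensional augmented space, the flow on the original coordinates is no longer constrained to be a homeomorphism of the input space, which is the source of the extra expressiveness the abstract describes.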
On Second Order Behaviour in Augmented Neural ODEs.
Neural Ordinary Differential Equations (NODEs) are a new class of models that
transform data continuously through infinite-depth architectures. The
continuous nature of NODEs has made them particularly suitable for learning the
dynamics of complex physical systems. While previous work has mostly been
focused on first order ODEs, the dynamics of many systems, especially in
classical physics, are governed by second order laws. In this work, we take a
closer look at Second Order Neural ODEs (SONODEs). We show how the adjoint
sensitivity method can be extended to SONODEs and prove that an alternative
first order optimisation method is computationally more efficient. Furthermore,
we extend the theoretical understanding of the broader class of Augmented NODEs
(ANODEs) by showing they can also learn higher order dynamics, but at the cost
of interpretability. This indicates that the advantages of ANODEs go beyond the
extra space offered by the augmented dimensions, as originally thought.
Finally, we compare SONODEs and ANODEs on synthetic and real dynamical systems
and demonstrate that the inductive biases of the former generally result in
faster training and better performance.
Comment: Contains 27 pages, 14 figures.
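The reduction that lets first-order ODE machinery handle second-order dynamics is the standard one: rewrite x'' = f(x, x') as a coupled first-order system x' = v, v' = f(x, v). A minimal sketch, with a damped harmonic oscillator in place of a learned field:

```python
def accel(x, v, k=1.0, c=0.1):
    """Second-order law x'' = -k x - c v (damped harmonic oscillator)."""
    return -k * x - c * v

def integrate(x0, v0, t1=10.0, steps=10_000):
    """Forward Euler on the equivalent first-order system (x, v)."""
    h = t1 / steps
    x, v = x0, v0
    for _ in range(steps):
        x, v = x + h * v, v + h * accel(x, v)
    return x, v

x, v = integrate(1.0, 0.0)
```

In a SONODE the function `accel` would be a neural network; the same reduction is what allows the adjoint sensitivity method for first-order NODEs to be reused, as the abstract discusses.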
Neural ODEs with stochastic vector field mixtures
It was recently shown that neural ordinary differential equation models
cannot solve fundamental and seemingly straightforward tasks even with
high-capacity vector field representations. This paper introduces two other
fundamental tasks to the set that baseline methods cannot solve, and proposes
mixtures of stochastic vector fields as a model class that is capable of
solving these essential problems. Dynamic vector field selection is of critical
importance for our model, and our approach is to propagate component
uncertainty over the integration interval with a technique based on forward
filtering. We also formalise several loss functions that encourage desirable
properties on the trajectory paths, and of particular interest are those that
directly encourage fewer expected function evaluations. Experimentally, we
demonstrate that our model class is capable of capturing the natural dynamics
of human behaviour, a notoriously volatile application area. Baseline
approaches cannot adequately model this problem.
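One simple reading of a vector-field mixture can be sketched as follows (a hypothetical toy: the mixture weights are held fixed, whereas the paper propagates component uncertainty along the trajectory via forward filtering). The state is advanced under a probability-weighted combination of candidate fields:

```python
import numpy as np

# Two hand-picked candidate vector fields (illustrative, not learned).
fields = [
    lambda z: np.array([-z[1], z[0]]),   # rotation field
    lambda z: -0.5 * z,                  # contraction field
]

def mixture_step(z, pi, h):
    """One Euler step under the pi-weighted mixture of vector fields."""
    dz = sum(p * f(z) for p, f in zip(pi, fields))
    return z + h * dz

z = np.array([1.0, 0.0])
pi = np.array([0.7, 0.3])                # fixed component probabilities
for _ in range(1000):                    # integrate over t in [0, 10]
    z = mixture_step(z, pi, h=0.01)
```

Updating `pi` along the trajectory from observations, rather than fixing it, is where the filtering-based component selection described in the abstract would enter.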
The performance of approximating ordinary differential equations by neural nets
The dynamics of many systems are described by ordinary differential equations (ODEs). Solving ODEs with standard methods (i.e. numerical integration) requires a large amount of computing time but only a small amount of storage memory. For some applications, e.g. short-term weather forecasting or real-time robot control, long computation times are prohibitive. Is there a method that uses less computing time (at a cost in other respects, e.g. memory), so that the computation of ODEs becomes faster? We discuss this question under the assumption that the alternative is a neural network trained on the ODE dynamics, and we compare both methods at the same approximation error. The comparison is carried out with two different errors. First, we use the standard error, which measures the difference between the approximation and the solution of the ODE but is hard to characterize. In many cases, however, such as physics engines used in computer games, the shape of the approximation curve matters rather than its exact values. We therefore introduce a subjective error based on the Total Least Square Error (TLSE), which gives more consistent results. For the final performance comparison, we calculate the optimal resource usage of the neural network and evaluate it as a function of the resolution of the interpolation points and the inter-point distance. Our conclusion yields a method to evaluate where neural nets are advantageous over numerical ODE integration and where they are not.
Index Terms—ODE, neural nets, Euler method, approximation complexity, storage optimization
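The compute/memory trade-off at the heart of this comparison can be sketched directly (an assumed toy problem, x' = -x with known solution e^{-t}, not the paper's benchmark): forward Euler pays one vector-field evaluation per step at query time, while a precomputed table pays storage once and answers queries with a cheap interpolation.

```python
import numpy as np

def euler_solve(x0, t, steps=1000):
    """Solve x' = -x by forward Euler: one f-evaluation per step."""
    h, x = t / steps, x0
    for _ in range(steps):
        x += h * (-x)
    return x

# Precompute once (storage cost: 501 floats), then answer queries by
# linear interpolation instead of re-integrating.
ts = np.linspace(0.0, 5.0, 501)
table = np.array([euler_solve(1.0, t) for t in ts])

def lookup(t):
    return np.interp(t, ts, table)

err = abs(lookup(2.0) - np.exp(-2.0))   # small: grid is dense, ODE is smooth
```

The interesting regime, as the abstract notes, is when the table (or a trained network compressing it) is coarse: the query cost stays constant while the error grows with the inter-point distance, which is exactly the resolution-versus-storage trade-off being evaluated.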