9,884 research outputs found
Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data
Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used
in the machine learning and dynamical systems literature to represent complex
dynamical or sequential relationships between variables. More recently, as deep
learning models have become more common, RNNs have been used to forecast
increasingly complicated systems. Dynamical spatio-temporal processes represent
a class of complex systems that can potentially benefit from these types of
models. Although the RNN literature is expansive and highly developed,
uncertainty quantification is often ignored. Even when considered, the
uncertainty is generally quantified without the use of a rigorous framework,
such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a
more formal framework while maintaining the forecast accuracy that makes these
models appealing, by presenting a Bayesian RNN model for nonlinear
spatio-temporal forecasting. Additionally, we make simple modifications to the
basic RNN to help accommodate the unique nature of nonlinear spatio-temporal
data. The proposed model is applied to a Lorenz simulation and two real-world
nonlinear spatio-temporal forecasting applications
Predictive-State Decoders: Encoding the Future into Recurrent Networks
Recurrent neural networks (RNNs) are a vital modeling technique that rely on
internal states learned indirectly by optimization of a supervised,
unsupervised, or reinforcement training loss. RNNs are used to model dynamic
processes that are characterized by underlying latent states whose form is
often unknown, precluding its analytic representation inside an RNN. In the
Predictive-State Representation (PSR) literature, latent state processes are
modeled by an internal state representation that directly models the
distribution of future observations, and most recent work in this area has
relied on explicitly representing and targeting sufficient statistics of this
probability distribution. We seek to combine the advantages of RNNs and PSRs by
augmenting existing state-of-the-art recurrent neural networks with
Predictive-State Decoders (PSDs), which add supervision to the network's
internal state representation to target predicting future observations.
Predictive-State Decoders are simple to implement and easily incorporated into
existing training pipelines via additional loss regularization. We demonstrate
the effectiveness of PSDs with experimental results in three different domains:
probabilistic filtering, Imitation Learning, and Reinforcement Learning. In
each, our method improves statistical performance of state-of-the-art recurrent
baselines and does so with fewer iterations and less data.Comment: NIPS 201
Data-Driven Forecasting of High-Dimensional Chaotic Systems with Long Short-Term Memory Networks
We introduce a data-driven forecasting method for high-dimensional chaotic
systems using long short-term memory (LSTM) recurrent neural networks. The
proposed LSTM neural networks perform inference of high-dimensional dynamical
systems in their reduced order space and are shown to be an effective set of
nonlinear approximators of their attractor. We demonstrate the forecasting
performance of the LSTM and compare it with Gaussian processes (GPs) in time
series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation
and a prototype climate model. The LSTM networks outperform the GPs in
short-term forecasting accuracy in all applications considered. A hybrid
architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is
proposed to ensure convergence to the invariant measure. This novel hybrid
method is fully data-driven and extends the forecasting capabilities of LSTM
networks.Comment: 31 page
Training Echo State Networks with Regularization through Dimensionality Reduction
In this paper we introduce a new framework to train an Echo State Network to
predict real valued time-series. The method consists in projecting the output
of the internal layer of the network on a space with lower dimensionality,
before training the output layer to learn the target task. Notably, we enforce
a regularization constraint that leads to better generalization capabilities.
We evaluate the performances of our approach on several benchmark tests, using
different techniques to train the readout of the network, achieving superior
predictive performance when using the proposed framework. Finally, we provide
an insight on the effectiveness of the implemented mechanics through a
visualization of the trajectory in the phase space and relying on the
methodologies of nonlinear time-series analysis. By applying our method on well
known chaotic systems, we provide evidence that the lower dimensional embedding
retains the dynamical properties of the underlying system better than the
full-dimensional internal states of the network
- …