Search CORE

9,884 research outputs found

Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data

Author: McDermott Patrick L.
Wikle Christopher K.
Publication venue
Publication date: 06/02/2018
Field of study

Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. More recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Predictive-State Decoders: Encoding the Future into Recurrent Networks

Author: Venkatraman Arun
Rhinehart Nicholas
Sun Wen
Pinto Lerrel
Hebert Martial
Boots Byron
Kitani Kris M.
Bagnell J. Andrew
Publication venue
Publication date: 05/07/2017
Field of study

Recurrent neural networks (RNNs) are a vital modeling technique that rely on internal states learned indirectly by optimization of a supervised, unsupervised, or reinforcement training loss. RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding its analytic representation inside an RNN. In the Predictive-State Representation (PSR) literature, latent state processes are modeled by an internal state representation that directly models the distribution of future observations, and most recent work in this area has relied on explicitly representing and targeting sufficient statistics of this probability distribution. We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations. Predictive-State Decoders are simple to implement and easily incorporated into existing training pipelines via additional loss regularization. We demonstrate the effectiveness of PSDs with experimental results in three different domains: probabilistic filtering, Imitation Learning, and Reinforcement Learning. In each, our method improves statistical performance of state-of-the-art recurrent baselines and does so with fewer iterations and less data.Comment: NIPS 201

arXiv.org e-Print Archive

Dryad Digital Repository (Duke University)

FigShare

Data-Driven Forecasting of High-Dimensional Chaotic Systems with Long Short-Term Memory Networks

Author: Byeon Wonmin
Koumoutsakos Petros
Sapsis Themistoklis P.
Vlachas Pantelis R.
Wan Zhong Y.
Publication venue: 'The Royal Society'
Publication date: 01/05/2018
Field of study

We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.Comment: 31 page

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Training Echo State Networks with Regularization through Dimensionality Reduction

Author: Bianchi Filippo Maria
Jenssen Robert
Løkse Sigurd
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 16/08/2016
Field of study

In this paper we introduce a new framework to train an Echo State Network to predict real valued time-series. The method consists in projecting the output of the internal layer of the network on a space with lower dimensionality, before training the output layer to learn the target task. Notably, we enforce a regularization constraint that leads to better generalization capabilities. We evaluate the performances of our approach on several benchmark tests, using different techniques to train the readout of the network, achieving superior predictive performance when using the proposed framework. Finally, we provide an insight on the effectiveness of the implemented mechanics through a visualization of the trajectory in the phase space and relying on the methodologies of nonlinear time-series analysis. By applying our method on well known chaotic systems, we provide evidence that the lower dimensional embedding retains the dynamical properties of the underlying system better than the full-dimensional internal states of the network

arXiv.org e-Print Archive

Crossref

Munin - Open Research Archive

NORA - Norwegian Open Research Archives