333 research outputs found

    Short-term Memory of Deep RNN

    The extension of deep learning towards temporal data processing is gaining increasing research interest. In this paper we investigate the properties of state dynamics developed in successive levels of deep recurrent neural networks (RNNs) in terms of short-term memory abilities. Our results reveal interesting insights that shed light on the nature of layering as a factor of RNN design. Notably, higher layers in a hierarchically organized RNN architecture turn out to be inherently biased towards longer memory spans, even prior to training of the recurrent connections. Moreover, in the context of the Reservoir Computing framework, our analysis also points out the benefit of a layered recurrent organization as an efficient approach to improving the memory skills of reservoir models.
    Comment: This is a pre-print (pre-review) version of the paper accepted for presentation at the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Bruges (Belgium), 25-27 April 2018.
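
    The memory-span claim above can be probed with the standard memory-capacity measure. The following is a minimal sketch, not code from the paper: it stacks untrained tanh reservoirs (layer l driven by layer l-1), trains only a linear ridge readout to recall delayed inputs, and reports the memory capacity of each layer. Reservoir size, spectral radius, input scaling, and the ridge regularizer are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def reservoir_layer(n, rho=0.9, seed=None):
    """Random, untrained recurrent weight matrix rescaled to spectral radius rho."""
    W = np.random.default_rng(seed).uniform(-1, 1, (n, n))
    return W * (rho / max(abs(np.linalg.eigvals(W))))

def run_deep_esn(u, n=100, layers=3):
    """Drive a stack of untrained tanh reservoirs; layer l is fed by layer l-1."""
    Ws = [reservoir_layer(n, seed=l) for l in range(layers)]
    Wins = [np.random.default_rng(100 + l).uniform(-0.1, 0.1, (n, n if l else 1))
            for l in range(layers)]
    states = [np.zeros(n) for _ in range(layers)]
    X = np.zeros((len(u), layers, n))
    for t, ut in enumerate(u):
        inp = np.atleast_1d(ut)
        for l in range(layers):
            states[l] = np.tanh(Wins[l] @ inp + Ws[l] @ states[l])
            inp = states[l]
        X[t] = np.stack(states)
    return X

def memory_capacity(x, u, max_delay=40, washout=100):
    """MC = sum over delays k of the squared correlation between u(t-k)
    and a ridge-regression readout of the reservoir state."""
    mc = 0.0
    for k in range(1, max_delay + 1):
        Xk, yk = x[washout:], u[washout - k:len(u) - k]
        w = np.linalg.solve(Xk.T @ Xk + 1e-6 * np.eye(Xk.shape[1]), Xk.T @ yk)
        mc += np.corrcoef(Xk @ w, yk)[0, 1] ** 2
    return mc

u = rng.uniform(-1, 1, 3000)
X = run_deep_esn(u)
for l in range(X.shape[1]):
    print(f"layer {l}: MC ~ {memory_capacity(X[:, l, :], u):.2f}")
```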

    Echo State Networks with Self-Normalizing Activations on the Hyper-Sphere

    Among the various architectures of Recurrent Neural Networks, Echo State Networks (ESNs) emerged due to their simplified and inexpensive training procedure. These networks are known to be sensitive to the setting of hyper-parameters, which critically affect their behaviour. Results show that their performance is usually maximized in a narrow region of hyper-parameter space called the edge of chaos. Finding such a region requires searching hyper-parameter space in a sensible way: hyper-parameter configurations marginally outside this region might yield networks exhibiting fully developed chaos, hence producing unreliable computations. The performance gain due to optimizing hyper-parameters can be studied by considering the memory–nonlinearity trade-off, i.e., the fact that increasing the nonlinear behaviour of the network degrades its ability to remember past inputs, and vice versa. In this paper, we propose a model of ESNs that eliminates the critical dependence on hyper-parameters, resulting in networks that provably cannot enter a chaotic regime and, at the same time, exhibit nonlinear behaviour in phase space characterised by a large memory of past inputs, comparable to that of linear networks. Our contribution is supported by experiments corroborating our theoretical findings, showing that the proposed model displays dynamics that are rich enough to approximate many common nonlinear systems used for benchmarking.
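
    For reference, the abstract's premise (a cheap-to-train network whose behaviour hinges on a few hyper-parameters) can be made concrete with a standard leaky-tanh ESN. This is a generic sketch, not the hyper-spherical model proposed in the paper; `rho` is the spectral radius that is typically tuned toward the edge of chaos, and training reduces to a closed-form ridge regression on the collected states.

```python
import numpy as np

def esn_states(u, n=200, rho=0.95, sigma=0.5, leak=1.0, seed=0):
    """Standard leaky-tanh ESN. Only the linear readout is trained, so the
    reservoir hyper-parameters (spectral radius rho, input scaling sigma,
    leak rate) must be tuned, typically toward rho near 1."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, (n, n))
    W *= rho / max(abs(np.linalg.eigvals(W)))   # set the spectral radius
    Win = rng.uniform(-sigma, sigma, (n, 1))
    x = np.zeros(n)
    X = np.empty((len(u), n))
    for t, ut in enumerate(u):
        x = (1 - leak) * x + leak * np.tanh(Win @ [ut] + W @ x)
        X[t] = x
    return X

def ridge_readout(X, y, beta=1e-6):
    """The 'inexpensive training procedure': closed-form ridge regression
    from collected reservoir states to targets."""
    return np.linalg.solve(X.T @ X + beta * np.eye(X.shape[1]), X.T @ y)
```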

    Modelling spatiotemporal turbulent dynamics with the convolutional autoencoder echo state network

    The spatiotemporal dynamics of turbulent flows are chaotic and difficult to predict. This makes the design of accurate and stable reduced-order models challenging. The overarching objective of this paper is to propose a nonlinear decomposition of the turbulent state for a reduced-order representation of the dynamics. We divide the turbulent flow into a spatial problem and a temporal problem. First, we compute the latent space, which is the manifold on which the turbulent dynamics live (i.e., a numerical approximation of the turbulent attractor). The latent space is found by a series of nonlinear filtering operations, which are performed by a convolutional autoencoder (CAE). The CAE provides the decomposition in space. Second, we predict the time evolution of the turbulent state in the latent space, which is performed by an echo state network (ESN). The ESN provides the decomposition in time. Third, by assembling the CAE and the ESN, we obtain an autonomous dynamical system: the convolutional autoencoder echo state network (CAE-ESN). This is the reduced-order model of the turbulent flow. We test the CAE-ESN on a two-dimensional flow. We show that, after training, the CAE-ESN (i) finds a latent-space representation of the turbulent flow that has less than 1% of the degrees of freedom of the physical space; (ii) predicts the flow time-accurately and statistically in both quasiperiodic and turbulent regimes; (iii) is robust across different flow regimes (Reynolds numbers); and (iv) takes less than 1% of the computational time of solving the governing equations to predict the turbulent flow. This work opens up new possibilities for nonlinear decompositions and reduced-order modelling of turbulent flows from data.
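
    The assembled model runs in closed loop at prediction time. The sketch below shows only that loop, under the assumption that the trained CAE halves (`encode`, `decode`) and the trained ESN one-step map (`esn_step`, carrying its own internal reservoir state) are given; all three names are placeholders, not the authors' API.

```python
import numpy as np

def cae_esn_forecast(q0, encode, decode, esn_step, n_steps):
    """Closed-loop CAE-ESN prediction as described in the abstract:
    encode a flow snapshot into the latent space, evolve the latent state
    autonomously with the ESN, and decode each step back to physical space."""
    z = encode(q0)                   # spatial decomposition: snapshot -> latent vector
    snapshots = []
    for _ in range(n_steps):
        z = esn_step(z)              # temporal decomposition: latent one-step prediction
        snapshots.append(decode(z))  # back to the physical (flow) space
    return np.stack(snapshots)
```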

    Reconstruction, forecasting, and stability of chaotic dynamics from partial data

    The forecasting and computation of the stability of chaotic systems from partial observations are tasks for which traditional equation-based methods may not be suitable. In this computational paper, we propose data-driven methods to (i) infer the dynamics of unobserved (hidden) chaotic variables (full-state reconstruction); (ii) time forecast the evolution of the full state; and (iii) infer the stability properties of the full state. The tasks are performed with long short-term memory (LSTM) networks, which are trained with observations (data) limited to only part of the state: (i) the low-to-high resolution LSTM (LH-LSTM), which takes partial observations as training input and requires access to the full system state when computing the loss; and (ii) the physics-informed LSTM (PI-LSTM), which is designed to combine partial observations with the integral formulation of the dynamical system's evolution equations. First, we derive the Jacobian of the LSTMs. Second, we analyse a chaotic partial differential equation, the Kuramoto-Sivashinsky (KS) equation, and the Lorenz-96 system. We show that the proposed networks can forecast the hidden variables, both time-accurately and statistically. The Lyapunov exponents and covariant Lyapunov vectors, which characterize the stability of the chaotic attractors, are correctly inferred from partial observations. Third, the PI-LSTM outperforms the LH-LSTM by successfully reconstructing the hidden chaotic dynamics when the input dimension is smaller than or comparable to the Kaplan-Yorke dimension of the attractor. This work opens new opportunities for reconstructing the full state, inferring hidden variables, and computing the stability of chaotic systems from partial data.
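
    Once the Jacobian of the trained network is available, the Lyapunov exponents can be computed with the standard QR (Benettin-style) algorithm. The sketch below illustrates that step only; `jacobian(x)` and the one-step map `step(x)` are assumed to come from the trained LSTM (the abstract states the Jacobian is derived) and are placeholders here.

```python
import numpy as np

def lyapunov_exponents(jacobian, step, x0, n_steps, n_exp, dt=1.0):
    """QR algorithm: propagate an orthonormal set of perturbations with the
    system's Jacobian along a trajectory and average the logs of the
    stretching factors (diagonal of R) to estimate the leading exponents."""
    x = x0
    Q = np.linalg.qr(np.random.default_rng(0).normal(size=(len(x0), n_exp)))[0]
    sums = np.zeros(n_exp)
    for _ in range(n_steps):
        Q, R = np.linalg.qr(jacobian(x) @ Q)  # re-orthonormalize the perturbations
        sums += np.log(np.abs(np.diag(R)))    # accumulate local stretching rates
        x = step(x)                           # advance the trajectory one step
    return sums / (n_steps * dt)
```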

    Reconstruction and Parameter Estimation of Dynamical Systems using Neural Networks

    Dynamical systems can be loosely regarded as systems whose dynamics is entirely determined by an evolution function and an initial condition, being therefore completely deterministic and a priori predictable. Nevertheless, their phenomenology is surprisingly rich, including intriguing phenomena such as chaotic dynamics, fractal dimensions and entropy production. In Climate Science, for example, the emergence of chaos currently forbids meteorological forecasts from going beyond fourteen days into the future; building predictive systems that overcome this limitation, at least partially, is therefore of extreme importance, since we live in a fast-changing climate, as proven by recent not-so-extreme-anymore climate phenomena. At the same time, Machine Learning techniques have been widely applied to practically every field of human knowledge starting from approximately ten years ago, when essentially two factors contributed to the so-called rebirth of Deep Learning: the availability of larger datasets, putting us in the era of Big Data, and the improvement of computational power. However, the possibility of applying Neural Networks to chaotic systems has been widely debated, since these models are very data hungry and thus rely on the availability of large datasets, whereas Climate data are often rare and sparse. Moreover, chaotic dynamics should not rely much on the past statistics on which these models are built.
    In this thesis, we explore the possibility of studying dynamical systems, seen as simple proxies of Climate models, using Neural Networks, possibly adding prior knowledge of the underlying physical processes in the spirit of Physics-Informed Neural Networks, aiming at the reconstruction of the Weather (short-term dynamics) and Climate (long-term dynamics) of these dynamical systems, as well as the estimation of unknown parameters from data.
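
    As a concrete illustration of the physics-informed parameter-estimation idea, one simple recipe is to compare finite-difference derivatives of an observed trajectory against the model's right-hand side at candidate parameter values. The sketch below uses the Lorenz-63 system as the dynamical-system proxy; it is an illustrative recipe under those assumptions, not the specific method of the thesis.

```python
import numpy as np

def lorenz_rhs(x, sigma, rho, beta):
    """Lorenz-63 right-hand side, a standard low-dimensional climate proxy."""
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (rho - x[2]) - x[1],
                     x[0] * x[1] - beta * x[2]])

def physics_residual(traj, dt, params):
    """Physics-informed residual: mean squared mismatch between central-difference
    derivatives of the observed trajectory and the model RHS at the candidate
    parameters. Minimizing this over `params` (by grid or gradient search) is one
    simple way to estimate unknown parameters from data."""
    dxdt = (traj[2:] - traj[:-2]) / (2 * dt)                 # central differences
    rhs = np.array([lorenz_rhs(x, *params) for x in traj[1:-1]])
    return np.mean((dxdt - rhs) ** 2)
```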