27,014 research outputs found

Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness

Authors: Aguirre Luis A.; Ribeiro Antônio H.; Schön Thomas B.; Tiels Koen
Publication date: 05/03/2020

The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle, while powerful, might need some refinement to explain recent developments. We refine the concept of exploding gradients by reformulating the problem in terms of the cost function smoothness, which gives insight into higher-order derivatives and the existence of regions with many close local minima. We also clarify the distinction between vanishing gradients and the need for the RNN to learn attractors to fully use its expressive power. Through the lens of these refinements, we shed new light on recent developments in the RNN field, namely stable RNNs and unitary (or orthogonal) RNNs.

Comment: To appear in the Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. PMLR: Volume 108. This paper was previously titled "The trade-off between long-term memory and smoothness for recurrent networks". The current version subsumes all previous versions.
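
The exploding and vanishing gradient problem named in the abstract above can be illustrated with a textbook toy case. The sketch below is not taken from the paper: it assumes a scalar linear recurrence h_t = w * h_{t-1}, and the function name `gradient_through_time` is hypothetical. In this setting the derivative of the final state with respect to the initial state is simply w raised to the number of steps.

```python
def gradient_through_time(w, steps):
    """Return d h_T / d h_0 for the scalar recurrence h_t = w * h_{t-1}.

    Assumption: a linear, one-dimensional RNN chosen only for illustration.
    """
    grad = 1.0
    for _ in range(steps):
        grad *= w  # chain rule: every time step contributes one factor w
    return grad


# |w| < 1 shrinks the gradient, |w| > 1 blows it up, |w| = 1 preserves it.
for w in (0.9, 1.0, 1.1):
    print(w, gradient_through_time(w, 100))
```

The case |w| = 1 is the scalar counterpart of the unitary (or orthogonal) recurrent matrices mentioned in the abstract, which keep the gradient norm constant across time steps.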

arXiv.org e-Print Archive

Pure OAI Repository

Advances in photonic reservoir computing on an integrated platform

Authors: Bienstman Peter; Dambre Joni; Fiers Martin; Schrauwen Benjamin; Van Vaerenbergh Thomas; Vandoorne Kristof; Verstraeten David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

Reservoir computing is a recent approach from the fields of machine learning and artificial neural networks to solve a broad class of complex classification and recognition problems such as speech and image recognition. As is typical for methods from these fields, it involves systems that are trained on examples instead of using an algorithmic approach. It originated as a new training technique for recurrent neural networks in which the network is split into a reservoir that does the 'computation' and a simple readout function. This technique has been among the state of the art. So far, implementations have been mainly software based, but a hardware implementation offers the promise of being low-power and fast. We previously demonstrated with simulations that a network of coupled semiconductor optical amplifiers could also be used for this purpose on a simple classification task. This paper discusses two new developments. First, using an amplifier reservoir on an isolated digit recognition task, we identified the delay between the nodes as the most important design parameter and show that, when optimized and combined with coherence, it even yields better results than classical hyperbolic tangent reservoirs. Second, we discuss the recent advances in photonic reservoir computing with the use of resonator structures such as photonic crystal cavities and ring resonators. Using a network of resonators, feedback of the output to the network, and an appropriate learning rule, periodic signals can be generated in the optical domain. With the right parameters, these resonant structures can also exhibit spiking behaviour.
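
The split between a fixed reservoir and a trained readout described in the abstract above can be sketched in software. The example below is an illustration under stated assumptions, not the photonic system of the paper: it uses a small classical hyperbolic tangent reservoir, a ridge-regression readout, and a toy task (recalling the previous input). All names, sizes, and weight ranges are hypothetical.

```python
import math
import random


def run_reservoir(inputs, w_in, w_res):
    """Drive a fixed random tanh reservoir and collect its states."""
    n = len(w_in)
    state = [0.0] * n
    states = []
    for u in inputs:
        state = [
            math.tanh(w_in[i] * u + sum(w_res[i][j] * state[j] for j in range(n)))
            for i in range(n)
        ]
        states.append(state + [1.0])  # append a constant bias feature
    return states


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train_readout(states, targets, ridge=1e-3):
    """Fit the linear readout by ridge regression (the only trained part)."""
    d = len(states[0])
    gram = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(d)]
    return solve(gram, rhs)


def readout(state, weights):
    """Apply the trained linear readout to one reservoir state."""
    return sum(w * s for w, s in zip(weights, state))


rng = random.Random(0)
n_units = 10  # hypothetical reservoir size
w_in = [rng.uniform(-1, 1) for _ in range(n_units)]
# Small recurrent weights keep the dynamics stable (a heuristic assumption).
w_res = [[rng.uniform(-0.1, 0.1) for _ in range(n_units)] for _ in range(n_units)]

inputs = [rng.uniform(-1, 1) for _ in range(300)]
targets = [0.0] + inputs[:-1]  # toy task: recall the previous input
states = run_reservoir(inputs, w_in, w_res)
weights = train_readout(states, targets)

mse = sum((readout(s, weights) - y) ** 2 for s, y in zip(states, targets)) / len(targets)
baseline = sum(y * y for y in targets) / len(targets)  # error of predicting 0
print("readout MSE:", mse, "baseline:", baseline)
```

Only `train_readout` adjusts any parameters; `w_in` and `w_res` stay at their random initial values, which is what makes a fixed physical medium a candidate for the reservoir role.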

Crossref

Ghent University Academic Bibliography

Recurrent backpropagation and the dynamical approach to adaptive neural computation

Author: Pineda Fernando J.
Publication venue: 'MIT Press - Journals'
Publication date: 01/06/1989

Error backpropagation in feedforward neural network models is a popular learning algorithm that has its roots in nonlinear estimation and optimization. It is being used routinely to calculate error gradients in nonlinear systems with hundreds of thousands of parameters. However, the classical architecture for backpropagation has severe restrictions. The extension of backpropagation to networks with recurrent connections will be reviewed. It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.

Caltech Authors