Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness
The exploding and vanishing gradient problem has been the major conceptual
principle behind most architecture and training improvements in recurrent
neural networks (RNNs) during the last decade. In this paper, we argue that
this principle, while powerful, might need some refinement to explain recent
developments. We refine the concept of exploding gradients by reformulating the
problem in terms of the cost function smoothness, which gives insight into
higher-order derivatives and the existence of regions with many close local
minima. We also clarify the distinction between vanishing gradients and the
need for the RNN to learn attractors to fully use its expressive power. Through
the lens of these refinements, we shed new light on recent developments in the
RNN field, namely stable RNNs and unitary (or orthogonal) RNNs.
Comment: To appear in the Proceedings of the 23rd International Conference on
Artificial Intelligence and Statistics (AISTATS), 2020. PMLR: Volume 108.
This paper was previously titled "The trade-off between long-term memory and
smoothness for recurrent networks". The current version subsumes all previous
versions.
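As a rough illustration of the exploding and vanishing gradient effect this abstract refers to, the sketch below is not taken from the paper. It assumes a purely linear recurrence h_t = W h_{t-1}, where W is a random orthogonal matrix multiplied by a hypothetical `scale` factor. Under that assumption the Jacobian of h_T with respect to h_0 is W^T, and its spectral norm is scale^T: it shrinks for scale < 1, grows for scale > 1, and stays at 1 in the orthogonal case.

```python
import numpy as np


def jacobian_norm(scale, steps, size=8, seed=0):
    """Spectral norm of d h_T / d h_0 for the linear recurrence h_t = W h_{t-1}.

    W is a random orthogonal matrix times `scale` (an illustrative assumption),
    so every singular value of W equals `scale`.
    """
    rng = np.random.default_rng(seed)
    # QR decomposition of a Gaussian matrix yields an orthogonal factor q.
    q, _ = np.linalg.qr(rng.standard_normal((size, size)))
    w = scale * q
    # For a linear recurrence the Jacobian over `steps` steps is W**steps.
    jac = np.linalg.matrix_power(w, steps)
    return np.linalg.norm(jac, 2)


if __name__ == "__main__":
    for scale in (0.9, 1.0, 1.1):
        print(scale, jacobian_norm(scale, steps=50))
```

A real RNN adds a nonlinearity and inputs, so this only shows the linear mechanism; the paper's own analysis in terms of smoothness and attractors goes beyond it.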
Advances in photonic reservoir computing on an integrated platform
Reservoir computing is a recent approach from the fields of machine learning and artificial neural networks to solve a broad class of complex classification and recognition problems such as speech and image recognition. As is typical for methods from these fields, it involves systems that are trained on examples instead of using an algorithmic approach. It originated as a new training technique for recurrent neural networks in which the network is split into a reservoir that does the 'computation' and a simple readout function. This technique has been among the state of the art. So far, implementations have been mainly software-based, but a hardware implementation offers the promise of being low-power and fast. We previously demonstrated with simulations that a network of coupled semiconductor optical amplifiers could also be used for this purpose on a simple classification task. This paper discusses two new developments. First, using an amplifier reservoir on an isolated digit recognition task, we identified the delay between the nodes as the most important design parameter and show that, when optimized and combined with coherence, it even yields better results than classical hyperbolic tangent reservoirs. Second, we discuss the recent advances in photonic reservoir computing with the use of resonator structures such as photonic crystal cavities and ring resonators. Using a network of resonators, feedback of the output to the network, and an appropriate learning rule, periodic signals can be generated in the optical domain. With the right parameters, these resonant structures can also exhibit spiking behaviour.
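The split between a fixed reservoir and a trained readout described in this abstract can be sketched in software as follows. This is a minimal example of a classical hyperbolic tangent reservoir (an echo state network), not the photonic system from the paper. The function names, reservoir size, spectral radius, ridge parameter, and the sine-prediction task are all assumptions chosen for illustration.

```python
import numpy as np


def run_reservoir(inputs, n_nodes=100, spectral_radius=0.9, seed=0):
    """Drive a fixed random tanh reservoir with a 1-D input signal."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=n_nodes)      # fixed input weights
    w = rng.standard_normal((n_nodes, n_nodes))      # fixed recurrent weights
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    states = np.zeros((len(inputs), n_nodes))
    x = np.zeros(n_nodes)
    for t, u in enumerate(inputs):
        x = np.tanh(w @ x + w_in * u)                # reservoir is never trained
        states[t] = x
    return states


def train_readout(states, targets, ridge=1e-4):
    """Fit the linear readout with ridge regression (the only trained part)."""
    n = states.shape[1]
    return np.linalg.solve(states.T @ states + ridge * np.eye(n),
                           states.T @ targets)


# Hypothetical task: predict the next sample of a sine wave.
signal = np.sin(0.1 * np.arange(1000))
states = run_reservoir(signal[:-1])
targets = signal[1:]
washout = 100                                        # discard initial transient
w_out = train_readout(states[washout:], targets[washout:])
pred = states[washout:] @ w_out
mse = np.mean((pred - targets[washout:]) ** 2)
```

Only `w_out` is learned; the reservoir weights stay at their random initial values, which is what makes a fixed physical substrate such as an optical network a candidate for the reservoir.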
Recurrent backpropagation and the dynamical approach to adaptive neural computation
Error backpropagation in feedforward neural network models is a popular learning algorithm that has its roots in nonlinear estimation and optimization. It is being used routinely to calculate error gradients in nonlinear systems with hundreds of thousands of parameters. However, the classical architecture for backpropagation has severe restrictions. The extension of backpropagation to networks with recurrent connections will be reviewed. It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.
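To make the idea of computing error gradients through temporal dynamics concrete, the sketch below uses backpropagation through time on a small discrete-time tanh network. This is a related and widely used technique, not necessarily the recurrent backpropagation formulation reviewed in the paper. The update rule h_t = tanh(W h_{t-1} + u_t), the squared-error loss on the final state, and all function names are assumptions for illustration.

```python
import numpy as np


def forward(w, inputs, h0):
    """Return the list of hidden states [h_0, h_1, ..., h_T]."""
    hs = [h0]
    for u in inputs:
        hs.append(np.tanh(w @ hs[-1] + u))
    return hs


def loss(w, inputs, h0, target):
    """Squared error between the final hidden state and the target."""
    return 0.5 * np.sum((forward(w, inputs, h0)[-1] - target) ** 2)


def bptt_grad(w, inputs, h0, target):
    """Gradient of the loss with respect to w, accumulated backwards in time."""
    hs = forward(w, inputs, h0)
    grad = np.zeros_like(w)
    delta = hs[-1] - target                  # dL/dh_T
    for t in range(len(inputs), 0, -1):
        g = delta * (1.0 - hs[t] ** 2)       # backpropagate through tanh
        grad += np.outer(g, hs[t - 1])       # contribution of step t to dL/dw
        delta = w.T @ g                      # dL/dh_{t-1}
    return grad
```

The backward loop costs about as much as the forward pass, which is the efficiency gain over perturbing each weight separately.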