27,014 research outputs found

    Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness

    The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle, while powerful, might need some refinement to explain recent developments. We refine the concept of exploding gradients by reformulating the problem in terms of the smoothness of the cost function, which gives insight into higher-order derivatives and the existence of regions with many close local minima. We also clarify the distinction between vanishing gradients and the need for the RNN to learn attractors to fully use its expressive power. Through the lens of these refinements, we shed new light on recent developments in the RNN field, namely stable RNNs and unitary (or orthogonal) RNNs.
    Comment: To appear in the Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS), 2020. PMLR: Volume 108. This paper was previously titled "The trade-off between long-term memory and smoothness for recurrent networks". The current version subsumes all previous versions.
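    A minimal numpy sketch (not the authors' code) of the classic picture this paper refines: in a vanilla tanh RNN, the norm of the long-range Jacobian dh_T/dh_0 shrinks or grows with the spectral radius of the recurrent weight matrix, which is the usual statement of vanishing and exploding gradients. The network size, horizon, and radii below are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    n, T = 64, 50                           # hidden size and sequence length (arbitrary)

    def long_range_gradient_norm(spectral_radius):
        W = rng.standard_normal((n, n)) / np.sqrt(n)
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))   # rescale recurrent weights
        h, states = np.zeros(n), []
        for _ in range(T):                  # forward pass of a vanilla tanh RNN
            h = np.tanh(W @ h + 0.1 * rng.standard_normal(n))
            states.append(h)
        J = np.eye(n)                       # accumulate the Jacobian dh_T/dh_t backwards in time
        for h_t in reversed(states):
            J = J @ (np.diag(1.0 - h_t ** 2) @ W)
        return np.linalg.norm(J, 2)         # spectral norm of dh_T/dh_0

    for rho in (0.8, 1.0, 1.3):
        print(f"spectral radius {rho}: ||dh_T/dh_0|| ~ {long_range_gradient_norm(rho):.2e}")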

    Advances in photonic reservoir computing on an integrated platform

    Reservoir computing is a recent approach from the fields of machine learning and artificial neural networks for solving a broad class of complex classification and recognition problems, such as speech and image recognition. As is typical for methods from these fields, it involves systems that are trained on examples rather than programmed algorithmically. It originated as a new training technique for recurrent neural networks in which the network is split into a reservoir that does the `computation' and a simple readout function, and it has been among the state-of-the-art techniques for such networks. Implementations so far have been mainly software based, but a hardware implementation offers the promise of being low-power and fast. We previously demonstrated with simulations that a network of coupled semiconductor optical amplifiers can also be used for this purpose on a simple classification task. This paper discusses two new developments. First, using an amplifier reservoir on an isolated digit recognition task, we identify the delay between the nodes as the most important design parameter and show that, when optimized and combined with coherence, it yields better results than classical hyperbolic tangent reservoirs. Second, we discuss recent advances in photonic reservoir computing using resonator structures such as photonic crystal cavities and ring resonators. With a network of resonators, feedback of the output to the network, and an appropriate learning rule, periodic signals can be generated in the optical domain. With the right parameters, these resonant structures can also exhibit spiking behaviour.
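    A minimal software sketch of the reservoir computing idea described above (a conventional echo state network in numpy, not the photonic hardware discussed in the paper): a fixed random reservoir is driven by the input, and only a linear readout is trained, here by ridge regression on a toy next-sample prediction task. Reservoir size, spectral radius, washout, and ridge strength are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(1)
    n_res, washout, ridge = 200, 100, 1e-6

    u = np.sin(np.arange(2000) * 0.2)            # toy scalar input signal
    y_target = np.roll(u, -1)                    # task: predict the next sample

    w_in = rng.uniform(-0.5, 0.5, n_res)         # fixed input weights
    W = rng.standard_normal((n_res, n_res))
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))    # scale reservoir to spectral radius 0.9

    x, states = np.zeros(n_res), []
    for u_t in u:                                # drive the reservoir; it is never trained
        x = np.tanh(w_in * u_t + W @ x)
        states.append(x.copy())
    X = np.array(states)[washout:]
    Y = y_target[washout:]

    # Linear readout by ridge regression: W_out = (X^T X + lambda I)^-1 X^T Y
    W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)
    print("train MSE:", np.mean((X @ W_out - Y) ** 2))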

    Recurrent backpropagation and the dynamical approach to adaptive neural computation

    Error backpropagation in feedforward neural network models is a popular learning algorithm that has its roots in nonlinear estimation and optimization. It is routinely used to calculate error gradients in nonlinear systems with hundreds of thousands of parameters. However, the classical architecture for backpropagation has severe restrictions. The extension of backpropagation to networks with recurrent connections is reviewed. It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens up applications to a host of problems in system identification and control.
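    As one concrete illustration of computing error gradients through temporal dynamics, here is a minimal backpropagation-through-time sketch for a vanilla tanh RNN with a squared-error loss on the final output only. This is a standard textbook scheme, not necessarily the recurrent backpropagation formulation reviewed in the paper; all sizes and the target value are arbitrary assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    n_in, n_h, T = 3, 16, 20
    W_in = 0.1 * rng.standard_normal((n_h, n_in))
    W_h = 0.1 * rng.standard_normal((n_h, n_h))
    w_out = 0.1 * rng.standard_normal(n_h)

    xs = rng.standard_normal((T, n_in))          # toy input sequence
    target = 1.0                                 # scalar target for the final output

    # Forward pass, storing hidden states for the backward sweep.
    hs = [np.zeros(n_h)]
    for x in xs:
        hs.append(np.tanh(W_in @ x + W_h @ hs[-1]))
    y = w_out @ hs[-1]
    loss = 0.5 * (y - target) ** 2

    # Backward sweep: propagate dL/dh_t from t = T down to 1.
    dW_in, dW_h = np.zeros_like(W_in), np.zeros_like(W_h)
    dh = (y - target) * w_out
    for t in range(T, 0, -1):
        dz = dh * (1.0 - hs[t] ** 2)             # through the tanh nonlinearity
        dW_in += np.outer(dz, xs[t - 1])
        dW_h += np.outer(dz, hs[t - 1])
        dh = W_h.T @ dz                          # pass the gradient to the previous step
    print("loss:", loss, "||dL/dW_h||:", np.linalg.norm(dW_h))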