Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network
Because of their effectiveness in broad practical applications, LSTM networks
have received a wealth of coverage in scientific journals, technical blogs, and
implementation guides. However, in most articles, the inference formulas for
the LSTM network and its parent, RNN, are stated axiomatically, while the
training formulas are omitted altogether. In addition, the technique of
"unrolling" an RNN is routinely presented without justification throughout the
literature. The goal of this paper is to explain the essential RNN and LSTM
fundamentals in a single document. Drawing from concepts in signal processing,
we formally derive the canonical RNN formulation from differential equations.
We then propose and prove a precise statement, which yields the RNN unrolling
technique. We also review the difficulties with training the standard RNN and
address them by transforming the RNN into the "Vanilla LSTM" network through a
series of logical arguments. We provide all equations pertaining to the LSTM
system together with detailed descriptions of its constituent entities. Albeit
unconventional, our choice of notation and the method for presenting the LSTM
system emphasize ease of understanding. As part of the analysis, we identify
new opportunities to enrich the LSTM system and incorporate these extensions
into the Vanilla LSTM network, producing the most general LSTM variant to date.
The target reader has already been exposed to RNNs and LSTM networks through
numerous available resources and is open to an alternative pedagogical
approach. A Machine Learning practitioner seeking guidance for implementing our
new augmented LSTM model in software for experimentation and research will find
the insights and derivations in this tutorial valuable as well.
Comment: 43 pages, 10 figures, 78 references
Does money matter in inflation forecasting?
This paper provides the most comprehensive evidence to date on whether monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of definitions of money, including different methods of aggregation and different collections of included monetary assets. In our forecasting experiment we use two non-linear techniques that are new to macroeconomics: recurrent neural networks and kernel recursive least squares regression. Recurrent neural networks operate with potentially unbounded input memory, while the kernel regression technique is a finite memory predictor. The two methodologies compete to find the best fitting US inflation forecasting models and are then compared to forecasts from a naive random walk model. The best models were non-linear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation.
Deep learning for video game playing
In this article, we review recent Deep Learning advances in the context of
how they have been applied to play different types of video games such as
first-person shooters, arcade games, and real-time strategy games. We analyze
the unique requirements that different game genres pose to a deep learning
system and highlight important open challenges in the context of applying these
machine learning methods to video games, such as general game playing, dealing
with extremely large decision spaces and sparse rewards.