9,754 research outputs found
Conversion of Artificial Recurrent Neural Networks to Spiking Neural Networks for Low-power Neuromorphic Hardware
In recent years the field of neuromorphic low-power systems that consume
orders of magnitude less power has gained significant momentum. However, their
wider use is still hindered by the lack of algorithms that can harness the
strengths of such architectures. While neuromorphic adaptations of
representation learning algorithms are now emerging, efficient processing of
temporal sequences or variable-length inputs remains difficult. Recurrent neural
networks (RNNs) are widely used in machine learning to solve a variety of
sequence learning tasks. In this work we present a train-and-constrain
methodology that enables the mapping of machine-learned (Elman) RNNs onto a
substrate of spiking neurons, while being compatible with the capabilities of
current and near-future neuromorphic systems. This "train-and-constrain" method
consists of first training RNNs using backpropagation through time, then
discretizing the weights and finally converting them to spiking RNNs by
matching the responses of artificial neurons with those of the spiking neurons.
We demonstrate our approach on a natural language processing task
(question classification), for which we show the entire mapping process of
the recurrent layer of the network on IBM's Neurosynaptic System "TrueNorth", a
spike-based digital neuromorphic hardware architecture. TrueNorth imposes
specific constraints on connectivity, neural and synaptic parameters. To
satisfy these constraints, it was necessary to discretize the synaptic weights
and neural activities to 16 levels, and to limit fan-in to 64 inputs. We find
that short synaptic delays are sufficient to implement the dynamical (temporal)
aspect of the RNN in the question classification task. The hardware-constrained
model achieved 74% accuracy in question classification while using less than
0.025% of the cores on one TrueNorth chip, resulting in an estimated power
consumption of ~17 µW.
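
The weight-discretization step of the train-and-constrain method can be illustrated with a minimal sketch. It assumes plain uniform quantization onto 16 evenly spaced levels between the smallest and largest weight; the abstract does not say which quantization scheme was actually used, and the name `quantize_uniform` and the random weight matrix are hypothetical.

```python
import numpy as np

def quantize_uniform(w, levels=16):
    """Snap every entry of w to the nearest of `levels` evenly spaced values
    spanning [w.min(), w.max()] (assumed scheme, not taken from the paper)."""
    w = np.asarray(w, dtype=float)
    lo, hi = w.min(), w.max()
    if hi == lo:
        # All weights are identical, so there is nothing to quantize.
        return w.copy()
    step = (hi - lo) / (levels - 1)
    idx = np.rint((w - lo) / step)  # integer level index in 0..levels-1
    return lo + idx * step

# Hypothetical recurrent weight matrix standing in for a trained RNN.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))
q = quantize_uniform(weights, levels=16)
```

With this scheme the rounding error per weight is at most half the spacing between levels, which gives a rough idea of how much precision a 16-level constraint costs.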
Accelerating Eulerian Fluid Simulation With Convolutional Networks
Efficient simulation of the Navier-Stokes equations for fluid flow is a long-standing
problem in applied mathematics, for which state-of-the-art methods
require large compute resources. In this work, we propose a data-driven
approach that combines the approximation power of deep learning with the
precision of standard solvers to obtain fast and highly realistic simulations.
Our method solves the incompressible Euler equations using the standard
operator splitting method, in which a large sparse linear system with many free
parameters must be solved. We use a Convolutional Network with a highly
tailored architecture, trained using a novel unsupervised learning framework to
solve the linear system. We present real-time 2D and 3D simulations that
outperform recently proposed data-driven methods; the obtained results are
realistic and show good generalization properties.
Comment: Significant revision
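
For orientation, the standard operator-splitting scheme for the incompressible Euler equations can be written as follows. This is the textbook form, not necessarily the exact discretization used in the paper: the symbols (velocity u, pressure p, density rho, body acceleration f, time step Delta t) are assumed notation, and the explicit advection step is a simplification. The second line is the large sparse linear system that the abstract refers to.

```latex
% Assumed notation: u = velocity, p = pressure, \rho = density,
% f = body acceleration, \Delta t = time step.
\begin{align}
\mathbf{u}^{*} &= \mathbf{u}^{n}
  + \Delta t \left( -(\mathbf{u}^{n} \cdot \nabla)\,\mathbf{u}^{n} + \mathbf{f} \right), \\
\nabla^{2} p &= \frac{\rho}{\Delta t}\, \nabla \cdot \mathbf{u}^{*}, \\
\mathbf{u}^{n+1} &= \mathbf{u}^{*} - \frac{\Delta t}{\rho}\, \nabla p .
\end{align}
```

Taking the divergence of the third line and substituting the second gives a divergence-free velocity at step n+1, which is the incompressibility constraint.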
Reduced order modeling of fluid flows: Machine learning, Kolmogorov barrier, closure modeling, and partitioning
In this paper, we put forth a long short-term memory (LSTM) nudging framework
for the enhancement of reduced order models (ROMs) of fluid flows utilizing
noisy measurements. We build on the fact that in a realistic application, there
are uncertainties in initial conditions, boundary conditions, model parameters,
and/or field measurements. Moreover, conventional nonlinear ROMs based on
Galerkin projection (GROMs) suffer from imperfection and solution instabilities
due to the modal truncation, especially for advection-dominated flows with slow
decay in the Kolmogorov width. In the presented LSTM-Nudge approach, we fuse
forecasts from a combination of imperfect GROM and uncertain state estimates,
with sparse Eulerian sensor measurements to provide more reliable predictions
in a dynamical data assimilation framework. We illustrate the idea with the
viscous Burgers problem, as a benchmark test bed with quadratic nonlinearity
and Laplacian dissipation. We investigate the effects of measurement noise and
state estimate uncertainty on the performance of the LSTM-Nudge approach. We
also demonstrate that it can sufficiently handle different levels of temporal
and spatial measurement sparsity. This first step in our assessment of the
proposed model shows that the LSTM nudging could represent a viable real-time
predictive tool in emerging digital twin systems.
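
The basic idea of nudging, correcting a model forecast toward sparse sensor readings, can be sketched as below. This is classical nudging with a fixed scalar gain, shown only for illustration; in the paper the correction is produced by an LSTM, which is not reproduced here. The function name `nudge`, the gain value, and the toy state vector are hypothetical.

```python
import numpy as np

def nudge(x_forecast, y_obs, obs_idx, gain=0.5):
    """Pull the forecast toward the observations at the observed indices only.
    A fixed gain is an assumption; the paper learns the correction with an LSTM."""
    x = np.array(x_forecast, dtype=float)  # copy so the forecast is left untouched
    innovation = np.asarray(y_obs, dtype=float) - x[obs_idx]
    x[obs_idx] += gain * innovation
    return x

# Toy example: a five-point state observed at two sensor locations.
x_f = np.zeros(5)
x_a = nudge(x_f, y_obs=[1.0, 2.0], obs_idx=[1, 3], gain=0.5)
# x_a is [0.0, 0.5, 0.0, 1.0, 0.0]
```

A gain of 0 ignores the measurements entirely and a gain of 1 overwrites the forecast at the sensor locations, so the gain sets how much the noisy measurements are trusted relative to the imperfect model.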
Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network
Because of their effectiveness in broad practical applications, LSTM networks
have received a wealth of coverage in scientific journals, technical blogs, and
implementation guides. However, in most articles, the inference formulas for
the LSTM network and its parent, RNN, are stated axiomatically, while the
training formulas are omitted altogether. In addition, the technique of
"unrolling" an RNN is routinely presented without justification throughout the
literature. The goal of this paper is to explain the essential RNN and LSTM
fundamentals in a single document. Drawing from concepts in signal processing,
we formally derive the canonical RNN formulation from differential equations.
We then propose and prove a precise statement, which yields the RNN unrolling
technique. We also review the difficulties with training the standard RNN and
address them by transforming the RNN into the "Vanilla LSTM" network through a
series of logical arguments. We provide all equations pertaining to the LSTM
system together with detailed descriptions of its constituent entities. Albeit
unconventional, our choice of notation and the method for presenting the LSTM
system emphasize ease of understanding. As part of the analysis, we identify
new opportunities to enrich the LSTM system and incorporate these extensions
into the Vanilla LSTM network, producing the most general LSTM variant to date.
The target reader has already been exposed to RNNs and LSTM networks through
numerous available resources and is open to an alternative pedagogical
approach. A Machine Learning practitioner seeking guidance for implementing our
new augmented LSTM model in software for experimentation and research will find
the insights and derivations in this tutorial valuable as well.
Comment: 43 pages, 10 figures, 78 references
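
As a concrete picture of what "unrolling" an RNN means, the sketch below applies the same recurrent update once per time step and records each hidden state. It uses the common textbook Elman form with a tanh nonlinearity, not the paper's own notation or derivation; the function name `elman_forward`, the weight names, and the random inputs are hypothetical.

```python
import numpy as np

def elman_forward(xs, W_xh, W_hh, b_h, h0):
    """Unroll a textbook Elman RNN over a sequence.
    xs has shape (T, D); returns the hidden states with shape (T, H)."""
    h = h0
    states = []
    for x in xs:
        # The same weights are reused at every step: this loop is the unrolled network.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

# Hypothetical sizes: 5 time steps, 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
T, D, H = 5, 3, 4
xs = rng.normal(size=(T, D))
W_xh = rng.normal(size=(H, D))
W_hh = rng.normal(size=(H, H))
b_h = np.zeros(H)
h0 = np.zeros(H)
states = elman_forward(xs, W_xh, W_hh, b_h, h0)
```

Because the loop length follows the input, the same function handles sequences of any length, which is what makes recurrent networks suited to variable-length inputs.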