A hierarchy of recurrent networks for speech recognition
Generative models for sequential data based on directed graphs of Restricted Boltzmann Machines (RBMs) have recently been shown to model high-dimensional sequences accurately. In these models, temporal dependencies in the input are discovered either by buffering previous visible variables or by recurrent connections of the hidden variables. Here we propose a modification of these models, the Temporal Reservoir Machine (TRM). It uses a recurrent artificial neural network (ANN) to integrate information from the input over time. This information is then fed into an RBM at each time step. To avoid the difficulties of recurrent network learning, the ANN remains untrained and hence can be thought of as a random feature extractor. Using the architecture of multi-layer RBMs (Deep Belief Networks), TRMs can be used as building blocks for complex hierarchical models. This approach unifies RBM-based approaches for sequential data modeling and the Echo State Network, a powerful approach for black-box system identification. The TRM is tested on a spoken digits task under noisy conditions and performs competitively with previous models.
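The idea of an untrained recurrent network acting as a random feature extractor can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tanh nonlinearity, the uniform weight initialization, and the 1/n_hidden scaling of the recurrent weights are all assumptions made for the example, and the RBM that would consume the states at each time step is not shown.

```python
import math
import random


def make_reservoir(n_in, n_hidden, scale=0.5, seed=0):
    """Draw fixed random weights; they are never trained (hypothetical helper)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_hidden)]
    # Dividing by n_hidden keeps each row's absolute sum below `scale` (< 1),
    # so the influence of old inputs fades over time. This is an assumed choice.
    w_rec = [[rng.uniform(-scale, scale) / n_hidden for _ in range(n_hidden)]
             for _ in range(n_hidden)]
    return w_in, w_rec


def run_reservoir(inputs, w_in, w_rec):
    """Return the reservoir state after each input vector in the sequence."""
    n_hidden = len(w_rec)
    state = [0.0] * n_hidden
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        state = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[i], x))
                + sum(w * si for w, si in zip(w_rec[i], state))
            )
            for i in range(n_hidden)
        ]
        states.append(state)
    return states


if __name__ == "__main__":
    w_in, w_rec = make_reservoir(n_in=2, n_hidden=5)
    sequence = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    for t, s in enumerate(run_reservoir(sequence, w_in, w_rec)):
        print(t, [round(v, 3) for v in s])
```

In a TRM-style setup, each returned state would serve as input to an RBM at the corresponding time step; only that RBM would be trained.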
Photonic Delay Systems as Machine Learning Implementations
Nonlinear photonic delay systems present interesting implementation platforms
for machine learning models. They can be extremely fast, offer great degrees of
parallelism and potentially consume far less power than digital processors. So
far they have been successfully employed for signal processing using the
Reservoir Computing paradigm. In this paper we show that their range of
applicability can be greatly extended if we use gradient descent with
backpropagation through time on a model of the system to optimize the input
encoding of such systems. We perform physical experiments that demonstrate that
the obtained input encodings work well in reality, and we show that optimized
systems perform significantly better than the common Reservoir Computing
approach. The results presented here demonstrate that common gradient descent
techniques from machine learning may well be applicable to physical
neuro-inspired analog computers.
Response Characterization for Auditing Cell Dynamics in Long Short-term Memory Networks
In this paper, we introduce a novel method to interpret recurrent neural
networks (RNNs), particularly long short-term memory networks (LSTMs), at the
cellular level. We propose a systematic pipeline for interpreting individual
hidden state dynamics within the network using response characterization
methods. The ranked contribution of individual cells to the network's output is
computed by analyzing a set of interpretable metrics of their decoupled step
and sinusoidal responses. As a result, our method is able to uniquely identify
neurons with insightful dynamics, quantify relationships between dynamical
properties and test accuracy through ablation analysis, and interpret the
impact of network capacity on a network's dynamical distribution. Finally, we
demonstrate generalizability and scalability of our method by evaluating it on a
series of different benchmark sequential datasets.
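A step-response analysis of a single cell might look like the sketch below. It is illustrative only: the scalar LSTM unit, its weights, and the three metrics (final value, peak value, settling step) are assumptions chosen for the example and are not taken from the paper, which may define its metrics differently.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell_response(inputs, w, u, b):
    """Run one isolated scalar LSTM unit and return its output at each step.

    w, u, b map the gate names 'i', 'f', 'o', 'g' to input weight,
    recurrent weight, and bias (hypothetical parameterization).
    """
    h, c = 0.0, 0.0
    outputs = []
    for x in inputs:
        i = sigmoid(w["i"] * x + u["i"] * h + b["i"])    # input gate
        f = sigmoid(w["f"] * x + u["f"] * h + b["f"])    # forget gate
        o = sigmoid(w["o"] * x + u["o"] * h + b["o"])    # output gate
        g = math.tanh(w["g"] * x + u["g"] * h + b["g"])  # candidate value
        c = f * c + i * g
        h = o * math.tanh(c)
        outputs.append(h)
    return outputs


def step_metrics(response, tol=0.02):
    """Summarize a step response with a few simple, assumed metrics."""
    final = response[-1]
    peak = max(response, key=abs)
    band = tol * max(abs(final), 1e-12)
    settling_step = 0
    for t, y in enumerate(response):
        # Track the last step at which the response is still outside the band.
        if abs(y - final) > band:
            settling_step = t + 1
    return {"final": final, "peak": peak, "settling_step": settling_step}


if __name__ == "__main__":
    gates = ("i", "f", "o", "g")
    w = {k: 1.0 for k in gates}
    u = {k: 0.5 for k in gates}
    b = {k: 0.0 for k in gates}
    unit_step = [1.0] * 50
    print(step_metrics(lstm_cell_response(unit_step, w, u, b)))
```

Computing such metrics for every cell in a trained network would give one way to rank or group cells by their dynamics; a sinusoidal input could be substituted for `unit_step` to probe frequency behavior.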
Predictive-State Decoders: Encoding the Future into Recurrent Networks
Recurrent neural networks (RNNs) are a vital modeling technique that relies on
internal states learned indirectly by optimization of a supervised,
unsupervised, or reinforcement training loss. RNNs are used to model dynamic
processes that are characterized by underlying latent states whose form is
often unknown, precluding its analytic representation inside an RNN. In the
Predictive-State Representation (PSR) literature, latent state processes are
modeled by an internal state representation that directly models the
distribution of future observations, and most recent work in this area has
relied on explicitly representing and targeting sufficient statistics of this
probability distribution. We seek to combine the advantages of RNNs and PSRs by
augmenting existing state-of-the-art recurrent neural networks with
Predictive-State Decoders (PSDs), which add supervision to the network's
internal state representation to target predicting future observations.
Predictive-State Decoders are simple to implement and easily incorporated into
existing training pipelines via additional loss regularization. We demonstrate
the effectiveness of PSDs with experimental results in three different domains:
probabilistic filtering, Imitation Learning, and Reinforcement Learning. In
each, our method improves statistical performance of state-of-the-art recurrent
baselines and does so with fewer iterations and less data.
Comment: NIPS 201
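The "additional loss regularization" described in this abstract can be sketched as an auxiliary penalty added to the task loss. The code below is a hedged illustration under several assumptions: a linear decoder, a mean-squared-error penalty, raw future observations as targets, and the names `future_window`, `psd_penalty`, `regularized_loss`, and `lam` are all hypothetical and not taken from the paper.

```python
def future_window(observations, t, k):
    """Concatenate the k observations that follow step t into one flat vector."""
    window = []
    for obs in observations[t + 1 : t + 1 + k]:
        window.extend(obs)
    return window


def psd_penalty(hidden_states, observations, decoder, k):
    """Mean squared error between decoded hidden states and future observations.

    `decoder` is a weight matrix (list of rows) with k * obs_dim rows and
    hidden_dim columns; a linear decoder is an assumption of this sketch.
    """
    total, count = 0.0, 0
    # Only steps with a full window of k future observations contribute.
    for t in range(len(observations) - k):
        target = future_window(observations, t, k)
        pred = [sum(w * h for w, h in zip(row, hidden_states[t]))
                for row in decoder]
        total += sum((p - y) ** 2 for p, y in zip(pred, target))
        count += len(target)
    return total / count if count else 0.0


def regularized_loss(task_loss, penalty, lam=0.1):
    """Combine the original training loss with the weighted auxiliary penalty."""
    return task_loss + lam * penalty


if __name__ == "__main__":
    hidden = [[1.0], [2.0], [3.0], [4.0]]
    obs = [[1.0], [2.0], [3.0], [4.0]]
    penalty = psd_penalty(hidden, obs, decoder=[[1.0]], k=1)
    print(penalty, regularized_loss(task_loss=2.0, penalty=penalty, lam=0.5))
```

In a real training pipeline the hidden states would come from the recurrent network and the decoder weights would be learned jointly with it, so that the gradient of the penalty shapes the internal state toward predicting future observations.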