Using quantum theory to reduce the complexity of input-output processes
All natural things process and transform information. They receive
environmental information as input, and transform it into appropriate output
responses. Much of science is dedicated to building models of such systems --
algorithmic abstractions of their input-output behavior that allow us to
simulate how they may behave in the future, conditioned on what has
transpired in the past. Here, we show that classical models cannot avoid
inefficiency -- storing past information that is unnecessary for correct future
simulation. We construct quantum models that mitigate this waste, whenever it
is physically possible to do so. This suggests that the complexity of general
input-output processes depends fundamentally on what sort of information theory
we use to describe them.
Comment: 10 pages, 5 figures
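To make the memory saving concrete, here is a minimal numpy sketch of the textbook perturbed-coin process (an assumed illustration drawn from the classical-versus-quantum memory literature, not code from this paper). The optimal classical model must store a full bit naming the current causal state, whereas a quantum model that encodes the two states as non-orthogonal pure states needs only the von Neumann entropy of their mixture:

```python
import numpy as np

def entropy_bits(probs):
    """Shannon entropy in bits, ignoring zero (or numerically negative) weights."""
    probs = probs[probs > 1e-12]
    return float(-np.sum(probs * np.log2(probs)))

p = 0.1  # flip probability of the perturbed coin (illustrative value)

# Classical memory: two equiprobable causal states -> C_mu = 1 bit.
C_mu = entropy_bits(np.array([0.5, 0.5]))

# Quantum memory: encode the causal states as non-orthogonal pure states
# |s0> = sqrt(1-p)|0> + sqrt(p)|1>,  |s1> = sqrt(p)|0> + sqrt(1-p)|1>,
# then take the von Neumann entropy of their equal mixture.
s0 = np.array([np.sqrt(1 - p), np.sqrt(p)])
s1 = np.array([np.sqrt(p), np.sqrt(1 - p)])
rho = 0.5 * (np.outer(s0, s0) + np.outer(s1, s1))
C_q = entropy_bits(np.linalg.eigvalsh(rho))

print(f"classical memory C_mu = {C_mu:.3f} bits")  # 1.000
print(f"quantum memory   C_q  = {C_q:.3f} bits")   # ~0.722 for p = 0.1
```

The two encodings predict identical output statistics; the quantum advantage comes purely from the non-orthogonality of the state encodings.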
Predictive-State Decoders: Encoding the Future into Recurrent Networks
Recurrent neural networks (RNNs) are a vital modeling technique that relies on
internal states learned indirectly by optimization of a supervised,
unsupervised, or reinforcement training loss. RNNs are used to model dynamic
processes that are characterized by underlying latent states whose form is
often unknown, precluding their analytic representation inside an RNN. In the
Predictive-State Representation (PSR) literature, latent state processes are
modeled by an internal state representation that directly models the
distribution of future observations, and most recent work in this area has
relied on explicitly representing and targeting sufficient statistics of this
probability distribution. We seek to combine the advantages of RNNs and PSRs by
augmenting existing state-of-the-art recurrent neural networks with
Predictive-State Decoders (PSDs), which add supervision to the network's
internal state representation to target predicting future observations.
Predictive-State Decoders are simple to implement and easily incorporated into
existing training pipelines via additional loss regularization. We demonstrate
the effectiveness of PSDs with experimental results in three different domains:
probabilistic filtering, imitation learning, and reinforcement learning. In
each, our method improves statistical performance of state-of-the-art recurrent
baselines and does so with fewer iterations and less data.
Comment: NIPS 2017
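The mechanism lends itself to a short sketch. Below is a minimal PyTorch rendering of the idea as the abstract describes it; the class and attribute names (PSDRegularizedRNN, psd_head) and the fixed horizon k are illustrative assumptions, not the authors' implementation. An extra linear head decodes the RNN's hidden state into the next k observations, and the resulting MSE is added to the task loss as a regularizer:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PSDRegularizedRNN(nn.Module):
    """GRU whose hidden state is additionally trained to predict the next k
    observations (a sketch of the predictive-state decoder idea)."""
    def __init__(self, obs_dim, hidden_dim, k=5):
        super().__init__()
        self.k = k
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.psd_head = nn.Linear(hidden_dim, k * obs_dim)  # the decoder

    def forward(self, obs):          # obs: (batch, T, obs_dim)
        states, _ = self.rnn(obs)    # states: (batch, T, hidden_dim)
        return states

    def psd_loss(self, states, obs):
        """MSE between decoded futures and the actual next k observations
        (assumes sequence length T > k)."""
        batch, T, _ = obs.shape
        losses = []
        for t in range(T - self.k):
            pred = self.psd_head(states[:, t])                      # (batch, k*obs_dim)
            target = obs[:, t + 1 : t + 1 + self.k].reshape(batch, -1)
            losses.append(F.mse_loss(pred, target))
        return torch.stack(losses).mean()

# Usage: states = model(obs); total = task_loss + lam * model.psd_loss(states, obs).
# The decoder only shapes training; it can be discarded at inference time.
```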
Stochastic and adaptive systems: interim report
Includes bibliographical references.
Research supported by the Air Force Office of Scientific Research (AFSC) under Research Grant AFOSR 77-3281. Covers the period March 1, 1977 to February 28, 1978.
By Michael Athans and Sanjoy K. Mitter
High-Performance Distributed ML at Scale through Parameter Server Consistency Models
As Machine Learning (ML) applications increase in data size and model
complexity, practitioners turn to distributed clusters to satisfy the increased
computational and memory demands. Unfortunately, effective use of clusters for
ML requires considerable expertise in writing distributed code, while
highly abstracted frameworks like Hadoop have not, in practice, approached the
performance seen in specialized ML implementations. The recent Parameter Server
(PS) paradigm is a middle ground between these extremes, allowing easy
conversion of single-machine parallel ML applications into distributed ones,
while maintaining high throughput through relaxed "consistency models" that
allow inconsistent parameter reads. However, due to insufficient theoretical
study, it is not clear which of these consistency models actually guarantee
correct ML algorithm output; at the same time, there remain many
theoretically motivated but unexplored opportunities to maximize
computational throughput. Motivated by this challenge, we study both the
theoretical guarantees and empirical behavior of iterative-convergent ML
algorithms in existing PS consistency models. We then use the gleaned insights
to improve a consistency model using an "eager" PS communication mechanism, and
implement it as a new PS system that enables ML algorithms to reach their
solution more quickly.
Comment: 19 pages, 2 figures
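To make the "relaxed consistency" idea concrete, here is a toy Python sketch of a bounded-staleness gate in the spirit of Stale Synchronous Parallel, one of the consistency models this line of work builds on (the class name and structure are assumptions for illustration; the paper's "eager" mechanism further pushes fresh updates to workers rather than waiting for them to be pulled). Each worker may run ahead of the slowest worker by at most a fixed number of clock ticks, so parameter reads can be stale, but only boundedly so:

```python
import threading

class BoundedStalenessClock:
    """Toy SSP-style gate (illustrative, not the paper's system): a worker
    that finishes iteration c blocks until the slowest worker has reached
    at least c - staleness, bounding how stale any parameter read can be."""
    def __init__(self, num_workers, staleness):
        self.clocks = [0] * num_workers
        self.staleness = staleness
        self.cond = threading.Condition()

    def advance(self, worker_id):
        with self.cond:
            self.clocks[worker_id] += 1
            self.cond.notify_all()  # wake workers that were waiting on us
            # The slowest worker never blocks here, so progress is guaranteed.
            while self.clocks[worker_id] > min(self.clocks) + self.staleness:
                self.cond.wait()

# Usage sketch: each worker thread calls clock.advance(i) once per iteration;
# between calls it reads parameters that may be up to `staleness` ticks old.
```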