1,924 research outputs found

    Using quantum theory to reduce the complexity of input-output processes

    Full text link
    All natural things process and transform information. They receive environmental information as input, and transform it into appropriate output responses. Much of science is dedicated to building models of such systems -- algorithmic abstractions of their input-output behavior that allow us to simulate how such systems can behave in the future, conditioned on what has transpired in the past. Here, we show that classical models cannot avoid inefficiency -- storing past information that is unnecessary for correct future simulation. We construct quantum models that mitigate this waste, whenever it is physically possible to do so. This suggests that the complexity of general input-output processes depends fundamentally on what sort of information theory we use to describe them.Comment: 10 pages, 5 figure

    Predictive-State Decoders: Encoding the Future into Recurrent Networks

    Full text link
    Recurrent neural networks (RNNs) are a vital modeling technique that rely on internal states learned indirectly by optimization of a supervised, unsupervised, or reinforcement training loss. RNNs are used to model dynamic processes that are characterized by underlying latent states whose form is often unknown, precluding its analytic representation inside an RNN. In the Predictive-State Representation (PSR) literature, latent state processes are modeled by an internal state representation that directly models the distribution of future observations, and most recent work in this area has relied on explicitly representing and targeting sufficient statistics of this probability distribution. We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations. Predictive-State Decoders are simple to implement and easily incorporated into existing training pipelines via additional loss regularization. We demonstrate the effectiveness of PSDs with experimental results in three different domains: probabilistic filtering, Imitation Learning, and Reinforcement Learning. In each, our method improves statistical performance of state-of-the-art recurrent baselines and does so with fewer iterations and less data.Comment: NIPS 201

    Stochastic and adaptive systems : interim report

    Get PDF
    Includes bibliographical references.Research supported by Air Force Office of Scientific Research (AFSC), Research Grant AFOSR 77-3281. Covers time period, March 1, 1977 to February 28, 1978.by Michael Athans and Sanjoy K. Mitter

    High-Performance Distributed ML at Scale through Parameter Server Consistency Models

    Full text link
    As Machine Learning (ML) applications increase in data size and model complexity, practitioners turn to distributed clusters to satisfy the increased computational and memory demands. Unfortunately, effective use of clusters for ML requires considerable expertise in writing distributed code, while highly-abstracted frameworks like Hadoop have not, in practice, approached the performance seen in specialized ML implementations. The recent Parameter Server (PS) paradigm is a middle ground between these extremes, allowing easy conversion of single-machine parallel ML applications into distributed ones, while maintaining high throughput through relaxed "consistency models" that allow inconsistent parameter reads. However, due to insufficient theoretical study, it is not clear which of these consistency models can really ensure correct ML algorithm output; at the same time, there remain many theoretically-motivated but undiscovered opportunities to maximize computational throughput. Motivated by this challenge, we study both the theoretical guarantees and empirical behavior of iterative-convergent ML algorithms in existing PS consistency models. We then use the gleaned insights to improve a consistency model using an "eager" PS communication mechanism, and implement it as a new PS system that enables ML algorithms to reach their solution more quickly.Comment: 19 pages, 2 figure
    corecore