155 research outputs found

    Reinforcement learning in large state action spaces

    Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long-term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization, and robustness in potentially unseen scenarios. This thesis is motivated towards bridging these gaps. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single- and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we present the first results on several different problems, e.g., tensorization of the Bellman equation, which allows exponential gains in sample efficiency (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g., statistical machine learning, state abstraction, variational inference, tensor theory). In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.
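
    For context on the tensorization result above (Chapter 4), the display below gives the standard Bellman optimality equation together with one plausible low-rank ansatz over factored state coordinates; the ansatz is an illustrative assumption, not necessarily the thesis's exact construction.

```latex
% Standard Bellman optimality equation (textbook form); the factored
% ansatz below is an illustrative assumption of what a "tensorized"
% Q-function could look like, not the thesis's exact construction.
\begin{align}
  Q^{*}(s,a) &= r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\,
                \max_{a'} Q^{*}(s',a') \\
  Q(s,a) &\approx \sum_{k=1}^{R} w_k\, \phi_k(a) \prod_{i=1}^{d} u_{k,i}(s_i),
  \qquad s = (s_1, \dots, s_d)
\end{align}
```

    Low-rank structure of this kind is what makes exponential gains plausible: the parameter count grows linearly in the number of state factors d rather than exponentially.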

    Multiscale Markov Decision Problems: Compression, Solution, and Transfer Learning

    Many problems in sequential decision making and stochastic control have natural multiscale structure: sub-tasks are assembled together to accomplish complex goals. Systematically inferring and leveraging hierarchical structure, particularly beyond a single level of abstraction, has remained a longstanding challenge. We describe a fast multiscale procedure for repeatedly compressing, or homogenizing, Markov decision processes (MDPs), wherein a hierarchy of sub-problems at different scales is automatically determined. Coarsened MDPs are themselves independent, deterministic MDPs, and may be solved using existing algorithms. The multiscale representation delivered by this procedure decouples sub-tasks from each other and can lead to substantial improvements in convergence rates both locally within sub-problems and globally across sub-problems, yielding significant computational savings. A second fundamental aspect of this work is that these multiscale decompositions yield new transfer opportunities across different problems, where solutions of sub-tasks at different levels of the hierarchy may be amenable to transfer to new problems. Localized transfer of policies and potential operators at arbitrary scales is emphasized. Finally, we demonstrate compression and transfer in a collection of illustrative domains, including examples involving discrete and continuous state spaces.
    Comment: 86 pages, 15 figures
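
    To make the compression step concrete, here is a minimal sketch that coarsens a tabular MDP by averaging transitions and rewards over a fixed state partition and then solves the coarse problem by value iteration. The fixed partition and uniform averaging are simplifying assumptions; the paper's homogenization determines the hierarchy automatically and is more sophisticated.

```python
import numpy as np

def coarsen_mdp(P, R, blocks):
    """Aggregate an MDP (P: [A, S, S] transitions, R: [S, A] rewards)
    into a smaller MDP over state blocks by uniform averaging.
    Naive aggregation sketch, not the paper's homogenization."""
    A, S, _ = P.shape
    B = len(blocks)
    P_c = np.zeros((A, B, B))
    R_c = np.zeros((B, A))
    for b, states in enumerate(blocks):
        for a in range(A):
            out = P[a, states, :].mean(axis=0)   # avg outgoing mass
            for b2, states2 in enumerate(blocks):
                P_c[a, b, b2] = out[states2].sum()
        R_c[b] = R[states].mean(axis=0)
    return P_c, R_c

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Standard value iteration on a tabular MDP."""
    A, S, _ = P.shape
    V = np.zeros(S)
    while True:
        Q = R.T + gamma * (P @ V)                # Q[a, s]
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            return V_new
        V = V_new

# Toy 4-state, 2-action MDP, coarsened into two 2-state blocks.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(2, 4))       # [A, S, S]
R = rng.random((4, 2))                           # [S, A]
P_c, R_c = coarsen_mdp(P, R, blocks=[[0, 1], [2, 3]])
print(value_iteration(P_c, R_c))                 # coarse-scale values
```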

    Certificates of quantum many-body properties assisted by machine learning

    Computationally intractable tasks are often encountered in physics and optimization. Such tasks often comprise a cost function to be optimized over a so-called feasible set, which is specified by a set of constraints. In general, this may lead to difficult, non-convex optimization tasks. A number of standard methods are used to tackle such problems: variational approaches focus on parameterizing a subclass of solutions within the feasible set; in contrast, relaxation techniques have been proposed to approximate the feasible set from outside, thus complementing the variational approach by providing ultimate bounds on the global optimal solution. In this work, we propose a novel approach combining the power of relaxation techniques with deep reinforcement learning in order to find the best possible bounds within a limited computational budget. We illustrate the viability of the method in the context of finding the ground-state energy of many-body quantum systems, a paradigmatic problem in quantum physics. We benchmark our approach against other classical optimization algorithms such as breadth-first search or Monte Carlo, and we characterize the effect of transfer learning. We find that the latter may be indicative of phase transitions, with a completely autonomous approach. Finally, we provide tools to generalize the approach to other common applications in the field of quantum information processing.
    Comment: 22 pages (12.5 + appendices), 8 figures
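
    As a toy numeric illustration of the two bounding strategies contrasted above: a variational ansatz approaches the ground-state energy from above, while a relaxation bounds it from below. Gershgorin discs stand in here for the paper's far tighter relaxations, purely as an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.normal(size=(6, 6))
H = (H + H.T) / 2                        # toy Hermitian "Hamiltonian"

# Variational (inside the feasible set): any normalized trial state
# gives a Rayleigh quotient that upper-bounds the ground energy E0.
trials = rng.normal(size=(200, 6))
trials /= np.linalg.norm(trials, axis=1, keepdims=True)
upper = min(v @ H @ v for v in trials)

# Relaxation (outside the feasible set): Gershgorin's theorem gives a
# certified lower bound on every eigenvalue, hence on E0.
radii = np.abs(H).sum(axis=1) - np.abs(np.diag(H))
lower = (np.diag(H) - radii).min()

E0 = np.linalg.eigvalsh(H)[0]
print(f"{lower:.3f} <= E0 = {E0:.3f} <= {upper:.3f}")
```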

    Control flow in active inference systems Part II: Tensor networks as general models of control flow

    Living systems face both environmental complexity and limited access to free-energy resources. Survival under these conditions requires a control system that can activate, or deploy, available perception and action resources in a context-specific way. In Part I, we introduced the free-energy principle (FEP) and the idea of active inference as Bayesian prediction-error minimization, and showed how the control problem arises in active inference systems. We then reviewed classical and quantum formulations of the FEP, with the former being the classical limit of the latter. In this accompanying Part II, we show that when systems are described as executing active inference driven by the FEP, their control flow systems can always be represented as tensor networks (TNs). We show how TNs as control systems can be implemented within the general framework of quantum topological neural networks, and discuss the implications of these results for modeling biological systems at multiple scales.
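
    For readers new to TNs, the sketch below contracts a matrix product state, the simplest tensor network, in plain numpy. It illustrates only the compositional wiring of local tensors; the paper's quantum topological neural networks are a far more general setting.

```python
import numpy as np

rng = np.random.default_rng(2)
d, chi, n = 2, 3, 5                      # physical dim, bond dim, sites

# One order-3 core per site: (left bond, physical index, right bond);
# the boundary bonds have dimension 1.
cores = [rng.normal(size=(1 if i == 0 else chi, d,
                          1 if i == n - 1 else chi)) for i in range(n)]

def amplitude(cores, bits):
    """Contract the MPS against the basis state labeled by `bits`."""
    m = cores[0][:, bits[0], :]          # running boundary matrix
    for core, b in zip(cores[1:], bits[1:]):
        m = m @ core[:, b, :]            # absorb one site at a time
    return m[0, 0]

print(amplitude(cores, [0, 1, 1, 0, 1]))
```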

    Bayesian Methods in Tensor Analysis

    Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data since they provide a convenient way to introduce sparsity into the model and conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches including model formulation, prior assignment, posterior computation, and theoretical properties. We also discuss potential future directions in this field.
    Comment: 32 pages, 8 figures, 2 tables
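
    As a concrete baseline for the methods surveyed, here is a minimal CP (CANDECOMP/PARAFAC) decomposition fit by alternating least squares. The Bayesian approaches the article reviews replace these least-squares updates with priors on the factors and posterior computation; the sizes and rank below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, K, R = 6, 5, 4, 2
A, B, C = (rng.normal(size=(dim, R)) for dim in (I, J, K))
X = np.einsum('ir,jr,kr->ijk', A, B, C)   # noiseless rank-R tensor

def khatri_rao(U, V):
    """Column-wise Kronecker product, shape (rows_U * rows_V, R)."""
    return np.einsum('ir,jr->ijr', U, V).reshape(-1, U.shape[1])

def ls_update(M, unfolding):
    """Least-squares solve of M @ F.T = unfolding.T for the factor F."""
    return np.linalg.lstsq(M, unfolding.T, rcond=None)[0].T

Ah, Bh, Ch = (rng.normal(size=(dim, R)) for dim in (I, J, K))
for _ in range(50):                       # ALS sweeps over the 3 modes
    Ah = ls_update(khatri_rao(Bh, Ch), X.reshape(I, -1))
    Bh = ls_update(khatri_rao(Ah, Ch), X.transpose(1, 0, 2).reshape(J, -1))
    Ch = ls_update(khatri_rao(Ah, Bh), X.transpose(2, 0, 1).reshape(K, -1))

Xh = np.einsum('ir,jr,kr->ijk', Ah, Bh, Ch)
print("relative error:", np.linalg.norm(X - Xh) / np.linalg.norm(X))
```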

    Exploiting Structure for Scalable and Robust Deep Learning

    Deep learning has seen great success training deep neural networks for complex prediction problems, such as large-scale image recognition, short-term time-series forecasting, and learning behavioral models for games with simple dynamics. However, neural networks have a number of weaknesses: 1) they are not sample-efficient and 2) they are often not robust against (adversarial) input perturbations. Hence, it is challenging to train neural networks for problems with exponential complexity, such as multi-agent games, complex long-term spatiotemporal dynamics, or noisy high-resolution image data. This thesis contributes methods to improve the sample efficiency, expressive power, and robustness of neural networks, by exploiting various forms of low-dimensional structure, such as spatiotemporal hierarchy and multi-agent coordination. We show the effectiveness of this approach in multiple learning paradigms: in both the supervised learning (e.g., imitation learning) and reinforcement learning settings. First, we introduce hierarchical neural networks that model both short-term actions and long-term goals from data, and can learn human-level behavioral models for spatiotemporal multi-agent games, such as basketball, using imitation learning. Second, in reinforcement learning, we show that behavioral policies with a hierarchical latent structure can efficiently learn forms of multi-agent coordination, which enables a form of structured exploration for faster learning. Third, we showcase tensor-train recurrent neural networks that can model high-order multiplicative structure in dynamical systems (e.g., Lorenz dynamics). We show that this model class gives state-of-the-art long-term forecasting performance with very long time horizons for both simulation and real-world traffic and climate data. Finally, we demonstrate two methods for neural network robustness: 1) stability training, a form of stochastic data augmentation to make neural networks more robust, and 2) neural fingerprinting, a method that detects adversarial examples by validating the network’s behavior in the neighborhood of any given input. In sum, this thesis takes a step to enable machine learning for the next scale of problem complexity, such as rich spatiotemporal multi-agent games and large-scale robust predictions.
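
    To illustrate the stability-training objective mentioned above: the loss adds a penalty tying the model's output on a clean input to its output on a perturbed copy. The linear stand-in model and squared-error distance below are assumptions for the sketch, not the thesis's exact setup.

```python
import numpy as np

rng = np.random.default_rng(4)
W = 0.1 * rng.normal(size=(3, 5))         # toy linear "network"

def f(x):
    return W @ x

def stability_loss(x, y, alpha=0.5, sigma=0.1):
    """Task loss plus a stability term between clean and perturbed
    outputs (stochastic data augmentation), as described above."""
    task = np.sum((f(x) - y) ** 2)
    x_noisy = x + sigma * rng.normal(size=x.shape)
    stability = np.sum((f(x) - f(x_noisy)) ** 2)
    return task + alpha * stability

x, y = rng.normal(size=5), rng.normal(size=3)
print(stability_loss(x, y))
```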