
    Many-agent Reinforcement Learning

    Multi-agent reinforcement learning (RL) addresses the problem of how each agent should behave optimally in a stochastic environment in which multiple agents learn simultaneously. It is an interdisciplinary domain with a long history, lying at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGo series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents ($N \gg 2$), which I term \emph{many-agent reinforcement learning} (MARL\footnote{I use the word ``MARL'' to denote multi-agent reinforcement learning with a particular focus on the cases of many agents; otherwise, it is denoted as ``Multi-Agent RL'' by default.}) problems. In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills a research gap: most existing work either fails to cover the recent advances since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm -- $\alpha^\alpha$-Rank -- for many-agent systems. The critical advantage of $\alpha^\alpha$-Rank is that it can compute the solution concept of $\alpha$-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. This is in contrast to classic solution concepts such as the Nash equilibrium, which is known to be PPAD-hard to compute even in two-player cases. $\alpha^\alpha$-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm -- mean-field MARL -- for many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that tries to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling the behavioural diversity in meta-games and on developing algorithms that provably enlarge diversity during training. The proposed metric, based on determinantal point processes, serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study the emergent population dynamics in nature and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve a huge impact in the real physical world, beyond purely video games.
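
    The mean-field MARL idea above can be made concrete with a small, simplified sketch: each agent learns a Q-function over its own action and the mean action of its neighbours, so the update cost no longer grows with the number of agents. The tabular representation, Boltzmann bootstrap policy, and integer-coded states and mean actions below are illustrative assumptions of this sketch, not the thesis' exact algorithm.

```python
import numpy as np

N_ACTIONS = 4   # e.g. the discrete moves available to each agent in a battle game

def mean_field_q_update(Q, state, action, mean_action, reward,
                        next_state, next_mean_action,
                        alpha=0.1, gamma=0.95, temperature=0.1):
    """One tabular mean-field Q-learning step (simplified, illustrative sketch).

    Q maps (state, mean_action) -> array of values over the agent's own actions.
    Conditioning on the *mean* action of neighbouring agents, rather than the
    full joint action, is what keeps the update independent of the number of
    agents -- the core idea behind mean-field MARL.
    """
    key, next_key = (state, mean_action), (next_state, next_mean_action)
    for k in (key, next_key):
        Q.setdefault(k, np.zeros(N_ACTIONS))

    # Soft-max (Boltzmann) policy at the next state defines the bootstrap value.
    logits = Q[next_key] / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    v_next = float(probs @ Q[next_key])

    td_target = reward + gamma * v_next
    Q[key][action] += alpha * (td_target - Q[key][action])
    return Q

# Toy usage with hypothetical integer-coded states and mean actions.
Q = {}
Q = mean_field_q_update(Q, state=0, action=2, mean_action=1,
                        reward=1.0, next_state=1, next_mean_action=1)
```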

    Adaptive reinforcement learning for heterogeneous network selection

    Next generation 5G mobile wireless networks will consist of multiple technologies for devices to access the network at the edge. One of the keys to 5G is therefore the ability of a device to intelligently select its Radio Access Technology (RAT). Current fully distributed algorithms for RAT selection, although guaranteed to converge to equilibrium states, are often slow, require long exploration times, and may converge to undesirable equilibria. In this dissertation, we propose three novel reinforcement learning (RL) frameworks to improve the efficiency of existing distributed RAT selection algorithms in a heterogeneous environment, where users may potentially apply a number of different RAT selection procedures. Although our research focuses on solutions for RAT selection in current and future mobile wireless networks, the proposed solutions are general and applicable to any large-scale distributed multi-agent system. In the first framework, called RL with Non-positive Regret, we propose a novel adaptive RL procedure for multi-agent non-cooperative repeated games. The main contribution is to use both positive and negative regrets in RL to improve the convergence speed and fairness of the well-known regret-based RL procedure. Significant improvements in performance compared to other related algorithms in the literature are demonstrated. In the second framework, called RL with Network-Assisted Feedback (RLNF), our core contribution is to develop a network feedback model that uses network-assisted information to improve the performance of distributed RL for RAT selection. RLNF guarantees a no-regret payoff in the long run for any user adopting it, regardless of what other users might do, and so can work in an environment where not all users use the same learning strategy. This is an important implementation advantage, as RLNF can be implemented within current mobile network standards. In the third framework, we propose a novel adaptive RL-based mechanism for RAT selection that can effectively handle user mobility. The key contribution is to leverage forgetting methods to react rapidly to changes in the radio conditions when users move. We show that our solution improves the performance of wireless networks and converges much faster when users move compared to non-adaptive solutions. Another objective of the research is to study the impact of various network models on the performance of different RAT selection approaches. We propose a unified benchmark to compare the performance of different algorithms under the same computational environment. The comparative studies reveal that, among all the important network parameters that influence the performance of RAT selection algorithms, the number of base stations that a user can connect to has the most significant impact. This finding provides guidelines for the proper design of RAT selection algorithms for future 5G. Our evaluation benchmark can serve as a reference for researchers, network developers, and engineers. Overall, the thesis provides different reinforcement learning frameworks to improve the efficiency of current fully distributed algorithms for heterogeneous RAT selection. We prove the convergence of the proposed reinforcement learning procedures using the differential inclusion (DI) technique. The theoretical analyses demonstrate that the use of DI not only provides an effective method to study the convergence properties of adaptive procedures in game-theoretic learning, but also yields a much more concise and extensible proof compared to the classical approaches.
    Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
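
    Both the regret-based frameworks and the mobility-aware variant above belong to the regret-matching family of learning procedures. As a minimal, illustrative sketch (the function name, toy payoffs, and the specific discounting rule are assumptions of this sketch, not the thesis' exact algorithms), an agent can discount old regrets with a forgetting factor so that it reacts faster when radio conditions change:

```python
import numpy as np

def regret_matching_step(regrets, payoffs, played, forgetting=1.0):
    """One regret-matching style update for selecting among candidate RATs.

    regrets    : running regret estimate, one entry per candidate RAT
    payoffs    : payoff each RAT would have yielded this round (e.g. throughput)
    played     : index of the RAT actually chosen this round
    forgetting : factor in (0, 1]; values < 1 discount old regrets so the
                 agent adapts quickly when radio conditions change
    """
    # Regret of not having played each alternative this round.
    instant_regret = payoffs - payoffs[played]
    regrets = forgetting * regrets + instant_regret

    # Play in proportion to positive regret; explore uniformly otherwise.
    positive = np.maximum(regrets, 0.0)
    if positive.sum() > 0:
        strategy = positive / positive.sum()
    else:
        strategy = np.ones_like(positive) / len(positive)
    return regrets, strategy

# Toy usage: three candidate RATs with hypothetical observed throughputs.
regrets = np.zeros(3)
payoffs = np.array([1.2, 0.4, 0.9])
regrets, strategy = regret_matching_step(regrets, payoffs, played=1, forgetting=0.9)
next_rat = np.random.choice(3, p=strategy)
```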

    Machine Learning for Fluid Mechanics

    The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from field measurements, experiments, and large-scale simulations at multiple spatiotemporal scales. Machine learning offers a wealth of techniques to extract information from data that could be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of the history, current developments, and emerging opportunities of machine learning for fluid mechanics. It outlines fundamental machine learning methodologies and discusses their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experimentation, and simulation. Machine learning provides a powerful information processing framework that can enrich, and possibly even transform, current lines of fluid mechanics research and industrial applications.
    Comment: To appear in the Annual Reviews of Fluid Mechanics, 202
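
    The abstract does not name specific methods, but a representative example of extracting low-dimensional structure from flow data is proper orthogonal decomposition (POD), computed from snapshot data via a singular value decomposition. The sketch below uses random stand-in data purely for illustration.

```python
import numpy as np

# Each column of X is one flattened flow-field snapshot (random stand-in data).
rng = np.random.default_rng(0)
n_grid, n_snapshots = 1000, 50
X = rng.standard_normal((n_grid, n_snapshots))

# Subtract the mean flow, then take the SVD of the fluctuation snapshots.
X_mean = X.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(X - X_mean, full_matrices=False)

energy = s**2 / np.sum(s**2)                          # fraction of variance per mode
r = int(np.searchsorted(np.cumsum(energy), 0.9)) + 1  # modes capturing 90% energy
modes, coeffs = U[:, :r], np.diag(s[:r]) @ Vt[:r]     # reduced-order representation
print(f"{r} POD modes capture 90% of the snapshot energy")
```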

    Special Topics in Information Technology

    This open access book presents thirteen outstanding doctoral dissertations in Information Technology from the Department of Electronics, Information and Bioengineering, Politecnico di Milano, Italy. Information Technology has always been highly interdisciplinary, as many aspects have to be considered in IT systems. The doctoral studies program in IT at Politecnico di Milano emphasizes this interdisciplinary nature, which is becoming increasingly important in recent technological advances, in collaborative projects, and in the education of young researchers. Accordingly, the focus of advanced research is on pursuing a rigorous approach to specific research topics starting from a broad background in various areas of Information Technology, especially Computer Science and Engineering, Electronics, Systems and Control, and Telecommunications. Each year, more than 50 PhDs graduate from the program. This book gathers the outcomes of the thirteen best theses defended in 2019-20 and selected for the IT PhD Award. Each author provides a chapter summarizing his/her findings, including an introduction, a description of methods, the main achievements, and future work on the topic. Hence, the book provides a cutting-edge overview of the latest research trends in Information Technology at Politecnico di Milano, presented in an easy-to-read format that will also appeal to non-specialists.

    VI Workshop on Computational Data Analysis and Numerical Methods: Book of Abstracts

    The VI Workshop on Computational Data Analysis and Numerical Methods (WCDANM) will be held on June 27-29, 2019, in the Department of Mathematics of the University of Beira Interior (UBI), Covilhã, Portugal, and is a unique opportunity to disseminate scientific research related to the areas of Mathematics in general, with particular relevance to Computational Data Analysis and Numerical Methods in theoretical and/or practical fields, using new techniques and giving special emphasis to applications in Medicine, Biology, Biotechnology, Engineering, Industry, Environmental Sciences, Finance, Insurance, Management, and Administration. The meeting will provide a forum for the discussion and debate of ideas of interest to the scientific community in general. New scientific collaborations among colleagues, in particular new collaborations on Masters and PhD projects, are expected from this meeting. The event is open to the entire scientific community (with or without a communication/poster).

    Computational Physics on Graphics Processing Units

    The use of graphics processing units for scientific computations is an emerging strategy that can significantly speed up a wide variety of algorithms. In this review, we discuss advances made in the field of computational physics, focusing on classical molecular dynamics and on quantum simulations for electronic structure calculations using density functional theory, wave function techniques, and quantum field theory.
    Comment: Proceedings of the 11th International Conference, PARA 2012, Helsinki, Finland, June 10-13, 201
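
    As a hedged illustration of the strategy such reviews describe (this sketch is not taken from the proceedings), an all-pairs Lennard-Jones force evaluation for classical molecular dynamics can be written against the NumPy array API so that the identical code runs on a GPU through CuPy, a drop-in NumPy replacement.

```python
# Runs on a GPU if CuPy is installed, otherwise falls back to CPU NumPy.
try:
    import cupy as xp
except ImportError:
    import numpy as xp

def lennard_jones_forces(pos, epsilon=1.0, sigma=1.0):
    """Net Lennard-Jones force on every particle; pos has shape (N, 3)."""
    diff = pos[:, None, :] - pos[None, :, :]        # (N, N, 3) displacement vectors
    r2 = (diff ** 2).sum(axis=-1)                   # squared pair distances
    xp.fill_diagonal(r2, float("inf"))              # exclude self-interaction
    inv_r6 = (sigma ** 2 / r2) ** 3
    f_mag = 24.0 * epsilon * (2.0 * inv_r6 ** 2 - inv_r6) / r2
    return (f_mag[:, :, None] * diff).sum(axis=1)   # (N, 3) net force per particle

# Toy usage: 256 random particles in a 10 x 10 x 10 box.
positions = xp.random.random((256, 3)) * 10.0
forces = lennard_jones_forces(positions)
```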