
    Applications of Probabilistic Inference to Planning & Reinforcement Learning

    Optimal control is a profound and fascinating subject that regularly attracts interest from numerous scientific disciplines, including both pure and applied Mathematics, Computer Science, Artificial Intelligence, Psychology, Neuroscience and Economics. In 1960 Rudolf Kalman discovered that there exists a duality between the problems of filtering and optimal control in linear systems [84]. This is now regarded as a seminal piece of work, and it has since motivated a large amount of research into the discovery of similar dualities between optimal control and statistical inference. This is especially true of recent years, in which there has been much research into recasting problems of optimal control as problems of statistical/approximate inference. Broadly speaking, this is the perspective that we take in this work; in particular, we present various applications of methods from the fields of statistical/approximate inference to optimal control, planning and Reinforcement Learning. Some of the methods would be more accurately described as originating from other fields of research, such as the dual decomposition techniques used in chapter(5), which originate from convex optimisation. However, the original motivation for the application of these techniques came from the field of approximate inference. The study of dualities between optimal control and statistical inference has been a subject of research for over 50 years, and we do not claim to encompass the entire subject. Instead, we present what we consider to be a range of interesting and novel applications from this field of research.

    Many-agent Reinforcement Learning

    Multi-agent reinforcement learning (RL) solves the problem of how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history that lies in the joint area of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGO series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made on developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents (N ≫ 2), which I name many-agent reinforcement learning (MARL) problems. (I use the word "MARL" to denote multi-agent reinforcement learning with a particular focus on the cases of many agents; otherwise, it is denoted as "Multi-Agent RL" by default.) In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills the research gap that most of the existing work either fails to cover the recent advances since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm -- α^α-Rank -- for many-agent systems. The critical advantage of α^α-Rank is that it can compute the solution concept of α-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. This is in contrast to classic solution concepts such as Nash equilibrium, which is known to be PPAD-hard to compute even in two-player cases. α^α-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm -- mean-field MARL -- for many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that tries to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling the behavioural diversity in meta-games, and on developing algorithms that guarantee to enlarge diversity during training. The proposed metric based on determinantal point processes serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study the emergent population dynamics in nature, and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve huge impacts in the real physical world, beyond purely video games.
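The determinantal-point-process diversity idea described in this abstract can be sketched numerically: if each policy is represented by a behavioural feature vector, diversity can be measured as the determinant of the population's kernel (Gram) matrix. The feature vectors and the linear kernel below are illustrative assumptions for the sketch, not the thesis's actual construction.

```python
import numpy as np

def dpp_diversity(policy_features: np.ndarray) -> float:
    """Population diversity as the determinant of the Gram matrix.

    Each row of policy_features embeds one policy's behaviour
    (an illustrative choice). The determinant of the linear-kernel
    Gram matrix equals the squared volume spanned by the rows:
    near-duplicate policies collapse it towards zero, while distinct
    behaviours enlarge it.
    """
    L = policy_features @ policy_features.T  # linear-kernel Gram matrix
    return float(np.linalg.det(L))

# Two near-identical policies vs. two orthogonal (distinct) ones.
similar = np.array([[1.0, 0.0],
                    [0.98, 0.20]])
diverse = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
low, high = dpp_diversity(similar), dpp_diversity(diverse)
```

A diversity-aware learner could then add such a log-determinant term to its training objective so that enlarging the population provably enlarges behavioural coverage.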

    Machine learning for optical fiber communication systems: An introduction and overview

    Optical networks generate a vast amount of diagnostic, control and performance monitoring data. When information is extracted from this data, reconfigurable network elements and reconfigurable transceivers allow the network to adapt both to changes in the physical infrastructure and to changing traffic conditions. Machine learning is emerging as a disruptive technology for extracting useful information from this raw data to enable enhanced planning, monitoring and dynamic control. We provide a survey of the recent literature and highlight numerous promising avenues for machine learning applied to optical networks, including explainable machine learning, digital twins, and approaches in which we embed our knowledge into the machine learning, such as physics-informed machine learning for the physical layer and graph-based machine learning for the networking layer.

    Application of Machine Learning in Optical Communications Engineering

    Due to increasing data traffic, optical networks are expected to operate at higher system capacities in the future. One approach is coherent transmission, in which the modulation format can be increased; this, however, requires a larger SNR. To achieve this, the optical signal power is raised, which causes the data transmission to suffer from nonlinear impairments. The focus of this work is the development of machine learning models that respond to this nonlinear signal degradation. A support vector machine (SVM) is implemented and used as a classifying decision device. The results show that the SVM enables improved compensation of both the nonlinear fiber effects and the distortions introduced by the optical system components. The principle of elastic optical networks (EONs) offers a technology for efficiently using the resources provided by the optical fiber. A key element of this technology is the bandwidth-variable transponder, which allows, for example, the modulation format or the coding scheme to be adapted to the current link conditions. To ensure optimal resource utilization, the use of reinforcement learning algorithms is investigated. The results show that the RL algorithm is able to adapt to unknown link conditions, whereas comparable heuristic approaches such as the genetic algorithm must be retrained for each scenario.
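The SVM-as-decision-device idea from this abstract can be sketched as follows: a kernel SVM is trained on labelled received symbols so that it learns the distorted decision regions directly from data. The synthetic rotated-constellation data, the RBF kernel, and all parameter values below are illustrative assumptions, not the thesis's actual setup.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a received QPSK constellation: four symbol
# clouds whose centres are rotated by a toy "nonlinear" phase shift
# and blurred by additive noise. Real data would come from the link.
centres = np.array([[1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)
theta = 0.3  # illustrative phase rotation standing in for fibre nonlinearity
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
labels = rng.integers(0, 4, size=2000)
X = centres[labels] @ rot.T + 0.2 * rng.normal(size=(2000, 2))

# An RBF-kernel SVM learns the rotated, distorted decision regions
# directly from labelled symbols instead of assuming rectilinear
# boundaries, which is the sense in which it "compensates" distortion.
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
clf.fit(X[:1500], labels[:1500])
accuracy = clf.score(X[1500:], labels[1500:])
```

On this toy data the learned boundaries follow the rotated clouds, so the classifier recovers the transmitted symbols with high accuracy despite the distortion.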

    Interactive Decision Analysis; Proceedings of an International Workshop on Interactive Decision Analysis and Interpretative Computer Intelligence, Laxenburg, Austria, September 20-23, 1983

    An International Workshop on Interactive Decision Analysis and Interpretative Computer Intelligence was held at IIASA in September 1983. The Workshop was motivated, firstly, by the realization that the rapid development of computers, especially microcomputers, will greatly increase the scope and capabilities of computerized decision-support systems. It is important to explore the potential of these systems for use in handling the complex technological, environmental, economic and social problems that face the world today. Research in decision-support systems also has another, less tangible but possibly more important, motivation. The development of efficient systems for decision support requires a thorough understanding of the differences between the decision-making processes in different nations and cultures. An understanding of the different rationales underlying decision making is not only necessary for the development of efficient decision-support systems, but is also an important factor in encouraging international understanding and cooperation. The Proceedings of the Workshop, which are contained in this volume, are divided into four main sections. The first section consists of an introductory lecture in which a unifying approach to the use of computers and computerized mathematical models for decision analysis and support is described. The second section is concerned with approaches and concepts in interactive decision analysis, and the third section is devoted to methods and techniques for decision analysis. The final section contains descriptions of a wide range of applications of interactive techniques, covering the fields of economics, public policy planning, energy policy evaluation, hydrology and industrial development.

    Simulation Intelligence: Towards a New Generation of Scientific Methods

    The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI) for short. We argue that the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) Multi-physics and multi-scale modeling; (2) Surrogate modeling and emulation; (3) Simulation-based inference; (4) Causal modeling and inference; (5) Agent-based modeling; (6) Probabilistic programming; (7) Differentiable programming; (8) Open-ended optimization; (9) Machine programming. We believe that coordinated efforts between motifs offer immense opportunities to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-the-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies arising from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use cases for human-machine teaming and automated science.
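Motif (2), surrogate modeling and emulation, can be illustrated with a minimal sketch: a cheap model is fitted to a few runs of an expensive simulator and then queried in its place. The analytic "simulator", the polynomial surrogate, and the design of experiments below are toy assumptions for illustration, not methods from the paper (Gaussian processes or neural networks are the more common surrogate choices).

```python
import numpy as np

# Expensive "simulator" (a stand-in): in practice this would be a
# costly multi-physics code; here it is a cheap analytic function.
def simulator(x):
    return np.sin(3 * x) + 0.5 * x

# Run the simulator at a small design of experiments, then fit a
# cheap surrogate to those runs.
x_train = np.linspace(0.0, 2.0, 15)
y_train = simulator(x_train)
surrogate = np.poly1d(np.polyfit(x_train, y_train, deg=8))

# The surrogate now answers dense queries at negligible cost; its
# accuracy can be checked against held-out simulator evaluations.
x_query = np.linspace(0.0, 2.0, 200)
max_err = float(np.max(np.abs(surrogate(x_query) - simulator(x_query))))
```

The same pattern underlies emulation at scale: once the surrogate is trusted, downstream tasks such as optimization or inverse problems run against it instead of the simulator.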

    Robust learning algorithms for spiking and rate-based neural networks

    Inspired by the remarkable properties of the human brain, the fields of machine learning, computational neuroscience and neuromorphic engineering have achieved significant synergistic progress in the last decade. Powerful neural network models rooted in machine learning have been proposed as models for neuroscience and for applications in neuromorphic engineering. However, the aspect of robustness is often neglected in these models. Both biological and engineered substrates show diverse imperfections that degrade the performance of computational models or even prohibit their implementation. This thesis describes three projects aimed at implementing robust learning with local plasticity rules in neural networks. First, we demonstrate the advantages of neuromorphic computation in a pilot study on a prototype chip. In doing so, we quantify the speed and energy consumption of the system compared to a software simulation and show how on-chip learning contributes to the robustness of learning. Second, we present an implementation of spike-based Bayesian inference on accelerated neuromorphic hardware. The model copes, via learning, with the disruptive effects of the imperfect substrate and benefits from the acceleration. Finally, we present a robust model of deep reinforcement learning using local learning rules. It shows how backpropagation combined with neuromodulation could be implemented in a biologically plausible framework. The results contribute to the pursuit of robust and powerful learning networks for biological and neuromorphic substrates.

    Plural Rationality and Interactive Decision Processes; Proceedings of an IIASA Summer Study on Plural Rationality and Interactive Decision Processes, Sopron, Hungary, August 16-26, 1984

    These Proceedings report the scientific results of the Summer Study on Plural Rationality and Interactive Decision Processes organized jointly by IIASA and the Hungarian Committee for Applied Systems Analysis. Sixty-eight researchers from sixteen countries participated, most of them contributing papers or experiments. The Study gathered specialists from many disciplines, from philosophy and cultural anthropology, through decision theory, game theory and economics, to engineering and applied mathematics. Twenty-eight of the papers presented during the Study are included in this volume.

    International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

    The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program. In particular, it contains the scientific program, both in survey style and in full detail, together with information on the social program, the venue, special meetings, and more.