Deep Reinforcement Learning-Based Channel Allocation for Wireless LANs with Graph Convolutional Networks

Authors: Kamiya Shotaro, Morikura Masahiro, Nakashima Kota, Nishio Takayuki, Ohtsu Kazuki, Yamamoto Koji
Publication date: 17/05/2019

Last year, the IEEE 802.11 Extremely High Throughput Study Group (EHT Study Group) was established to initiate discussions on new IEEE 802.11 features. Coordinated control methods for the access points (APs) in wireless local area networks (WLANs) are under discussion in the EHT Study Group. The present study proposes a deep reinforcement learning-based channel allocation scheme using graph convolutional networks (GCNs). As the deep reinforcement learning method, we use the well-known double deep Q-network. In densely deployed WLANs, the number of possible AP topologies is extremely high, and thus we extract the features of the topological structures based on GCNs. We apply GCNs to a contention graph, in which APs within each other's carrier sensing ranges are connected, to extract the features of carrier sensing relationships. Additionally, to improve the learning speed, especially in the early stage of learning, we employ a game theory-based method to collect the training data independently of the neural network model. The simulation results indicate that the proposed method can appropriately control the channels when compared to extant methods.
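
The abstract above names three building blocks: a contention graph over APs, a graph convolution on that graph, and a double deep Q-network. The following is a minimal sketch of each, not the authors' implementation. It assumes 2-D AP positions, a single shared carrier sensing range, the common symmetric-normalized graph convolution, and the standard double DQN target; all function and parameter names are hypothetical.

```python
import numpy as np


def contention_graph(positions, cs_range):
    """Adjacency matrix of a contention graph (assumed construction).

    APs i and j are connected when their distance is at most cs_range.
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = (dist <= cs_range).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops in the raw graph
    return adj


def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    deg = a_hat.sum(axis=1)             # degree is >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)


def double_dqn_target(reward, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network picks the next action,
    the target network evaluates it."""
    best_action = int(np.argmax(q_online_next))
    return reward + gamma * q_target_next[best_action]
```

For example, with three APs at (0, 0), (1, 0), and (5, 0) and a carrier sensing range of 2, only the first two APs are connected; per-AP features (such as a one-hot encoding of the current channel, which is an assumption here) can then be passed through `gcn_layer` before a Q-value head.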

arXiv.org e-Print Archive

Crossref

Kyoto University Research Information Repository

A neural network model of adaptively timed reinforcement learning and hippocampal dynamics

Authors: Grossberg Stephen, Merrill John W. L.
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/01/1992

A neural model is described of how adaptively timed reinforcement learning occurs. The adaptive timing circuit is suggested to exist in the hippocampus, and to involve convergence of dentate granule cells on CA3 pyramidal cells, and NMDA receptors. This circuit forms part of a model neural system for the coordinated control of recognition learning, reinforcement learning, and motor learning, whose properties clarify how an animal can learn to acquire a delayed reward. Behavioral and neural data are summarized in support of each processing stage of the system. The relevant anatomical sites are in thalamus, neocortex, hippocampus, hypothalamus, amygdala, and cerebellum. Cerebellar influences on motor learning are distinguished from hippocampal influences on adaptive timing of reinforcement learning. The model simulates how damage to the hippocampal formation disrupts adaptive timing, eliminates attentional blocking, and causes symptoms of medial temporal amnesia. It suggests how normal acquisition of subcortical emotional conditioning can occur after cortical ablation, even though extinction of emotional conditioning is retarded by cortical ablation. The model simulates how increasing the duration of an unconditioned stimulus increases the amplitude of emotional conditioning, but does not change adaptive timing; and how an increase in the intensity of a conditioned stimulus "speeds up the clock", but an increase in the intensity of an unconditioned stimulus does not. Computer simulations of the model fit parametric conditioning data, including a Weber law property and an inverted U property. Both primary and secondary adaptively timed conditioning are simulated, as are data concerning conditioning using multiple interstimulus intervals (ISIs), gradually or abruptly changing ISIs, partial reinforcement, and multiple stimuli that lead to time-averaging of responses. Neurobiologically testable predictions are made to facilitate further tests of the model.
Air Force Office of Scientific Research (90-0175, 90-0128); Defense Advanced Research Projects Agency (90-0083); National Science Foundation (IRI-87-16960); Office of Naval Research (N00014-91-J-4100

Boston University Institutional Repository (OpenBU)

Reinforcement learning control for coordinated manipulation of multi-robots

Authors: Chen Long, Li Qingquan, Li Yanan, Tee Keng Peng
Publication venue: Elsevier BV
Publication date: 10/07/2015

In this paper, coordination control is investigated for multiple robots manipulating an object along a common desired trajectory. Both trajectory tracking and control input minimization are considered for each individual robot manipulator, such that possible disagreement between different manipulators can be handled. Reinforcement learning is employed to cope with the unknown dynamics of both the robots and the manipulated object. It is rigorously proven that the proposed method guarantees the coordination control of the multi-robot system under study. The validity of the proposed method is verified through simulation studies.

Sussex Research Online

Deep Reinforcement Learning for Multi-Agent Interaction

Authors: Ahmed Ibrahim H., Albrecht Stefano V., Brewitt Cillian, Carlucho Ignacio, Christianos Filippos, Dunion Mhairi, Fosong Elliot, Garcin Samuel, Guo Shangmin, Gyevnar Balint, McInroe Trevor, Papoudakis Georgios, Rahman Arrasy, Schäfer Lukas, Tamborski Massimiliano, Vecchio Giuseppe, Wang Cheng
Publication date: 02/08/2022

The development of autonomous agents which can interact with other agents to accomplish a given task is a core area of research in artificial intelligence and machine learning. Towards this goal, the Autonomous Agents Research Group develops novel machine learning algorithms for autonomous systems control, with a specific focus on deep reinforcement learning and multi-agent reinforcement learning. Research problems include scalable learning of coordinated agent policies and inter-agent communication; reasoning about the behaviours, goals, and composition of other agents from limited observations; and sample-efficient learning based on intrinsic motivation, curriculum learning, causal inference, and representation learning. This article provides a broad overview of the ongoing research portfolio of the group and discusses open problems for future directions.
Comment: Published in AI Communications Special Issue on Multi-Agent Systems Research in the U

arXiv.org e-Print Archive

Heriot Watt Pure