
    Decentralization of Multiagent Policies by Learning What to Communicate

    Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice, it is typical for both the encoding and the semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization-based, or learned. We present an agent architecture and training methodology that uses neural networks to learn task-oriented communication semantics from the example of a communication-unaware expert policy. A perimeter defense game illustrates the system's ability to handle dynamically changing numbers of agents and its graceful degradation in performance as communication constraints are tightened or the expert's observability assumptions are broken.
    Comment: 7 pages
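
    The architecture described above learns what to communicate by imitating a communication-unaware expert that observes the global state. The following is a minimal, hypothetical sketch of that idea under strong simplifying assumptions (linear encoder and policy, mean-pooled broadcast messages, a toy stand-in for the expert); it is not the paper's implementation.

```python
# Sketch: learn a decentralized policy by imitating an expert that sees the global state.
# All dimensions, the expert, and the training setup are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_AGENTS, OBS_DIM, MSG_DIM, ACT_DIM = 3, 4, 2, 2

W_enc = rng.normal(scale=0.1, size=(MSG_DIM, OBS_DIM))            # message encoder (shared)
W_pol = rng.normal(scale=0.1, size=(ACT_DIM, OBS_DIM + MSG_DIM))  # policy head (shared)

def expert_action(global_state):
    """Stand-in for the communication-unaware expert: it sees the full joint state."""
    return np.tanh(global_state.mean(axis=0))[:ACT_DIM]

def agent_step(obs_all):
    """Decentralized forward pass: encode, broadcast, act on local obs plus pooled messages."""
    msgs = np.tanh(obs_all @ W_enc.T)                  # each agent encodes its own observation
    pooled = msgs.mean(axis=0)                         # aggregate of all broadcast messages
    inputs = np.concatenate([obs_all, np.tile(pooled, (N_AGENTS, 1))], axis=1)
    return inputs, inputs @ W_pol.T                    # per-agent actions

lr = 1e-2
for _ in range(2000):
    obs_all = rng.normal(size=(N_AGENTS, OBS_DIM))     # local observations for this step
    inputs, actions = agent_step(obs_all)
    err = actions - expert_action(obs_all)             # imitation (squared-error) mismatch vs. expert
    W_pol -= lr * err.T @ inputs / N_AGENTS            # gradient step on the policy head
# The encoder W_enc is left fixed here for brevity; in practice both parts are trained jointly.
```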

    A Deep Reinforcement Learning Framework for Rebalancing Dockless Bike Sharing Systems

    Bike sharing provides an environmentally friendly way to travel and is booming all over the world. Yet, due to the high similarity of user travel patterns, bike imbalance occurs constantly, especially in dockless bike sharing systems, significantly impacting service quality and company revenue. Resolving such imbalance efficiently has therefore become a critical task for bike sharing systems. In this paper, we propose a novel deep reinforcement learning framework for incentivizing users to rebalance these systems. We model the problem as a Markov decision process and take both spatial and temporal features into consideration. We develop a novel deep reinforcement learning algorithm called Hierarchical Reinforcement Pricing (HRP), which builds upon the Deep Deterministic Policy Gradient algorithm. Unlike existing methods, which often ignore spatial information and rely heavily on accurate prediction, HRP captures both spatial and temporal dependencies using a divide-and-conquer structure with an embedded localized module. We conduct extensive experiments to evaluate HRP on a dataset from Mobike, a major Chinese dockless bike sharing company. Results show that HRP performs close to a 24-timeslot look-ahead optimization and outperforms state-of-the-art methods in both service level and bike distribution. It also transfers well when applied to unseen areas.
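
    The abstract frames incentive-based rebalancing as a Markov decision process with spatial and temporal features. The toy sketch below only illustrates those MDP ingredients (state: per-region bike counts over time slots; action: per-region price incentives; reward: rides served minus incentive cost). The transition model, demand numbers, and greedy baseline policy are hypothetical stand-ins, not the authors' HRP/DDPG agent or Mobike data.

```python
# Toy MDP for user-incentivized rebalancing; every quantity here is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)
N_REGIONS, N_SLOTS = 4, 24

def step(bikes, prices):
    """Toy transition: served rides leave regions; incentives pull returns toward priced regions."""
    demand = rng.poisson(3, size=N_REGIONS)                 # rides requested in each region
    served = np.minimum(bikes, demand)                      # rides actually served
    if prices.sum() > 0:
        returns = served.sum() * prices / prices.sum()      # returns skew toward incentivized regions
    else:
        returns = served.astype(float)                      # no incentives: bikes return where used
    bikes = bikes - served + np.round(returns).astype(int)
    reward = served.sum() - 0.1 * prices.sum()              # service level minus incentive cost
    return bikes, reward

def pricing_policy(bikes):
    """Greedy baseline: pay more for returns to under-supplied regions."""
    return np.maximum(bikes.mean() - bikes, 0.0)

bikes = rng.integers(0, 10, size=N_REGIONS)
total = 0.0
for _ in range(N_SLOTS):                                    # one episode = 24 time slots
    prices = pricing_policy(bikes)
    bikes, r = step(bikes, prices)
    total += r
print("episode return:", round(total, 2))
```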

    Credit assignment in multiple goal embodied visuomotor behavior

    The intrinsic complexity of the brain can lead one to set aside issues related to its relationship with the body, but the field of embodied cognition emphasizes that understanding brain function at the system level requires one to address the role of the brain-body interface. It has only recently been appreciated that this interface performs huge amounts of computation that does not have to be repeated by the brain, and thus affords the brain great simplifications in its representations. In effect, the brain's abstract states can refer to coded representations of the world created by the body. But even if the brain can communicate with the world through abstractions, the severe speed limitations of its neural circuitry mean that vast amounts of indexing must be performed during development so that appropriate behavioral responses can be rapidly accessed. One way this could happen would be if the brain used a decomposition whereby behavioral primitives could be quickly accessed and combined. This realization motivates our study of independent sensorimotor task solvers, which we call modules, in directing behavior. The issue we focus on herein is how an embodied agent can learn to calibrate such individual visuomotor modules while pursuing multiple goals. The biologically plausible standard for module programming is reinforcement given during exploration of the environment. However, this formulation contains a substantial issue when sensorimotor modules are used in combination: the credit for their overall performance must be divided amongst them. We show that this problem can be solved and that diverse task combinations are beneficial in learning, not a complication as usually assumed. Our simulations show that fast algorithms are available that allot credit correctly and are insensitive to measurement noise.
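
    The core difficulty named above, dividing the credit for a single behavioral outcome among concurrently active modules, can be illustrated with a small toy model. The sketch below assumes additive per-module contributions plus observation noise and recovers them with batch least squares; the activation pattern, values, and noise level are hypothetical, and this is an illustration of the credit-division problem rather than the authors' fast online algorithm.

```python
# Toy credit assignment: only a summed, noisy reward is observed for each subset of
# active modules; per-module contributions are recovered by least squares (assumptions only).
import numpy as np

rng = np.random.default_rng(2)
N_MODULES, N_TRIALS = 3, 500
true_value = np.array([1.0, 2.5, -0.5])            # hidden per-module contribution (hypothetical)

# Each trial activates a random subset of modules and yields one noisy total reward.
active = rng.integers(0, 2, size=(N_TRIALS, N_MODULES)).astype(float)
reward = active @ true_value + rng.normal(scale=0.5, size=N_TRIALS)

# Divide credit: solve for per-module values that best explain the observed totals.
estimate, *_ = np.linalg.lstsq(active, reward, rcond=None)
print("estimated per-module credit:", np.round(estimate, 2))
```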