Providing Informative Feedback for Learning in Tightly Coupled Multiagent Domains
Autonomous agents that sense, decide, act, and coordinate effectively with each other are critical in many real-world domains such as autonomous driving, search and rescue missions, air traffic management, and underwater or deep space exploration. All such domains share a key difficulty: though high-level mission goals are clear to system designers, the agent behaviors that achieve those goals are not.
Thus, system designers aim to use adaptive approaches such as reinforcement learning (RL) or evolutionary algorithms (EA) to discover the ideal behaviors for the agents, and these behaviors are often implemented as computational policies (for example, artificial neural networks) that map sensory inputs to actions or values. But for such learning systems to be successful, they need to leverage system feedback (based on the agents' collective performance) to revise and update the agents' policies for interacting with the environment.
Unfortunately, both RL and EA approaches struggle when the environmental feedback is sparse and/or uninformative, especially in multiagent domains where teasing out an agent’s contribution to the system is difficult. Reward shaping methods address some of this difficulty, but they also suffer when faced with tightly coupled multiagent domains where feedback depends on multiple agents taking the correct joint action at the appropriate time.
This work introduces Reward-Shaped Curriculum Learning, Fitness Critics, and Bidirectional Fitness Critics to address the challenge of sparse feedback in tightly coupled multiagent domains.
Reward-Shaped Curriculum Learning trains agents on successively more complex scenarios, which enables agents to use reward shaping to discover the correct actions first and then coordinate on the complex tasks. The impact of this approach is to reduce the sparsity of the reward. Fitness Critics directly address the sparse-feedback problem by replacing the system reward with a step-by-step performance metric that maps step-wise observations and actions to meaningful evaluations that identify desirable behaviors. The impact of this approach is to turn a sparse, policy-based reward into a dense, state-action-based reward that trains agents for specific behaviors. Bidirectional Fitness Critics extend Fitness Critics to provide more informative feedback by leveraging temporal information about the reward and the relevance of that information to the task. The impact of this approach is to more accurately capture each agent's contribution to the desired behavior.
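To make the fitness-critic idea concrete, here is a minimal sketch, not the authors' implementation: the network shapes, names, and the training target are illustrative assumptions. A small regressor learns to map each step's state-action pair to a prediction of the episode's final fitness, and its per-step predictions then stand in for the sparse end-of-episode score.

```python
# Minimal fitness-critic sketch (hypothetical shapes and names, not the
# paper's code): regress each (state, action) pair toward the episodic
# fitness so that one sparse score becomes dense per-step feedback.
import torch
import torch.nn as nn

class FitnessCritic(nn.Module):
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar fitness estimate for this step
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)

def train_step(critic, optimizer, states, actions, episode_fitness):
    """Regress every step's prediction toward the episode's final score.

    states:  (T, state_dim) observations from one episode
    actions: (T, action_dim) actions taken at each step
    episode_fitness: the single sparse score received at episode end
    """
    preds = critic(states, actions)                   # (T,)
    target = torch.full_like(preds, episode_fitness)  # broadcast sparse score
    loss = nn.functional.mse_loss(preds, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Regressing every step toward the same episodic score is the simplest possible target; even so, it converts a single sparse signal into per-step feedback the policy learner can act on.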
Scaling Up Multiagent Reinforcement Learning for Robotic Systems: Learn an Adaptive Sparse Communication Graph
The complexity of multiagent reinforcement learning (MARL) in multiagent systems increases exponentially with the number of agents. This scalability issue prevents MARL from being applied in large-scale multiagent systems. However, one critical feature of MARL that is often neglected is that the interactions between agents are quite sparse. Without exploiting this sparsity structure, existing works aggregate information from all of the agents and thus have a high sample complexity. To address this issue, we propose an adaptive sparse attention mechanism by generalizing a sparsity-inducing activation function. A sparse communication graph in MARL is then learned by graph neural networks based on this new attention mechanism. Through this sparsity structure, the agents can communicate effectively and efficiently by selectively attending only to the agents that matter most, reducing the scale of the MARL problem with little optimality compromised. Comparative results show that our algorithm can learn an interpretable sparse structure and outperforms previous works by a significant margin on applications involving large-scale multiagent systems.
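The family of sparsity-inducing activations referred to here includes sparsemax (Martins & Astudillo, 2016), which projects attention scores onto the probability simplex and drives low scores exactly to zero. The paper generalizes such an activation, so the sketch below illustrates only the base mechanism, not the paper's exact variant.

```python
# Sketch of a sparsity-inducing attention activation in the spirit of
# sparsemax (Martins & Astudillo, 2016); the paper's generalization may
# differ from this baseline illustration.
import numpy as np

def sparsemax(z: np.ndarray) -> np.ndarray:
    """Project scores z onto the probability simplex, zeroing small entries."""
    z_sorted = np.sort(z)[::-1]                      # scores in descending order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    support = z_sorted + (1.0 - cumsum) / k > 0      # which entries survive
    k_max = k[support][-1]                           # size of the support set
    tau = (cumsum[k_max - 1] - 1.0) / k_max          # shared threshold
    return np.maximum(z - tau, 0.0)

# Attention over 5 neighboring agents: softmax would weight all of them,
# sparsemax attends only to the ones that matter.
scores = np.array([2.0, 1.5, 0.1, -0.3, -1.0])
weights = sparsemax(scores)
print(weights)        # [0.75 0.25 0.   0.   0.  ]
print(weights.sum())  # 1.0
```

Unlike softmax, which assigns every agent a nonzero weight, the exactly-zero entries here prune edges from the communication graph outright, which is what makes the learned graph sparse and interpretable.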
Aerospace Cyber-Physical Systems Education
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/106495/1/AIAA2013-4809.pd
Multi-Reward Learning and Sparse Rewards
Reinforcement learning has made impressive strides in solving problems in challenging domains, but problems are increasingly being described with sparse rewards. Sparse rewards directly reduce the rate at which useful feedback is provided to the learner and make it difficult to determine which specific actions led to a reward. This greatly slows learning or thwarts it entirely. Some approaches combat the difficulty of learning under sparsity by using multi-reward schemes. These schemes utilize more rewards than just the true system evaluation, for example by providing exploration incentives or by decomposing the task into a hierarchy of policies, each with a different reward. There are also techniques that do not rely on multiple rewards, such as reward shaping or transfer learning. A key insight is that these techniques are orthogonal: multi-reward schemes can receive further benefits by applying the other techniques. This project explores various multi-reward strategies and alternative solutions to sparse rewards to find intelligent ways to combine these methods. We provide three specific examples, combining intrinsic rewards with transfer learning, imitation learning with policy combination, and hierarchical reinforcement learning with reward shaping, in ways that extend the current state of the art. To demonstrate practical usage of these techniques, we describe their application to a sparsely rewarded underwater manipulation problem.
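As a hedged illustration of the first combination named above, intrinsic rewards layered on a sparse extrinsic signal, the following sketch adds a hypothetical count-based novelty bonus to the task reward. The coefficient, the decay schedule, and the state discretization are illustrative assumptions, not the project's actual design.

```python
# Sketch of a multi-reward signal: a sparse extrinsic task reward plus a
# hypothetical count-based intrinsic exploration bonus (all names and
# coefficients are illustrative only).
from collections import defaultdict
import math

class IntrinsicRewardWrapper:
    def __init__(self, beta: float = 0.1):
        self.beta = beta                    # weight on the exploration bonus
        self.visit_counts = defaultdict(int)

    def combined_reward(self, state, extrinsic_reward: float) -> float:
        """Dense training signal = sparse task reward + decaying novelty bonus."""
        key = tuple(state)                  # assumes a discretizable state
        self.visit_counts[key] += 1
        bonus = self.beta / math.sqrt(self.visit_counts[key])
        return extrinsic_reward + bonus

# Even while the task reward is zero, the learner still receives feedback,
# and the bonus fades as states become familiar.
shaper = IntrinsicRewardWrapper(beta=0.1)
print(shaper.combined_reward((0, 0), extrinsic_reward=0.0))  # 0.1
print(shaper.combined_reward((0, 0), extrinsic_reward=0.0))  # ~0.0707
```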
Coalition based approach for shop floor agility – a multiagent approach
Dissertation submitted for a PhD degree in Electrical Engineering, speciality of Robotics and Integrated Manufacturing, from the Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia.
This thesis addresses the problem of shop floor agility. In order to cope with the disturbances and uncertainties that characterise the current business scenarios faced by manufacturing companies, the capability of their shop floors needs to be improved quickly, such that these shop floors may be adapted, changed, or easily modified (shop floor reengineering).
One of the critical elements in any shop floor reengineering process is the way the control/supervision architecture is changed or modified to accommodate the new processes and equipment. This thesis, therefore, proposes an architecture to support fast adaptation or change of the control/supervision architecture. This architecture postulates that manufacturing systems are no more than compositions of modularised manufacturing components whose interactions, when aggregated, are governed by contractual mechanisms that favour configuration over reprogramming.
A multiagent-based reference architecture called Coalition Based Approach for Shop floor Agility (CoBASA) was created to support fast adaptation and changes of shop floor control architectures with minimal effort. The coalitions are composed of agentified manufacturing components (modules), whose relationships within each coalition are governed by contracts configured whenever the coalition is established. Creating and changing a coalition involve no programming effort, because they require only changes to the contract that regulates the coalition.
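A rough sketch of the "configuration over reprogramming" principle follows; all types and names here are hypothetical, not the thesis's actual interfaces. A coalition is assembled, and later modified, by editing a contract object rather than by writing new control code.

```python
# Hypothetical illustration of CoBASA-style coalitions (invented names, not
# the thesis's interfaces): shop-floor change means reconfiguring a contract,
# not reprogramming the agentified modules it binds together.
from dataclasses import dataclass, field

@dataclass
class Contract:
    """Configured (not programmed) rules governing a coalition."""
    coordinator: str
    members: list[str]
    skills: dict[str, str]                       # member -> contributed skill
    terms: dict[str, str] = field(default_factory=dict)

@dataclass
class Coalition:
    contract: Contract

    def reconfigure(self, **new_terms: str) -> None:
        # Adapting the cell = editing contract terms, nothing is recompiled.
        self.contract.terms.update(new_terms)

cell = Coalition(Contract(
    coordinator="cell_controller",
    members=["robot_arm", "conveyor", "fixture"],
    skills={"robot_arm": "pick_place", "conveyor": "transport",
            "fixture": "clamp"},
))
cell.reconfigure(cycle_time="12s", batch_size="50")
```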
Towards Continual Reinforcement Learning: A Review and Perspectives
In this article, we aim to provide a literature review of different
formulations and approaches to continual reinforcement learning (RL), also
known as lifelong or non-stationary RL. We begin by discussing our perspective
on why RL is a natural fit for studying continual learning. We then provide a
taxonomy of different continual RL formulations and mathematically characterize
the non-stationary dynamics of each setting. We go on to discuss evaluation of
continual RL agents, providing an overview of benchmarks used in the literature
and important metrics for understanding agent performance. Finally, we
highlight open problems and challenges in bridging the gap between the current
state of continual RL and findings in neuroscience. While still in its early
days, the study of continual RL has the promise to develop better incremental
reinforcement learners that can function in increasingly realistic applications
where non-stationarity plays a vital role. These include applications such as
those in the fields of healthcare, education, logistics, and robotics.Comment: Preprint, 52 pages, 8 figure
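As a toy illustration of the non-stationarity such formulations characterize (all numbers and names below are illustrative, not drawn from the review): a bandit whose reward probabilities drift over time, paired with the classic constant-step-size fix that keeps value estimates tracking a moving target.

```python
# Toy non-stationary setting: reward probabilities drift, so a policy that
# was once optimal degrades and the learner must keep adapting.
import random

class DriftingBandit:
    def __init__(self, n_arms: int = 3, drift: float = 0.01):
        self.probs = [random.random() for _ in range(n_arms)]
        self.drift = drift

    def step(self, arm: int) -> float:
        reward = 1.0 if random.random() < self.probs[arm] else 0.0
        # The environment shifts under the agent: a bounded random walk.
        self.probs = [min(1.0, max(0.0, p + random.gauss(0.0, self.drift)))
                      for p in self.probs]
        return reward

# Epsilon-greedy with a constant step size: unlike a 1/n sample average,
# a fixed learning rate weights recent experience and tracks the drift.
env = DriftingBandit()
q = [0.0] * 3
for t in range(10_000):
    arm = random.randrange(3) if random.random() < 0.1 else q.index(max(q))
    r = env.step(arm)
    q[arm] += 0.1 * (r - q[arm])
print(q)
```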