81 research outputs found

Adaptive and learning-based formation control of swarm robots

Author: Salimi Mahsoo
Publication date: 14/10/2021

Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, a reward function is added that combines global flocking maintenance, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.

Simon Fraser University Institutional Repository

Experience Sharing Between Cooperative Reinforcement Learning Agents

Authors: Ralha Celia Ghedini, Ramos Gabriel de Oliveira, Souza Lucas Oliveira
Publication date: 05/11/2019

The idea of experience sharing between cooperative agents naturally emerges from our understanding of how humans learn. Our evolution as a species is tightly linked to the ability to exchange learned knowledge with one another. It follows that experience sharing (ES) between autonomous and independent agents could become the key to accelerating learning in cooperative multiagent settings. We investigate whether randomly selecting experiences to share can increase the performance of deep reinforcement learning agents, and propose three new methods for selecting experiences to accelerate the learning process. Firstly, we introduce Focused ES, which prioritizes unexplored regions of the state space. Secondly, we present Prioritized ES, in which temporal-difference error is used as a measure of priority. Finally, we devise Focused Prioritized ES, which combines both previous approaches. The methods are empirically validated in a control problem. While sharing randomly selected experiences between two Deep Q-Network agents shows no improvement over a single agent baseline, we show that the proposed ES methods can successfully outperform the baseline. In particular, the Focused ES accelerates learning by a factor of 2, reducing by 51% the number of episodes required to complete the task. Comment: Published at the Proceedings of the 31st IEEE International Conference on Tools with Artificial Intelligence

arXiv.org e-Print Archive

Crossref

Organisations as complex adaptive systems : implications for the design of information systems

Author: Prasad Kumkum
Publication date: 01/01/1999

Today a paradigm shift in the field of organisation and management theories is no longer disputed and the need to switch from the Command-and-Control to the Learning Organisation Paradigm (LOP) in the area of organisational theory is well understood. However, it is less well appreciated that learning organisations cannot operate effectively if supported by centralised databases and tailor-made application programs. LOP emphasises adaptability, flexibility, participation and learning. It is important to understand that the changes in organisational and management strategies will not on their own be able to produce the desired effects unless they are supported by appropriate changes in organisational culture, and by effective information systems. This research demonstrates that conventional information system strategies and development methods are no longer adequate. Information system strategies must respond to these needs of the LOP and incorporate new information systems that are capable of evolving, adapting and responding to the constantly changing business environment. The desired adaptability, flexibility and agility in information systems for LOP can be achieved by exploiting the technologies of the Internet, World Wide Web, intelligent agents and intranets. This research establishes that there is a need for synergy between organisational structures and organisational information systems. To obtain this desired synergy it is essential that new information systems be designed as an integral part of the learning organisational structure itself. Complexity theory provides a new set of metaphors and a host of concepts for the understanding of organisations as complex adaptive systems. This research introduces the principles of Complex Adaptive Systems and draws on their significance for designing the information systems needed to support the new generation of learning organisations.
The search for new models of information system strategies for today's dynamic world of business points to the 'swarm models' observed in Nature.

Open Research Online (The Open University)

Human-machine communication for educational systems design

Publication venue: Technische Universiteit Eindhoven, Institute for Perception Research
Publication date: 01/01/1993

This book contains the papers presented at the NATO Advanced Study Institute (ASI) on the Basics of man-machine communication for the design of educational systems, held August 16-26, 1993, in Eindhoven, The Netherlands.

Pure OAI Repository

Learning and Co-operation in Mobile Multi-Robot Systems

Author: Kirke Alexis John
Publication venue: University of Plymouth
Publication date: 01/01/1997

Merged with duplicate record 10026.1/1984 on 27.02.2017 by CS (TIS). This thesis addresses the problem of setting the balance between exploration and exploitation in teams of learning robots that exchange information. Specifically, it looks at groups of robots whose tasks include moving between salient points in the environment. To deal with unknown and dynamic environments, such robots need to be able to discover and learn the routes between these points themselves. A natural extension of this scenario is to allow the robots to exchange learned routes so that only one robot needs to learn a route for the whole team to use that route. One contribution of this thesis is to identify a dilemma created by this extension: that once one robot has learned a route between two points, all other robots will follow that route without looking for shorter versions. This trade-off will be labeled the Distributed Exploration vs. Exploitation Dilemma, since increasing distributed exploitation (allowing robots to exchange more routes) means decreasing distributed exploration (reducing robots' ability to learn new versions of routes), and vice versa. At different times, teams with different balances of exploitation and exploration may be required. The main contribution of this thesis is to present a system for setting the balance between exploration and exploitation in a group of robots. This system is demonstrated through experiments involving simulated robot teams. The experiments show that increasing and decreasing the value of a parameter of the novel system will lead to a significant increase and decrease respectively in average exploitation (and an equivalent decrease and increase in average exploration) over a series of team missions. A further set of experiments shows that this holds true for a range of team sizes and numbers of goals.

Plymouth Electronic Archive and Research Library

Self-regulated Multi-robot Task Allocation

Author: Sarker Md Omar Faruque
Publication date: 01/12/2010

University of South Wales Research Explorer

Basics of man-machine communication for the design of educational systems : NATO Advanced Study Institute, August 16-26, 1993, Eindhoven, The Netherlands

Publication venue: Technische Universiteit Eindhoven, Institute for Perception Research
Publication date: 01/01/1993

Pure OAI Repository
