81 research outputs found

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation can be achieved by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines a global flocking maintenance term, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication among a team of robots with swarming behavior for musical creation.
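    The abstract above names three reward components (global flocking maintenance, mutual reward, collision penalty) but gives no formulas. The following is only a minimal sketch of how such a per-UAV reward might be composed. The function name `flocking_reward`, the distance-based form of each term, and every weight and threshold (`d_ref`, `d_min`, `w_flock`, `w_mutual`, `w_coll`) are illustrative assumptions, not the thesis's definitions.

```python
import math


def flocking_reward(pos, leader, neighbors,
                    d_ref=1.0, d_min=0.3,
                    w_flock=1.0, w_mutual=0.5, w_coll=10.0):
    """Hypothetical per-UAV reward with three terms (all forms assumed)."""
    # Flocking maintenance (assumed form): stay at distance d_ref from the leader.
    flock = -abs(math.dist(pos, leader) - d_ref)

    # Mutual reward (assumed form): keep neighbor spacing close to d_ref on average.
    dists = [math.dist(pos, n) for n in neighbors]
    mutual = -sum(abs(d - d_ref) for d in dists) / len(dists) if dists else 0.0

    # Collision penalty (assumed form): fixed cost per neighbor closer than d_min.
    collisions = sum(1 for d in dists if d < d_min)

    return w_flock * flock + w_mutual * mutual - w_coll * collisions


# A follower at the reference distance from both leader and neighbor
r_ok = flocking_reward((0.0, 0.0), (1.0, 0.0), [(0.0, 1.0)])
# The same follower with a neighbor inside the assumed safety radius
r_close = flocking_reward((0.0, 0.0), (1.0, 0.0), [(0.0, 0.1)])
```

    Under these assumptions the second call scores much lower than the first, because the collision term dominates once a neighbor is inside `d_min`. In a DDPG setup, a scalar like this would be the reward signal fed to the critic during centralized training.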

    Experience Sharing Between Cooperative Reinforcement Learning Agents

    The idea of experience sharing between cooperative agents naturally emerges from our understanding of how humans learn. Our evolution as a species is tightly linked to the ability to exchange learned knowledge with one another. It follows that experience sharing (ES) between autonomous and independent agents could become the key to accelerating learning in cooperative multiagent settings. We investigate whether randomly selecting experiences to share can increase the performance of deep reinforcement learning agents, and propose three new methods for selecting experiences to accelerate the learning process. Firstly, we introduce Focused ES, which prioritizes unexplored regions of the state space. Secondly, we present Prioritized ES, in which temporal-difference error is used as a measure of priority. Finally, we devise Focused Prioritized ES, which combines both previous approaches. The methods are empirically validated in a control problem. While sharing randomly selected experiences between two Deep Q-Network agents shows no improvement over a single-agent baseline, we show that the proposed ES methods can successfully outperform the baseline. In particular, Focused ES accelerates learning by a factor of 2, reducing the number of episodes required to complete the task by 51%. Comment: Published at the Proceedings of the 31st IEEE International Conference on Tools with Artificial Intelligence
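    To make the Prioritized ES idea concrete (temporal-difference error as the priority for choosing what to share), here is a minimal sketch. It swaps the paper's Deep Q-Network for a tabular Q-function stored in a dict so the example stays self-contained. The names `Transition`, `td_error`, and `select_to_share`, and the choice to rank by the sender's own TD error and share the top `k`, are assumptions made for illustration.

```python
import heapq
from typing import NamedTuple


class Transition(NamedTuple):
    state: int
    action: int
    reward: float
    next_state: int
    done: bool


def td_error(q, t, actions, gamma=0.99):
    """Absolute one-step TD error of transition t under the tabular Q-function q."""
    # Terminal transitions have no bootstrapped value.
    next_value = 0.0 if t.done else max(q.get((t.next_state, a), 0.0) for a in actions)
    target = t.reward + gamma * next_value
    return abs(target - q.get((t.state, t.action), 0.0))


def select_to_share(q, buffer, actions, k, gamma=0.99):
    """Pick the k transitions with the largest TD error (assumed sharing rule)."""
    return heapq.nlargest(k, buffer, key=lambda t: td_error(q, t, actions, gamma))


q_table = {}  # untrained agent: every Q-value defaults to 0.0
buffer = [
    Transition(0, 0, 0.0, 1, False),
    Transition(1, 1, 1.0, 2, True),
    Transition(2, 0, 0.5, 3, False),
]
shared = select_to_share(q_table, buffer, actions=(0, 1), k=2)
```

    With an all-zero Q-table the TD error equals the absolute reward, so `shared` holds the transitions with rewards 1.0 and 0.5, in that order. A receiving agent would then add them to its own replay buffer.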

    Human-machine communication for educational systems design

    This book contains the papers presented at the NATO Advanced Study Institute (ASI) on the Basics of man-machine communication for the design of educational systems, held August 16-26, 1993, in Eindhoven, The Netherlands.

    Learning and Co-operation in Mobile Multi-Robot Systems

    Merged with duplicate record 10026.1/1984 on 27.02.2017 by CS (TIS). This thesis addresses the problem of setting the balance between exploration and exploitation in teams of learning robots that exchange information. Specifically, it looks at groups of robots whose tasks include moving between salient points in the environment. To deal with unknown and dynamic environments, such robots need to be able to discover and learn the routes between these points themselves. A natural extension of this scenario is to allow the robots to exchange learned routes so that only one robot needs to learn a route for the whole team to use that route. One contribution of this thesis is to identify a dilemma created by this extension: once one robot has learned a route between two points, all other robots will follow that route without looking for shorter versions. This trade-off will be labeled the Distributed Exploration vs. Exploitation Dilemma, since increasing distributed exploitation (allowing robots to exchange more routes) means decreasing distributed exploration (reducing robots' ability to learn new versions of routes), and vice versa. At different times, teams may be required with different balances of exploitation and exploration. The main contribution of this thesis is to present a system for setting the balance between exploration and exploitation in a group of robots. This system is demonstrated through experiments involving simulated robot teams. The experiments show that increasing and decreasing the value of a parameter of the novel system will lead to a significant increase and decrease, respectively, in average exploitation (and an equivalent decrease and increase in average exploration) over a series of team missions. A further set of experiments shows that this holds true for a range of team sizes and numbers of goals.
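    The abstract does not say what the tunable parameter is or how the system uses it. Purely to illustrate the trade-off, the toy sketch below assumes a single hypothetical parameter `exploit_prob`: the probability that a robot reuses a route shared by a teammate instead of searching for its own. The functions `choose_mode` and `exploitation_rate` and the team and mission sizes are likewise assumptions, not the thesis's system.

```python
import random


def choose_mode(has_shared_route, exploit_prob, rng):
    """Return 'exploit' (reuse a teammate's route) or 'explore' (search for a new one)."""
    if has_shared_route and rng.random() < exploit_prob:
        return "exploit"
    return "explore"


def exploitation_rate(exploit_prob, n_robots=10, n_missions=100, seed=0):
    """Fraction of robot decisions that reuse a shared route over a series of missions."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    decisions = [
        choose_mode(True, exploit_prob, rng)  # assume a shared route is always on offer
        for _ in range(n_missions * n_robots)
    ]
    return decisions.count("exploit") / len(decisions)


low = exploitation_rate(0.2)
high = exploitation_rate(0.8)
```

    In this toy model, raising `exploit_prob` raises average exploitation and lowers average exploration by the same amount, which mirrors the qualitative relationship reported in the abstract.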

    Basics of man-machine communication for the design of educational systems : NATO Advanced Study Institute, August 16-26, 1993, Eindhoven, The Netherlands
