Collisionless Pattern Discovery in Robot Swarms Using Deep Reinforcement Learning
We present a deep reinforcement learning-based framework for automatically
discovering patterns available in any given initial configuration of fat robot
swarms. In particular, we model the problem of collision-less gathering and
mutual visibility in fat robot swarms and discover patterns for solving them
using our framework. We show that by shaping reward signals based on certain
constraints like mutual visibility and safe proximity, the robots can discover
collision-less trajectories leading to well-formed gathering and visibility
patterns.
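The reward-shaping idea described in this abstract can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: the term weights, the segment-based visibility test, and the function signature are all assumptions.

```python
import numpy as np

def shaped_reward(positions, radius, goal, safe_margin=0.1):
    """Illustrative shaped reward for one fat (disk-shaped) robot.

    positions: (N, 2) array of robot centers; robot 0 is the agent.
    radius: common body radius of the fat robots (assumed).
    goal: (2,) target point of the gathering pattern.
    """
    agent, others = positions[0], positions[1:]
    dists = np.linalg.norm(others - agent, axis=1)

    # Safe-proximity term: penalize near-collisions between fat bodies.
    collision_pen = -1.0 * np.sum(dists < 2 * radius + safe_margin)

    # Mutual-visibility term: reward each peer whose line of sight to the
    # agent is not blocked by a third robot's body (crude check: distance
    # from every other robot to the segment agent -> peer).
    vis_bonus = 0.0
    for i, peer in enumerate(others):
        seg = peer - agent
        seg_len = np.linalg.norm(seg)
        blocked = False
        for j, obst in enumerate(others):
            if j == i or seg_len == 0:
                continue
            t = np.clip(np.dot(obst - agent, seg) / seg_len**2, 0.0, 1.0)
            if np.linalg.norm(agent + t * seg - obst) < radius:
                blocked = True
                break
        if not blocked:
            vis_bonus += 0.1

    # Progress term: small pull toward the gathering point.
    progress = -0.01 * np.linalg.norm(goal - agent)
    return collision_pen + vis_bonus + progress
```

Under this shaping, trajectories that preserve visibility and avoid overlap accumulate higher return, which is the mechanism the abstract credits for the discovered patterns.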
Learning a Swarm Foraging Behavior with Microscopic Fuzzy Controllers Using Deep Reinforcement Learning
This article presents a macroscopic swarm foraging behavior obtained using deep reinforcement learning. The selected behavior is a complex task in which a group of simple agents must be directed towards an object to move it to a target position without the use of special gripping mechanisms, using only their own bodies. Our system has been designed to use and combine basic fuzzy behaviors to control obstacle avoidance and the low-level rendezvous processes needed for the foraging task. We use a realistically modeled swarm based on differential robots equipped with light detection and ranging (LiDAR) sensors. It is important to highlight that the obtained macroscopic behavior, in contrast to that of end-to-end systems, combines existing microscopic tasks, which allows us to apply these learning techniques even with the dimensionality and complexity of the problem in a realistic robotic swarm system. The presented behavior is capable of correctly developing the macroscopic foraging task in a robust and scalable way, even in situations that have not been seen in the training phase. An exhaustive analysis of the obtained behavior is carried out, in which both the movement of the swarm while performing the task and the swarm's scalability are analyzed. This work was supported by the Ministerio de Ciencia, Innovación y Universidades (Spain), project RTI2018-096219-B-I00, co-financed with FEDER funds.
Adaptive and learning-based formation control of swarm robots
Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in robot swarms with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation control could be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., the BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for the adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for a UAV swarm using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines a global flocking-maintenance term, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walks to control the communication between a team of robots with swarming behavior for musical creation.
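The centralized-training, decentralized-execution (CTDE) setup this abstract describes can be sketched schematically. This is not the thesis's implementation: the networks are reduced to single linear layers, the dimensions are invented, and the replay buffer, target networks, and gradient updates that full DDPG requires are omitted; only the data flow is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

N_UAVS, OBS_DIM, ACT_DIM = 4, 6, 2
GLOBAL_DIM = N_UAVS * OBS_DIM          # flattened global state matrix

# Shared actor (decentralized execution): every UAV runs the same
# policy on its own local observation only.
W_actor = rng.normal(scale=0.1, size=(ACT_DIM, OBS_DIM))

# Centralized critic (used only during training): scores the global
# state together with the joint action of all UAVs.
W_critic = rng.normal(scale=0.1, size=(1, GLOBAL_DIM + N_UAVS * ACT_DIM))

def act(local_obs):
    """Deterministic policy: tanh-squashed linear map (DDPG-style)."""
    return np.tanh(W_actor @ local_obs)

def critic(global_state, joint_action):
    """Q(s, a) over the *global* state and *joint* action."""
    x = np.concatenate([global_state.ravel(), joint_action.ravel()])
    return float(W_critic @ x)

# Decentralized execution: each UAV acts on local information alone.
local_obs = rng.normal(size=(N_UAVS, OBS_DIM))
joint_action = np.stack([act(o) for o in local_obs])

# A centralized training step would query the critic with everything
# and backpropagate through both networks.
q_value = critic(local_obs, joint_action)
```

The asymmetry is the point: the critic sees the global state matrix during training, while the deployed actors never need more than their local observations.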
Improving Robotic Decision-Making in Unmodeled Situations
Existing methods of autonomous robotic decision-making are often fragile when faced with inaccurate or incompletely modeled distributions of uncertainty, also known as ambiguity. While decision-making under ambiguity is a field of study that has been gaining interest, many existing methods tend to be computationally challenging, require many assumptions about the nature of the problem, and often require substantial prior knowledge. Therefore, they do not scale well to complex real-world problems, where fulfilling all of these requirements is often impractical if not impossible. The research described in this dissertation investigates novel approaches to robotic decision-making that are resilient to ambiguity while not being subject to as many of these requirements as most existing methods. The novel frameworks described in this research incorporate physical feedback, diversity, and local swarm interactions, three factors that are hypothesized to be key in creating resilience to ambiguity. These three factors are inspired by examples of robots which demonstrate resilience to ambiguity, ranging from simple vibrobots to decentralized robotic swarms. The proposed decision-making methods, based around a framework known as Ambiguity Trial and Error (AT&E), are tested for both single robots and robotic swarms in several simulated robotic foraging case studies and a real-world robotic foraging experiment. A novel method for transferring swarm resilience properties back to single-agent decision-making is also explored. The results from the case studies show that the proposed methods demonstrate resilience to varying types of ambiguity, both stationary and non-stationary, while not requiring accurate modeling and assumptions, large amounts of prior training data, or computationally expensive decision-making policy solvers. Conclusions about these novel methods are then drawn from the simulation and experiment results, and future research directions leveraging the lessons learned from this research are discussed.
CoMIX: A Multi-agent Reinforcement Learning Training Architecture for Efficient Decentralized Coordination and Independent Decision Making
Robust coordination skills enable agents to operate cohesively in shared
environments, together towards a common goal and, ideally, individually without
hindering each other's progress. To this end, this paper presents Coordinated
QMIX (CoMIX), a novel training framework for decentralized agents that enables
emergent coordination through flexible policies while allowing independent
decision-making at the individual level. CoMIX models selfish and
collaborative behavior as incremental steps in each agent's decision process,
allowing agents to dynamically adapt their behavior to different situations
while balancing independence and collaboration. Experiments in a variety of
simulation environments demonstrate that CoMIX outperforms baselines on
collaborative tasks. The results validate our incremental policy approach as
an effective technique for improving coordination in multi-agent systems.
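The incremental selfish-then-collaborative decision process this abstract describes can be illustrated with a toy sketch. The actual CoMIX architecture (built on QMIX-style value mixing) is more involved; the linear maps, the message format, and the `coord_weight` parameter here are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N_ACTIONS = 4

def selfish_q(obs, W):
    """Step 1: independent Q-values from the agent's own observation."""
    return W @ obs

def coordination_bias(neighbor_msgs, V):
    """Step 2: an additive correction computed from neighbors' messages,
    nudging the agent toward collaborative actions."""
    if len(neighbor_msgs) == 0:
        return np.zeros(N_ACTIONS)
    return V @ np.mean(neighbor_msgs, axis=0)

def decide(obs, neighbor_msgs, W, V, coord_weight=0.5):
    """Incremental decision: selfish Q-values first, then a weighted
    coordination correction; acting alone is just the no-message case."""
    q = selfish_q(obs, W) + coord_weight * coordination_bias(neighbor_msgs, V)
    return int(np.argmax(q))

OBS_DIM, MSG_DIM = 5, 3
W = rng.normal(size=(N_ACTIONS, OBS_DIM))
V = rng.normal(size=(N_ACTIONS, MSG_DIM))

obs = rng.normal(size=OBS_DIM)
msgs = [rng.normal(size=MSG_DIM) for _ in range(2)]

solo_action = decide(obs, [], W, V)      # independent decision
coop_action = decide(obs, msgs, W, V)    # coordination-adjusted decision
```

Because the coordination term is a separate, additive step, an agent that receives no messages degrades gracefully to its independent policy rather than stalling.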
A Comprehensive Review on Autonomous Navigation
The field of autonomous mobile robots has undergone dramatic advancements
over the past decades. Despite achieving important milestones, several
challenges are yet to be addressed. Aggregating the achievements of the
robotics community in survey papers is vital to keep track of the current
state of the art and the challenges that must be tackled in the future. This
paper tries to provide a comprehensive review of autonomous mobile robots
covering topics such as sensor types, mobile robot platforms, simulation tools,
path planning and following, sensor fusion methods, obstacle avoidance, and
SLAM. The motivation for this survey is twofold. First, the autonomous
navigation field evolves quickly, so writing survey papers regularly is
crucial to keep the research community well aware of the current status of
the field.
Second, deep learning methods have revolutionized many fields including
autonomous navigation. Therefore, this paper also gives an appropriate
treatment of the role of deep learning in autonomous navigation. Future work
and research gaps are discussed as well.
Evolutionary Swarm Robotics using Epigenetics Learning in Dynamic Environment
Intelligent robots have been widely studied and investigated as a way to replace humans in fulfilling complex missions in hazardous environments. Lately, swarm robotics, a group of collaborative robots, has become popular because it offers benefits over a single intelligent system. Many strategies have been developed to achieve collective and decentralised control using evolutionary algorithms. However, since the evolutionary algorithm relies principally on an individual fitness function to explore the solution space, achieving swarm robotics' collaborative behaviour in a dynamic environment becomes a problem. This is due to the lack of adaptation in most of the evolutionary methods. In order to thrive in such environments, external stimuli and rewards from the environment should be utilised as ``knowledge'' to achieve the intelligent behaviour currently lacking in evolutionary swarm robotics. The aims of this research are: (1) to develop novel reward-based evolutionary swarm learning using mechanisms of epigenetic inheritance; and (2) to identify an efficient learning method for the epigenetic layer achieving a decision-making strategy in a dynamic environment.
This research's contributions are the development of a reward-based co-learning algorithm and co-evolution using an epigenetic-based knowledge backup. The reward-based co-learning algorithm enables the swarm to obtain knowledge of the dynamic environment and override the objective-based function to evaluate internal and external problems. An advantage of this is that the learning mechanism also enables the swarm to explore potentially better behaviour without the constraint of an ill-defined objective function. Simulated search-and-rescue missions using a swarm of UAVs show that individual behaviour evolves differently even though each member has the same physical characteristics and the same set of actions. In addition to reward-based multi-agent learning mechanisms, epigenetics is introduced as a decision-making layer. The epigenetic layer has two functions: genetic regulation and epigenetic inheritance. The first regulates how genetic information is expressed as an agent's behaviour (the ``phenotype''). Utilising this regulatory function, the agent is able to switch its genetic strategy or decision-making based on external stimuli from the aforementioned reward-based learning. The second, epigenetic inheritance, enables the sharing of the genetic regulation and decision-making layer between agents.
In summary, this research extends the current literature on evolutionary swarm robotics and decentralised multi-agent learning mechanisms. The combination of the two advances decentralised mechanisms for obtaining information and improving collective behaviour.
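The two functions of the epigenetic layer described above, runtime regulation of gene expression by reward stimuli and inheritance of the regulation layer between agents, can be illustrated with a toy sketch. The strategy names, update factors, and class structure are invented for illustration and are not the thesis's implementation.

```python
import random

random.seed(42)

STRATEGIES = ["explore", "exploit", "regroup"]

class Agent:
    """Toy agent with a fixed 'genotype' of strategy weights and an
    epigenetic layer of regulators that modulate their expression."""

    def __init__(self):
        # Genotype: one weight per strategy, fixed at 'birth'.
        self.genotype = {s: random.random() for s in STRATEGIES}
        # Epigenetic layer: multiplicative regulators, mutable at runtime.
        self.regulators = {s: 1.0 for s in STRATEGIES}

    def phenotype(self):
        """Expressed behaviour = genotype modulated by the regulators."""
        expressed = {s: self.genotype[s] * self.regulators[s]
                     for s in STRATEGIES}
        return max(expressed, key=expressed.get)

    def stimulus(self, strategy, reward):
        """Reward-based regulation: up- or down-regulate a strategy
        without touching the underlying genes."""
        self.regulators[strategy] *= 1.1 if reward > 0 else 0.9

    def inherit_from(self, other):
        """Epigenetic inheritance: copy the regulation layer, not genes."""
        self.regulators = dict(other.regulators)

a, b = Agent(), Agent()
for _ in range(20):
    a.stimulus("regroup", reward=1.0)   # environment rewards regrouping
b.inherit_from(a)                        # share the learned regulation
```

The key property the sketch captures is that agents can converge on a shared decision-making layer while their genotypes, and hence their individual behaviours, remain distinct.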
- …