975 research outputs found

Adaptive and learning-based formation control of swarm robots

Author: Salimi Mahsoo
Publication date: 14/10/2021

Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation can be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines a global flocking maintenance term, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
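The abstract names three reward components (flocking maintenance, mutual reward, collision penalty) without giving their form. The following is a minimal sketch of how such a reward could be composed, not the thesis's actual formulation. The function name `flocking_reward`, the distance-to-leader maintenance term, and all parameters (`desired_dist`, `safe_dist`, `collision_penalty`, `mutual_weight`) are assumptions made for illustration.

```python
import math


def flocking_reward(positions, leader_idx, desired_dist=1.0, safe_dist=0.3,
                    collision_penalty=10.0, mutual_weight=0.5):
    """Return one reward per UAV (hypothetical shaping, not the thesis's)."""
    leader = positions[leader_idx]
    individual = []
    for i, p in enumerate(positions):
        # Flocking maintenance (assumed form): penalize deviation from a
        # desired distance to the leader; the leader itself gets no penalty.
        if i == leader_idx:
            r = 0.0
        else:
            r = -abs(math.dist(p, leader) - desired_dist)
        # Collision penalty: fixed cost for each other UAV closer than safe_dist.
        for j, q in enumerate(positions):
            if j != i and math.dist(p, q) < safe_dist:
                r -= collision_penalty
        individual.append(r)
    # Mutual reward (assumed form): blend each UAV's own reward with the
    # swarm average so that every agent shares in the group outcome.
    mean_r = sum(individual) / len(individual)
    return [(1 - mutual_weight) * r + mutual_weight * mean_r
            for r in individual]


if __name__ == "__main__":
    # Leader at the origin, two followers; the followers are too close together.
    print(flocking_reward([(0.0, 0.0), (1.0, 0.0), (1.1, 0.0)], leader_idx=0))
```

In a centralized-training, decentralized-execution setup, a function like this would be evaluated on the global state during training only; at execution time each UAV acts on its local observation.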

Simon Fraser University Institutional Repository

A Symbiotic Brain-Machine Interface through Value-Based Decision Making

Author: A Rangel
AM Graybiel
B Mahmoudi
Babak Mahmoudi
DM Taylor
DV Buonomano
EM Izhikevich
G Montagne
G Paxinos
GH Bower
GJ Mogenson
HJ Groenewegen
HK Kim
J DiGiovanna
J Wessberg
JA Kleim
JC Principe
JC Sanchez
JM Carmena
JM Fuster
Josh Bongard
JP Donoghue
Justin C. Sanchez
K Doya
K Samejima
KV Shenoy
LP Kaelbling
LR Hochberg
M Velliste
MA Lebedev
MAL Nicolelis
MF Roitman
MS Lewicki
MX Cohen
P Dayan
PD Nixon
RA Andersen
RM Costa
RS Sutton
S Grossberg
S Haykin
SH Johnson-Frey
ST Parker
V Ferretti
W Schultz
W Struthers
WA Carlezon Jr
ZM Williams
Publication venue: Public Library of Science
Publication date: 01/01/2011

BACKGROUND: In the development of Brain Machine Interfaces (BMIs), there is a great need to enable users to interact with changing environments during the activities of daily life. It is expected that the number and scope of the learning tasks encountered during interaction with the environment as well as the pattern of brain activity will vary over time. These conditions, in addition to neural reorganization, pose a challenge to decoding neural commands for BMIs. We have developed a new BMI framework in which a computational agent symbiotically decoded users' intended actions by utilizing both motor commands and goal information directly from the brain through a continuous Perception-Action-Reward Cycle (PARC). METHODOLOGY: The control architecture designed was based on Actor-Critic learning, which is a PARC-based reinforcement learning method. Our neurophysiology studies in rat models suggested that Nucleus Accumbens (NAcc) contained a rich representation of goal information in terms of predicting the probability of earning reward and it could be translated into an evaluative feedback for adaptation of the decoder with high precision. Simulated neural control experiments showed that the system was able to maintain high performance in decoding neural motor commands during novel tasks or in the presence of reorganization in the neural input. We then implanted a dual micro-wire array in the primary motor cortex (M1) and the NAcc of rat brain and implemented a full closed-loop system in which robot actions were decoded from the single unit activity in M1 based on an evaluative feedback that was estimated from NAcc. CONCLUSIONS: Our results suggest that adapting the BMI decoder with an evaluative feedback that is directly extracted from the brain is a possible solution to the problem of operating BMIs in changing environments with dynamic neural signals. 
During closed-loop control, the agent was able to solve a reaching task by capturing the action and reward interdependency in the brain.
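To make the Actor-Critic idea concrete, here is a minimal tabular sketch of one actor-critic update on a toy one-dimensional reaching task. It is not the authors' decoder: in the paper the evaluative feedback is estimated from NAcc activity and the actor decodes M1 activity, whereas here the reward comes from a simulated environment and the states are plain integers. All names (`actor_critic_step`, `train_chain`) and hyperparameters are assumptions for illustration.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(values, prefs, state, action, reward, next_state,
                      done, alpha_v=0.1, alpha_pi=0.1, gamma=0.9):
    """Apply one TD(0) actor-critic update in place and return the TD error."""
    target = reward + (0.0 if done else gamma * values[next_state])
    td_error = target - values[state]
    # Critic: move the state value toward the bootstrapped target.
    values[state] += alpha_v * td_error
    # Actor: softmax policy-gradient step, scaled by the critic's TD error.
    probs = softmax(prefs[state])
    for a in range(len(probs)):
        grad = (1.0 if a == action else 0.0) - probs[a]
        prefs[state][a] += alpha_pi * td_error * grad
    return td_error


def train_chain(n_states=5, episodes=300, max_steps=50, seed=0):
    """Toy reaching task: start at state 0, reward 1 on reaching the last state."""
    rng = random.Random(seed)
    values = [0.0] * n_states
    prefs = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.choices([0, 1], weights=softmax(prefs[s]))[0]
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == n_states - 1
            actor_critic_step(values, prefs, s, a, 1.0 if done else 0.0, s2, done)
            s = s2
            if done:
                break
    return values, prefs


if __name__ == "__main__":
    values, prefs = train_chain()
    print([round(v, 2) for v in values])
```

The point of the sketch is the coupling: the critic's TD error is the only learning signal the actor receives, which mirrors the role the abstract assigns to evaluative feedback in adapting the decoder.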

Public Library of Science (PLOS)

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

University of Miami: Scholarship Miami

Deep Reinforcement Learning with Consensus for Manipulators

Author: Liu Wenxing
Publication date: 01/08/2023

The University of Manchester - Institutional Repository

Indoor Point-to-Point Navigation with Deep Reinforcement Learning and Ultra-wideband

Author: Chiaberge Marcello
Fantin Giovanni
Mazzia Vittorio
Salvetti Francesco
Sutera Enrico
Publication venue: Scitepress
Publication date: 18/11/2020

Indoor autonomous navigation requires a precise and accurate localization system able to guide robots through cluttered, unstructured and dynamic environments. Ultra-wideband (UWB) technology, as an indoor positioning system, offers precise localization and tracking, but moving obstacles and non-line-of-sight occurrences can generate noisy and unreliable signals. That, combined with sensor noise, unmodeled dynamics and environment changes, can result in a failure of the robot's guidance algorithm. We demonstrate how a power-efficient point-to-point local planner with low computational cost, learnt with deep reinforcement learning (RL) and combined with UWB localization technology, can constitute a complete short-range guidance system that is robust and resilient to noise. We trained the RL agent in a simulated environment that encapsulates the robot dynamics and task constraints, and then tested the learnt point-to-point navigation policies in a real setting with more than two hundred experimental evaluations using UWB localization. Our results show that the computationally efficient end-to-end policy learnt in plain simulation, which directly maps low-range sensor signals to robot controls, deployed in combination with noisy ultra-wideband localization in a real environment, can provide a robust, scalable, low-cost navigation system at the edge.
Comment: Accepted by ICAART 2021 - http://www.icaart.org
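As an illustration of what the input to such a point-to-point policy might look like, the sketch below builds an observation vector from range readings and a goal expressed in the robot's frame. The abstract does not specify the observation layout, so the function names (`goal_in_robot_frame`, `build_observation`), the normalization, and the `max_range` value are all assumptions. In the paper's setting, `robot_xy` would come from the noisy UWB position estimate.

```python
import math


def goal_in_robot_frame(robot_xy, robot_yaw, goal_xy):
    """Return (distance, bearing) of the goal relative to the robot's heading."""
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx) - robot_yaw
    # Wrap the bearing to [-pi, pi] so the policy sees a continuous angle.
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))
    return distance, bearing


def build_observation(ranges, robot_xy, robot_yaw, goal_xy, max_range=4.0):
    """Concatenate normalized range readings with the goal distance and bearing."""
    distance, bearing = goal_in_robot_frame(robot_xy, robot_yaw, goal_xy)
    # Clip each range reading to [0, max_range] and scale it to [0, 1].
    clipped = [min(max(r, 0.0), max_range) / max_range for r in ranges]
    return clipped + [distance, bearing]


if __name__ == "__main__":
    obs = build_observation([5.0, 2.0], (0.0, 0.0), 0.0, (3.0, 4.0))
    print(obs)
```

Expressing the goal in polar form relative to the robot keeps the observation independent of the global frame, which is one common way to let a policy trained in simulation transfer to a different room layout.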

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)