
    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation control can be achieved by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., the BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking controller for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines a global flocking-maintenance term, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
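The composite reward described above can be sketched as follows. This is a minimal illustration of the three-term structure (flocking maintenance, mutual reward, collision penalty); the weights, distance thresholds, and 2D state layout are illustrative assumptions, not the thesis's actual values.

```python
import numpy as np

def flocking_reward(positions, velocities, leader_pos,
                    d_min=0.5, w_flock=1.0, w_mutual=0.5, w_coll=10.0):
    """Composite flocking reward for one timestep (illustrative).

    positions, velocities: (N, 2) arrays for N UAVs.
    leader_pos: (2,) leader position.
    Returns a scalar reward shared by all agents.
    """
    n = len(positions)
    # Global flocking maintenance: keep the swarm centroid near the leader.
    centroid = positions.mean(axis=0)
    r_flock = -w_flock * np.linalg.norm(centroid - leader_pos)

    # Mutual reward: encourage velocity alignment across the swarm.
    v_mean = velocities.mean(axis=0)
    r_mutual = -w_mutual * np.linalg.norm(velocities - v_mean, axis=1).mean()

    # Collision penalty: punish every pair closer than d_min.
    r_coll = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(positions[i] - positions[j]) < d_min:
                r_coll -= w_coll
    return r_flock + r_mutual + r_coll
```

A shared scalar reward of this shape is what makes centralized training with a single critic straightforward, while each UAV still acts on local observations at execution time.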

    A review of flocking organization strategies for robot swarms

    Robotics promises great benefits for human beings, both in industry and in personal services. This has led to continuous development and research on different problems, including control, manipulation, human-machine interaction, and, of course, autonomous navigation. Robot swarm systems promise an alternative solution to the classic high-performance platforms, particularly in applications that require task distribution. Among these systems, flocking navigation schemes are currently attracting considerable attention. To establish a frame of reference, a general review of the literature to date on flocking behavior is presented, focusing on optimized schemes with some guarantee of safety. In most of the cases presented, the characteristics of these systems, such as minimal computational and communication requirements and event-driven planning, are maintained.

    Autonomous Unmanned Aerial Vehicle Navigation using Reinforcement Learning: A Systematic Review

    There is an increasing demand for Unmanned Aerial Vehicles (UAVs), commonly known as drones, in applications such as package delivery, traffic monitoring, search and rescue operations, and military combat engagements. In all of these applications, the UAV is used to navigate the environment autonomously --- without human interaction --- perform specific tasks, and avoid obstacles. Autonomous UAV navigation is commonly accomplished using Reinforcement Learning (RL), where agents act as experts in a domain to navigate the environment while avoiding obstacles. Understanding the navigation environment and algorithmic limitations plays an essential role in choosing the appropriate RL algorithm to solve the navigation problem effectively. Consequently, this study first identifies the main UAV navigation tasks and discusses navigation frameworks and simulation software. Next, RL algorithms are classified and discussed based on the environment, algorithm characteristics, abilities, and applications in different UAV navigation problems, which will help practitioners and researchers select the appropriate RL algorithms for their UAV navigation use cases. Moreover, identified gaps and opportunities will drive UAV navigation research.
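The core loop the review classifies --- an agent learning to reach a goal while avoiding obstacles --- can be sketched with tabular Q-learning on a toy grid. This is a generic illustration of RL-based navigation, not any specific framework from the review; the grid, rewards, and hyperparameters are all assumptions for the sketch.

```python
import numpy as np

def train_q_navigation(grid, start, goal, episodes=500,
                       alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning on a small grid: 0 = free cell, 1 = obstacle."""
    rng = np.random.default_rng(seed)
    rows, cols = grid.shape
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    q = np.zeros((rows, cols, 4))

    for _ in range(episodes):
        r, c = start
        for _ in range(200):  # step cap per episode
            a = rng.integers(4) if rng.random() < eps else int(np.argmax(q[r, c]))
            nr, nc = r + moves[a][0], c + moves[a][1]
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr, nc] == 1:
                nr, nc, reward = r, c, -5.0   # hitting a wall/obstacle: stay, penalty
            elif (nr, nc) == goal:
                reward = 10.0
            else:
                reward = -1.0                 # step cost encourages short paths
            q[r, c, a] += alpha * (reward + gamma * q[nr, nc].max() - q[r, c, a])
            r, c = nr, nc
            if (r, c) == goal:
                break
    return q
```

After training, following the greedy action at each cell yields an obstacle-avoiding path to the goal; deep-RL navigation methods replace the table with a function approximator over richer observations.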

    Study of multi-agent systems with reinforcement learning

    Collective behavior in biological systems is one of the most fascinating phenomena observed in nature. Many conspecifics form a large group together and behave collectively in a highly synchronized fashion. Flocks of birds, schools of fish, swarms of insects, and bacterial colonies are some examples of such systems. In recent years, researchers have studied collective behavior to address challenging questions such as how animals synchronize their motion, how they interact with each other, how much information about their surroundings they share, and whether there are general laws that govern collective behavior in animal groups. Many models have been proposed to address these questions, but most of them remain open. In this thesis, we take a brief overview of models proposed from statistical physics to explain the collective behavior observed in animals. We advocate for understanding the collective behavior of animal groups by studying the decision-making process of individual animals within the group. In the first part of this thesis, we investigate the optimal decision-making process of individuals by implementing reinforcement learning techniques. By encouraging congregation of the agents, we observe that the agents learn to form a highly polar ordered state, i.e., they all move in the same direction as one unit. Such an ordered state is observed and quantified in real flocks of birds. The optimal strategy that these agents discover is equivalent to the well-known Vicsek model from statistical physics. In the second part, we address the problem of collective search in a turbulent environment using olfactory cues. The agents, far away from the odor source, are tasked with locating the odor source by sensing local cues such as the local velocity of the flow and the odor plume. By optimally combining the private information that the agent has (such as local wind and the presence or absence of odors)
with public information regarding the navigation decisions made by the other agents in the system, a group of agents completes the given search task more efficiently than single individuals could.
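The Vicsek model that the learned strategy recovers is simple to state: each agent adopts the mean heading of its neighbors within a fixed radius, plus angular noise, and the degree of order is measured by a polarization parameter. A minimal sketch, with periodic boundaries and illustrative parameter values:

```python
import numpy as np

def vicsek_step(pos, theta, speed=0.05, radius=1.0, noise=0.0, box=5.0, rng=None):
    """One Vicsek update: each agent turns toward the mean heading of its
    neighbors (within `radius`), perturbed by angular noise, then moves."""
    rng = rng or np.random.default_rng(0)
    n = len(pos)
    # Pairwise displacement with periodic (minimum-image) boundary conditions.
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)
    neigh = np.linalg.norm(d, axis=-1) < radius
    # Mean heading = argument of the summed unit heading vectors of neighbors.
    sin_m = (neigh * np.sin(theta)[None, :]).sum(axis=1)
    cos_m = (neigh * np.cos(theta)[None, :]).sum(axis=1)
    theta = np.arctan2(sin_m, cos_m) + noise * rng.uniform(-np.pi, np.pi, n)
    pos = (pos + speed * np.stack([np.cos(theta), np.sin(theta)], axis=1)) % box
    return pos, theta

def polarization(theta):
    """Polar order parameter: 1 = perfectly aligned, near 0 = disordered."""
    return np.hypot(np.sin(theta).mean(), np.cos(theta).mean())
```

At low noise the polarization climbs toward 1 (the ordered flocking state quantified in real bird flocks); at high noise it stays near 0, which is the order-disorder transition the statistical-physics literature studies.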

    Development and Validation of a LQR-Based Quadcopter Control Dynamics Simulation Model

    The growing applications involving unmanned aerial vehicles (UAVs) are requiring more advanced control algorithms to improve rotary-wing UAVs’ performance. To preliminarily tune such advanced controllers, an experimental approach could take a long time and also be dangerous for the vehicle and the onboard hardware components. In this paper, a simulation model of a quadcopter is developed and validated by the comparison of simulation results and experimental data collected during flight tests. For this purpose, an open-source flight controller for quadcopter UAVs is developed and a linear quadratic regulator (LQR) controller is implemented as the control strategy. The input physical quantities are experimentally measured; hence, the LQR controller parameters are tuned on the simulation model. The same tuning is proposed on the developed flight controller with satisfactory results. Finally, flight data and simulation results are compared showing a reliable approximation of the experimental data by the model. Because numerous state-of-the-art simulation models are available, but accurately validated ones are not easy to find, the main purpose of this work is to provide a reliable tool to evaluate the performance for this UAV configuration. DOI: 10.1061/(ASCE)AS.1943-5525.0001336. © 2021 American Society of Civil Engineers. Authors: Alessandro Minervini; Simone Godio; Giorgio Guglieri; Fabio Dovis; Alfredo Bici.
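The LQR design step can be sketched for one translational axis of a quadcopter modeled as a double integrator. The model, sample time, and Q/R weights below are illustrative assumptions for the sketch, not the paper's identified parameters; the gain is computed by fixed-point iteration of the discrete Riccati equation.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=500):
    """Discrete-time LQR gain K via fixed-point iteration of the Riccati equation:
    K = (R + B'PB)^{-1} B'PA,  P <- Q + A'P(A - BK)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

# Double-integrator model of one axis (state: [position, velocity],
# input: acceleration command), with an assumed 10 ms sample time.
dt = 0.01
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
K = dlqr(A, B, Q=np.diag([10.0, 1.0]), R=np.array([[0.1]]))

# Closed-loop dynamics x+ = (A - B K) x; stability requires |eig| < 1.
eigs = np.linalg.eigvals(A - B @ K)
```

Tuning the controller then amounts to adjusting the Q and R weights on the validated simulation model before deploying the resulting gains on the flight controller.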

    Advancing Expert Human-Computer Interaction Through Music

    One of the most important challenges for computing over the next decade is discovering ways to augment and extend human control over ever more powerful, complex, and numerous devices and software systems. New high-dimensional input devices and control systems provide these affordances, but require extensive practice and learning on the part of the user. This paper describes a system created to leverage existing human expertise with a complex, high-dimensional interface, in the form of a trained violinist and violin. A machine listening model is employed to provide the musician and user with direct control over a complex simulation running on a high-performance computing system.

    Emergence and resilience in multi-agent reinforcement learning

    Our world represents an enormous multi-agent system (MAS), consisting of a plethora of agents that make decisions under uncertainty to achieve certain goals. The interaction of agents constantly affects our world in various ways, leading to the emergence of interesting phenomena like life forms and civilizations that can last for many years while withstanding various kinds of disturbances. Building artificial MAS that are able to adapt and survive similarly to natural MAS is a major goal in artificial intelligence, as a wide range of potential real-world applications like autonomous driving, multi-robot warehouses, and cyber-physical production systems can be straightforwardly modeled as MAS. Multi-agent reinforcement learning (MARL) is a promising approach to build such systems which has achieved remarkable progress in recent years. However, state-of-the-art MARL commonly assumes very idealized conditions to optimize performance in best-case scenarios while neglecting further aspects that are relevant to the real world. In this thesis, we address emergence and resilience in MARL, which are important aspects of building artificial MAS that adapt and survive as effectively as natural MAS do. We first focus on emergent cooperation from local interaction of self-interested agents and introduce a peer incentivization approach based on mutual acknowledgments. We then propose to exploit emergent phenomena to further improve coordination in large cooperative MAS via decentralized planning or hierarchical value function factorization. To maintain multi-agent coordination in the presence of partial changes, similar to classic distributed systems, we present adversarial methods to improve and evaluate resilience in MARL.
Finally, we briefly cover a selection of further topics that are relevant to advancing MARL towards real-world applicability.
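The idea of peer incentivization --- self-interested learners granting small acknowledgment rewards to peers whose actions benefited them --- can be illustrated in an iterated prisoner's dilemma with two independent Q-learners. This is a deliberately simplified sketch of the general mechanism, not the thesis's actual method; the payoffs, bonus size, and stateless learners are all assumptions.

```python
import numpy as np

def train_pd(bonus=0.0, episodes=3000, alpha=0.1, eps=0.1, seed=0):
    """Two independent, self-interested Q-learners in an iterated
    prisoner's dilemma. An agent that cooperated (i.e. whose action
    benefited its peer) receives an acknowledgment `bonus` from that peer.
    Stateless Q-values over actions {0: defect, 1: cooperate}."""
    payoff = {(1, 1): (3, 3), (1, 0): (0, 5), (0, 1): (5, 0), (0, 0): (1, 1)}
    rng = np.random.default_rng(seed)
    q = np.zeros((2, 2))  # q[agent, action]
    for _ in range(episodes):
        acts = [rng.integers(2) if rng.random() < eps else int(np.argmax(q[i]))
                for i in range(2)]
        r = list(payoff[(acts[0], acts[1])])
        for i in range(2):
            if acts[i] == 1:   # agent i cooperated and is acknowledged by its peer
                r[i] += bonus
        for i in range(2):     # epsilon-greedy bandit update
            q[i, acts[i]] += alpha * (r[i] - q[i, acts[i]])
    return q
```

Without the acknowledgment bonus, defection dominates and both learners converge to mutual defection; with a sufficiently large bonus (here 3, enough to make cooperation dominant: 3+3 > 5 and 0+3 > 1), cooperation emerges from purely local, self-interested updates.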