13 research outputs found

Apprenticeship Bootstrapping for Autonomous Aerial Shepherding of Ground Swarm

Author: Nguyen Hung
Publication venue: UNSW, Sydney
Publication date: 01/01/2023

Aerial shepherding of ground vehicles (ASGV) musters a group of uncrewed ground vehicles (UGVs) from the air using uncrewed aerial vehicles (UAVs). This inspiration enables robust uncrewed ground-air coordination where one or multiple UAVs effectively drive a group of UGVs towards a goal. Developing artificial intelligence (AI) agents for ASGV is a non-trivial task due to the sub-tasks, multiple skills, and their non-linear interaction required to synthesise a solution. One approach to developing AI agents is imitation learning (IL), where humans demonstrate the task to the machine. However, gathering human data from complex tasks in human-swarm interaction (HSI) requires the human to perform the entire job, which could lead to unexpected errors caused by a lack of control skills and to high human workload due to the length and complexity of ASGV. We hypothesise that we can bootstrap the overall task by collecting human data from simpler sub-tasks to limit errors and workload for humans. Therefore, this thesis attempts to answer the primary research question of how to design IL algorithms for multiple agents. We propose a new learning scheme called Apprenticeship Bootstrapping (AB). In AB, the low-level behaviours of the shepherding agents are trained from human data using our proposed hierarchical IL algorithms. The high-level behaviours are then formed using a proposed gesture demonstration framework to collect human data for synthesising more complex controllers. The transferring mechanism is performed by aggregating the proposed IL algorithms. Experiments are designed using a mixed environment, where the UAV flies in a simulated robotic Gazebo environment, while the UGVs are physical vehicles in a natural environment. A system is designed to allow switching between humans controlling the UAVs using low-level actions and humans controlling the UAVs using high-level actions. The former enables data collection for developing autonomous agents for sub-tasks. In the latter, humans control the UAV by issuing commands that call the autonomous agents for the sub-tasks. We baseline the learnt agents against Strömbom scripted behaviours and show that the system can successfully generate autonomous behaviours for ASGV.
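The abstract above uses Strömbom scripted behaviours as its baseline. As a rough illustration only, the sketch below follows the commonly described collect-or-drive heuristic from the shepherding literature: if every agent is close to the herd's centre of mass, the shepherd moves behind the herd relative to the goal; otherwise it moves behind the furthest agent. The function name `shepherd_target`, the parameter `r_a`, and the exact thresholds are assumptions for this sketch, not details taken from the thesis.

```python
import numpy as np


def shepherd_target(agents, goal, r_a=2.0):
    """Return (mode, point) for a simplified collect-or-drive shepherd.

    agents: (N, 2) array-like of ground agent positions.
    goal:   (2,) array-like goal position.
    r_a:    assumed agent-to-agent interaction distance (hypothetical value).
    """
    agents = np.asarray(agents, dtype=float)
    goal = np.asarray(goal, dtype=float)
    n = len(agents)

    # Centre of mass of the herd and each agent's distance from it.
    gcm = agents.mean(axis=0)
    dists = np.linalg.norm(agents - gcm, axis=1)

    if dists.max() <= r_a * n ** (2.0 / 3.0):
        # Herd is compact: stand behind it, on the side opposite the goal.
        # Assumes the herd centre does not coincide with the goal.
        direction = (gcm - goal) / np.linalg.norm(gcm - goal)
        return "drive", gcm + r_a * np.sqrt(n) * direction

    # Herd is scattered: stand behind the furthest agent to push it inwards.
    furthest = agents[dists.argmax()]
    direction = (furthest - gcm) / np.linalg.norm(furthest - gcm)
    return "collect", furthest + r_a * direction
```

A learnt agent could be compared against such a scripted rule by running both on the same initial herd layouts and measuring time to reach the goal.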

UNSWorks

Generative AI for Unmanned Vehicle Swarms: Challenges, Applications and Opportunities

Authors: Du Hongyang, Hoang Dinh Thai, Jamalipour Abbas, Kang Jiawen, Kim Dong In, Liu Guangyuan, Niyato Dusit, Van Huynh Nguyen, Xiong Zehui, Zhu Kun
Publication date: 28/02/2024

With recent advances in artificial intelligence (AI) and robotics, unmanned vehicle swarms have received great attention from both academia and industry due to their potential to provide services that are difficult and dangerous for humans to perform. However, learning and coordinating movements and actions for a large number of unmanned vehicles in complex and dynamic environments introduce significant challenges to conventional AI methods. Generative AI (GAI), with its capabilities in complex data feature extraction, transformation, and enhancement, offers great potential in solving these challenges of unmanned vehicle swarms. For that, this paper aims to provide a comprehensive survey on applications, challenges, and opportunities of GAI in unmanned vehicle swarms. Specifically, we first present an overview of unmanned vehicles and unmanned vehicle swarms as well as their use cases and existing issues. Then, an in-depth background of various GAI techniques together with their capabilities in enhancing unmanned vehicle swarms is provided. After that, we present a comprehensive review on the applications and challenges of GAI in unmanned vehicle swarms with various insights and discussions. Finally, we highlight open issues of GAI in unmanned vehicle swarms and discuss potential research directions.

arXiv.org e-Print Archive

Reinforcement learning-based autonomous robot navigation and tracking

Author: Lin Feiqiang

Autonomous navigation requires determining a collision-free path for a mobile robot using only partial observations of the environment. This capability is highly needed for a wide range of applications, such as search and rescue operations, surveillance, environmental monitoring, and domestic service robots. In many scenarios, an accurate global map is not available beforehand, posing significant challenges for a robot planning its path. This type of navigation is often referred to as Mapless Navigation, and such work is not limited to Unmanned Ground Vehicles (UGVs) but also applies to other vehicles, such as Unmanned Aerial Vehicles (UAVs). This research aims to develop Reinforcement Learning (RL)-based methods for autonomous navigation for mobile robots, as well as effective tracking strategies for a UAV to follow a moving target. Mapless navigation usually assumes accurate localisation, which is unrealistic. In the real world, localisation methods, such as simultaneous localisation and mapping (SLAM), are needed. However, the localisation performance could deteriorate depending on the environment and observation quality. Therefore, to avoid deteriorated localisation, this work introduces an RL-based navigation algorithm to enable mobile robots to navigate in unknown environments, while incorporating localisation performance in training the policy. Specifically, a localisation-related penalty is introduced in the reward space, ensuring localisation safety is taken into consideration during navigation. Different metrics are formulated to identify if the localisation performance starts to deteriorate in order to penalise the robot. As such, the navigation policy will not only optimise its paths in terms of travel distance and collision avoidance towards the goal but also avoid venturing into areas that pose challenges for localisation algorithms. The localisation-safe algorithm is further extended to UAV navigation, which uses image-based observations. Instead of deploying an end-to-end control pipeline, this work establishes a hierarchical control framework that leverages both the capabilities of neural networks for perception and the stability and safety guarantees of conventional controllers. The high-level controller in this hierarchical framework is a neural network policy with semantic image inputs, trained using RL algorithms with localisation-related rewards. The efficacy of the trained policy is demonstrated in real-world experiments for localisation-safe navigation, and, notably, it exhibits effectiveness without the need for retraining, thanks to the hierarchical control scheme and semantic inputs. Last, a tracking policy is introduced to enable a UAV to track a moving target. This study designs a reward space, enabling a vision-based UAV, which utilises depth images for perception, to follow a target within a safe and visible range. The objective is to maintain the mobile target at the centre of the drone camera’s image without being occluded by other objects and to avoid collisions with obstacles. It is observed that training such a policy from scratch may lead to local minima. To address this, a state-based teacher policy is trained to perform the tracking task, with environmental perception relying on direct access to state information, including position coordinates of obstacles, instead of depth images. An RL algorithm is then constructed to train the vision-based policy, incorporating behavioural guidance from the state-based teacher policy. This approach yields promising tracking performance.
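The localisation-related penalty described in this abstract can be pictured with a small reward function. The sketch below is a minimal illustration under stated assumptions, not the reward used in the thesis: the `StepInfo` fields, the scalar `loc_uncertainty` metric, and every weight and threshold are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    prev_goal_dist: float   # distance to goal before the step (m)
    goal_dist: float        # distance to goal after the step (m)
    collided: bool
    reached_goal: bool
    loc_uncertainty: float  # hypothetical scalar localisation-quality metric


def navigation_reward(info, loc_threshold=0.5, w_progress=1.0, w_loc=0.2,
                      r_collision=-10.0, r_goal=10.0):
    """Progress reward minus a penalty once localisation starts to degrade."""
    # Terminal outcomes dominate everything else.
    if info.collided:
        return r_collision
    if info.reached_goal:
        return r_goal

    # Dense shaping term: reward the reduction in distance to the goal.
    reward = w_progress * (info.prev_goal_dist - info.goal_dist)

    # Localisation penalty: only applied above the assumed threshold.
    excess = max(0.0, info.loc_uncertainty - loc_threshold)
    return reward - w_loc * excess
```

With these assumed weights, a step that gains 0.5 m of progress but pushes the uncertainty metric 1.0 above the threshold scores lower than the same step taken through a well-localised area, which is the trade-off the abstract describes.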

Online Research @ Cardiff

Guidance, navigation and control of multirotors

Author: Rubí Perelló Bertomeu
Publication venue: Universitat Politècnica de Catalunya
Publication date: 11/12/2020

Embargo applied from the defence date until 31 December 2021.

This thesis presents contributions to the Guidance, Navigation and Control (GNC) systems for multirotor vehicles by applying and developing diverse control techniques and machine learning theory with innovative results. The aim of the thesis is to obtain a GNC system able to make the vehicle follow predefined paths while avoiding obstacles in the vehicle's route. The system must be adaptable to different paths, situations and missions, reducing the tuning effort and parametrisation of the proposed approaches. The multirotor platform, formed by the Asctec Hummingbird quadrotor vehicle, is studied and described in detail. A complete mathematical model is obtained and a freely available and open simulation platform is built. Furthermore, an autopilot controller is designed and implemented in the real platform. The control part is focused on the path following problem, that is, following a predefined path in space without any time constraint. Diverse control-oriented and geometrical algorithms are studied, implemented and compared. Then, the geometrical algorithms are improved by obtaining adaptive approaches that do not need any parameter tuning. The adaptive geometrical approaches are developed by means of neural networks. Finally, a deep reinforcement learning approach is developed to solve the path following problem. This approach implements the Deep Deterministic Policy Gradient algorithm. The resulting approach is trained in a realistic multirotor simulator and tested in real experiments with success. The proposed approach is able to accurately follow a path while adapting the vehicle's velocity depending on the path's shape. In the navigation part, an obstacle detection system based on the use of a LIDAR sensor is implemented. A model of the sensor is derived and included in the simulator. Moreover, an approach for treating the sensor data to eliminate the possible ground detections is developed. The guidance part is focused on the reactive path planning problem, that is, a path planning algorithm that is able to re-plan the trajectory online if an unexpected event, such as detecting an obstacle in the vehicle's route, occurs. A deep reinforcement learning approach for the reactive obstacle avoidance problem is developed. This approach implements the Deep Deterministic Policy Gradient algorithm. The developed deep reinforcement learning agent is trained and tested in the realistic simulation platform. This agent is combined with the path following agent and the rest of the elements developed in the thesis, obtaining a GNC system that is able to follow different types of paths while avoiding obstacles in the vehicle's route.

Postprint (published version)
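The abstract above mentions treating LIDAR data to eliminate possible ground detections but does not say how. One plausible geometric approach, sketched here purely as an assumption, is to rotate each planar beam by the vehicle's roll and pitch, compute where a downward-pointing beam would meet flat ground, and discard readings at or beyond that range. The function names, the z-up frame with positive pitch meaning nose down, the flat-ground assumption, and the `margin` value are all hypothetical.

```python
import numpy as np


def body_to_world(roll, pitch):
    """Rotation from body frame to a z-up world frame (yaw ignored)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    rot_y = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    return rot_y @ rot_x


def non_ground_mask(ranges, angles, altitude, roll, pitch, margin=0.2):
    """Return True for readings that are closer than the expected ground hit.

    ranges:   measured distances per beam (m).
    angles:   beam angles in the body x-y plane (rad).
    altitude: sensor height above flat ground (m).
    """
    ranges = np.asarray(ranges, dtype=float)
    angles = np.asarray(angles, dtype=float)

    # Unit beam directions in the body frame (planar scanner), then in world.
    beams_body = np.stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)])
    dz = (body_to_world(roll, pitch) @ beams_body)[2]

    # Beams pointing downwards reach flat ground at altitude / |dz|.
    ground_range = np.full_like(ranges, np.inf)
    down = dz < -1e-9
    ground_range[down] = altitude / -dz[down]

    return ranges < ground_range - margin
```

Readings where the mask is False would be dropped before the scan is passed to an obstacle avoidance agent; uneven terrain would need a terrain model instead of the flat-ground assumption.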

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Multi-Robot Systems: Challenges, Trends and Applications

Publication venue: MDPI AG
Publication date: 06/05/2022

This book is a printed edition of the Special Issue entitled “Multi-Robot Systems: Challenges, Trends, and Applications” that was published in Applied Sciences. This Special Issue collected seventeen high-quality papers that discuss the main challenges of multi-robot systems, present the trends to address these issues, and report various relevant applications. Some of the topics addressed by these papers are robot swarms, mission planning, robot teaming, machine learning, immersive technologies, search and rescue, and social robotics.

Directory of Open Access Books (DOAB)

Advances in Automated Driving Systems

Publication venue: MDPI AG
Publication date: 06/07/2022

Electrification, automation of vehicle control, digitalization and new mobility are the mega-trends in automotive engineering, and they are strongly connected. While many demonstrations for highly automated vehicles have been made worldwide, many challenges remain in bringing automated vehicles to the market for private and commercial use. The main challenges are as follows: reliable machine perception; accepted standards for vehicle-type approval and homologation; verification and validation of the functional safety, especially at SAE level 3+ systems; legal and ethical implications; acceptance of vehicle automation by occupants and society; interaction between automated and human-controlled vehicles in mixed traffic; human–machine interaction and usability; manipulation, misuse and cyber-security; the system costs of hard- and software and development efforts. This Special Issue was prepared in the years 2021 and 2022 and includes 15 papers with original research related to recent advances in the aforementioned challenges. The topics of this Special Issue cover: Machine perception for SAE L3+ driving automation; Trajectory planning and decision-making in complex traffic situations; X-by-Wire system components; Verification and validation of SAE L3+ systems; Misuse, manipulation and cybersecurity; Human–machine interactions, driver monitoring and driver-intention recognition; Road infrastructure measures for the introduction of SAE L3+ systems; Solutions for interactions between human- and machine-controlled vehicles in mixed traffic.

Directory of Open Access Books (DOAB)

Robotics, AI, and Humanity

Publication venue: Springer Science and Business Media LLC

This open access book examines recent advances in how artificial intelligence (AI) and robotics have elicited widespread debate over their benefits and drawbacks for humanity. The emergent technologies have, for instance, implications within medicine and health care, employment, transport, manufacturing, agriculture, and armed conflict. While there has been considerable attention devoted to robotics/AI applications in each of these domains, a fuller picture of their connections and the possible consequences for our shared humanity seems needed. This volume covers multidisciplinary research, examines current research frontiers in AI/robotics and likely impacts on societal well-being, human–robot relationships, as well as the opportunities and risks for sustainable development and peace. The attendant ethical and religious dimensions of these technologies are addressed and implications for regulatory policies on the use and future development of AI/robotics technologies are elaborated.

OAPEN Library

Foundations of Trusted Autonomy

Publication venue: Springer Science and Business Media LLC

Trusted Autonomy; Automation Technology; Autonomous Systems; Self-Governance; Trusted Autonomous Systems; Design of Algorithms and Methodologies

OAPEN Library

Swarm Robotics

Author: Spezzano Giandomenico
Publication venue: MDPI AG
Publication date: 01/01/2019

Collectively working robot teams can solve a problem more efficiently than a single robot, while also providing robustness and flexibility to the group. The swarm robotics model is a key component of a cooperative algorithm that controls the behaviors and interactions of all individuals. The robots in the swarm should have some basic functions, such as sensing, communicating, and monitoring, and satisfy the following properties

Directory of Open Access Books (DOAB)