2,284 research outputs found
Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning
Low-level control of autonomous underwater vehicles (AUVs) has been extensively addressed by classical control techniques. However, the variable operating conditions and hostile environments faced by AUVs have driven researchers towards adaptive control approaches. Reinforcement learning (RL) is a powerful framework that has been applied in different formulations of adaptive control strategies for AUVs, but the limitations of classical RL approaches have led to the emergence of deep reinforcement learning, which has become an attractive and promising framework for developing adaptive control strategies for complex control problems in autonomous systems. Most existing applications of deep RL, however, train the decision-making agent on video images; obtaining camera images solely for AUV control can be costly in terms of energy consumption, and rewards are not easily derived directly from video frames. In this work we develop a deep RL framework for adaptive control of AUVs based on an actor-critic, goal-oriented deep RL architecture that takes the available raw sensory information as input and outputs continuous control actions, which are the low-level commands for the AUV's thrusters. Experiments on a real AUV demonstrate the applicability of the proposed deep RL approach to an autonomous robot control problem.
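As a rough illustration of the actor-critic architecture described in this abstract (raw sensory input in, continuous thruster commands out), the PyTorch sketch below shows one possible structure; the state dimension, number of thrusters, and layer sizes are assumptions for illustration, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Maps a raw sensory state vector to continuous thruster commands in [-1, 1]."""
    def __init__(self, state_dim=12, n_thrusters=6, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_thrusters), nn.Tanh(),  # bounded low-level commands
        )

    def forward(self, state):
        return self.net(state)

class Critic(nn.Module):
    """Scores a (state, action) pair with an estimated Q-value."""
    def __init__(self, state_dim=12, n_thrusters=6, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_thrusters, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))
```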
Deep Reinforcement Learning Controller for 3D Path-following and Collision Avoidance by Autonomous Underwater Vehicles
Control theory provides engineers with a multitude of tools to design
controllers that manipulate the closed-loop behavior and stability of dynamical
systems. These methods rely heavily on insights about the mathematical model
governing the physical system. However, in complex systems, such as autonomous
underwater vehicles performing the dual objective of path-following and
collision avoidance, decision making becomes non-trivial. We propose a solution
using state-of-the-art Deep Reinforcement Learning (DRL) techniques to develop
autonomous agents capable of achieving this hybrid objective without a priori
knowledge of the goal or the environment. Our results demonstrate the viability
of DRL for path-following and collision avoidance, a step toward human-level
decision making in autonomous vehicle systems within extreme obstacle
configurations.
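A minimal sketch of how such a hybrid path-following/collision-avoidance objective could be expressed as a scalar reward; the weights, distance threshold, and penalty values below are illustrative assumptions, not the reward shaping used in the paper.

```python
def hybrid_reward(cross_track_error, obstacle_distances,
                  w_path=1.0, w_obstacle=1.0, collision_radius=1.0):
    """Illustrative reward trading off path-following against collision avoidance."""
    # Path-following term: best (zero) when the vehicle sits exactly on the path.
    r_path = -w_path * abs(cross_track_error)
    # Obstacle term: penalise proximity to the nearest detected obstacle.
    d_min = max(min(obstacle_distances), 1e-6)
    r_obstacle = -w_obstacle / d_min
    # Large additional penalty if the vehicle enters the collision radius.
    if d_min < collision_radius:
        r_obstacle -= 100.0
    return r_path + r_obstacle
```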
Intelligent Navigation for a Solar Powered Unmanned Underwater Vehicle
In this paper, an intelligent navigation system for
an unmanned underwater vehicle powered by renewable
energy and designed for shallow-water inspection in
long-duration missions is proposed. The system is
composed of an underwater vehicle, which tows a surface
vehicle. The surface vehicle is a small boat with
photovoltaic panels, a methanol fuel cell and
communication equipment, which provides energy and
communication to the underwater vehicle. The underwater
vehicle carries sensors to monitor the underwater
environment, such as a sidescan sonar and a video camera
in a flexible configuration, as well as sensors to measure
physical and chemical water-quality parameters along
predefined paths over long distances. The underwater
vehicle implements a biologically inspired neural
architecture for autonomous intelligent navigation.
Navigation is carried out by integrating a kinematic
adaptive neuro-controller for trajectory tracking and an
adaptive neuro-controller for obstacle avoidance. The
autonomous underwater vehicle is capable of operating
during long periods of observation and monitoring, and it
is a good tool for observing large areas of sea, since the
contribution of renewable energy allows it to operate for
long periods of time. All sensor data are correlated with
time and geodetic position. This vehicle has been
used for monitoring the Mar Menor lagoon. Supported by the Coastal Monitoring
System for the Mar Menor (CMS-463.01.08_CLUSTER)
project funded by the Regional Government of Murcia,
by the SICUVA project (Control and Navigation System
for AUV Oceanographic Monitoring Missions, REF:
15357/PI/10) funded by the Seneca Foundation of the
Regional Government of Murcia, and by the DIVISAMOS
project (Design of an Autonomous Underwater Vehicle
for Inspections and Oceanographic Missions, UPCT:
DPI-2009-14744-C03-02) funded by the Spanish Ministry of
Science and Innovation.
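A minimal sketch of how a trajectory-tracking command and an obstacle-avoidance command might be blended according to the distance to the nearest obstacle; the vehicle above uses adaptive neuro-controllers, so the function below, including the assumed safety distance d_safe, is only an illustrative stand-in for that integration.

```python
import numpy as np

def blend_commands(v_track, v_avoid, d_min, d_safe=5.0):
    """Blend a trajectory-tracking velocity command with an obstacle-avoidance
    command, weighting avoidance more heavily as the nearest obstacle gets closer."""
    # alpha -> 1 when obstacles are far (pure tracking),
    # alpha -> 0 when an obstacle is very close (pure avoidance).
    alpha = float(np.clip(d_min / d_safe, 0.0, 1.0))
    return alpha * np.asarray(v_track) + (1.0 - alpha) * np.asarray(v_avoid)
```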
Platform-portable reinforcement learning methods to localize underwater targets
In this study, we present a platform-portable deep reinforcement learning method that has been used as a path-planning system to localize underwater objects with autonomous vehicles. This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 893089. This work also received financial support from the Spanish Ministerio de Economía y Competitividad (BITERECO: PID2020-114732RBC31). This work acknowledges the 'Severo Ochoa Centre of Excellence' accreditation (CEX2019-000928-S).
Improving the energy efficiency of autonomous underwater vehicles by learning to model disturbances
Energy efficiency is one of the main challenges for long-term autonomy of AUVs (Autonomous Underwater Vehicles). We propose a novel approach for improving the energy efficiency of AUV controllers based on the ability to learn which external disturbances can safely be ignored. The proposed learning approach uses adaptive oscillators that learn online the frequency, amplitude, and phase of zero-mean periodic external disturbances. Such disturbances occur naturally in open water due to waves, currents, and gravity, but can also be caused by the dynamics and hydrodynamics of the AUV itself. We formulate the theoretical basis of the approach and demonstrate its abilities on a number of input signals. Further experimental evaluation is conducted using a dynamic model of the Girona 500 AUV in simulation on two important underwater scenarios: hovering and trajectory tracking. The proposed approach shows significant energy-saving capabilities while at the same time maintaining high controller gains. The approach is generic and applicable not only to AUV control, but also to other types of control where periodic disturbances exist and could be accounted for by the controller. © 2013 IEEE
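The adaptive-oscillator idea can be sketched with a standard adaptive Hopf (adaptive frequency) oscillator whose frequency estimate converges to that of a periodic input; the step function below, including its gains and step size, is an illustrative assumption rather than the authors' exact formulation.

```python
import numpy as np

def adaptive_hopf_step(x, y, omega, f_t, dt=0.001, gamma=8.0, mu=1.0, eps=0.9):
    """One Euler step of an adaptive Hopf oscillator driven by the input sample f_t.
    The oscillator state (x, y) phase-locks to the periodic component of the input,
    and omega converges towards its angular frequency."""
    r = max(np.sqrt(x * x + y * y), 1e-9)
    dx = gamma * (mu - r * r) * x - omega * y + eps * f_t
    dy = gamma * (mu - r * r) * y + omega * x
    domega = -eps * f_t * y / r  # frequency adaptation law
    return x + dt * dx, y + dt * dy, omega + dt * domega
```

Running this step over a disturbance such as f_t = sin(2*pi*0.5*t) drives omega towards 2*pi*0.5 rad/s; in the spirit of the abstract, once the periodic component is tracked the controller can discount it rather than spend energy counteracting it.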
Cooperative Marine Operations via Ad Hoc Teams
While research in ad hoc teamwork has great potential for solving real-world
robotic applications, most developments so far have focused on
environments with simple dynamics. In this article, we discuss how the problem
of ad hoc teamwork can be of special interest for marine robotics and how it
can aid marine operations. Particularly, we present a set of challenges that
need to be addressed for achieving ad hoc teamwork in underwater environments
and we discuss possible solutions based on current state-of-the-art
developments in the ad hoc teamwork literature.
Docking control of an autonomous underwater vehicle using reinforcement learning
To achieve persistent systems in the future, autonomous underwater vehicles (AUVs) will need to autonomously dock onto a charging station. Here, reinforcement learning strategies were applied for the first time to control the docking of an AUV onto a fixed platform in a simulation environment. Two reinforcement learning schemes were investigated: one with continuous state and action spaces, deep deterministic policy gradient (DDPG), and one with a continuous state but discrete action space, deep Q-network (DQN). For DQN, the discrete actions were selected as step changes in the control input signals. The performance of the reinforcement learning strategies was compared with classical and optimal control techniques. The control actions selected by DDPG suffer from chattering effects due to a hyperbolic tangent layer in the actor. Conversely, DQN presents the best compromise between short docking time and low control effort, whilst meeting the docking requirements. Although the reinforcement learning algorithms present a very high computational cost at training time, they are five orders of magnitude faster than optimal control at deployment time, thus enabling an online implementation. Therefore, reinforcement learning achieves a performance similar to optimal control at a much lower computational cost at deployment, whilst also providing a more general framework.
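The discrete action set described for DQN (step changes in the control input signals) might be encoded as in the sketch below; the step size and the number of control channels are assumptions for illustration, not the paper's values.

```python
import itertools
import numpy as np

STEP = 0.05      # assumed step size per decision, as a fraction of full scale
N_CHANNELS = 2   # assumed number of control input signals

# Discrete action set: every combination of {decrease, hold, increase} per channel.
ACTIONS = list(itertools.product([-1, 0, 1], repeat=N_CHANNELS))

def apply_action(u, action_index):
    """Apply a discrete action as a step change to the current control inputs u."""
    delta = STEP * np.array(ACTIONS[action_index], dtype=float)
    return np.clip(np.asarray(u, dtype=float) + delta, -1.0, 1.0)
```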
Adaptive and learning-based formation control of swarm robots
Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face a few open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for a UAV swarm using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, a reward function is added that combines global flocking maintenance, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
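A hedged sketch of the kind of per-UAV reward described above, combining leader-following, flocking maintenance, and a collision penalty; all weights and distances are illustrative assumptions, not the thesis's actual values.

```python
import numpy as np

def flocking_reward(pos_i, neighbour_positions, leader_pos,
                    d_desired=2.0, d_collision=0.5, w_flock=1.0, w_leader=1.0):
    """Illustrative per-UAV reward: follow the leader, keep the desired spacing
    to neighbours (flocking maintenance), and avoid collisions."""
    reward = -w_leader * np.linalg.norm(np.asarray(pos_i) - np.asarray(leader_pos))
    for p in neighbour_positions:
        d = np.linalg.norm(np.asarray(pos_i) - np.asarray(p))
        reward -= w_flock * abs(d - d_desired)   # flocking-maintenance term
        if d < d_collision:
            reward -= 100.0                      # collision penalty
    return reward
```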