331 research outputs found

Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning

Author: Acosta Gerardo Gabriel
Carlucho Ignacio
de Paula Mariano
Petillot Yvan
Wang Sen
Publication venue: 'Elsevier BV'
Publication date: 01/09/2018

Low-level control of autonomous underwater vehicles (AUVs) has been extensively addressed by classical control techniques. However, the variable operating conditions and hostile environments faced by AUVs have driven researchers towards adaptive control approaches. The reinforcement learning (RL) paradigm is a powerful framework that has been applied in different formulations of adaptive control strategies for AUVs. The limitations of RL approaches have in turn led to the emergence of deep reinforcement learning, which has become an attractive and promising framework for developing truly adaptive control strategies to solve complex control problems for autonomous systems. Most existing applications of deep RL, however, use video images to train the decision-making agent, and obtaining camera images solely for AUV control could be costly in terms of energy consumption. Moreover, the rewards are not easily obtained directly from the video frames. In this work we develop a deep RL framework for adaptive control applications of AUVs based on an actor-critic goal-oriented deep RL architecture, which takes the available raw sensory information as input and outputs continuous control actions, which are the low-level commands for the AUV's thrusters. Experiments on a real AUV demonstrate the applicability of the stated deep RL approach to an autonomous robot control problem.

Affiliations:
Carlucho, Ignacio; de Paula, Mariano; Acosta, Gerardo Gabriel: Universidad Nacional del Centro de la Provincia de Buenos Aires. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires; Argentina
Wang, Sen; Petillot, Yvan: Heriot-Watt University; United Kingdom

Heriot Watt Pure

CONICET Digital

Intelligent Navigation for a Solar Powered Unmanned Underwater Vehicle

Author: García Córdova Francisco
Guerrero González Antonio
Publication venue: 'IntechOpen'
Publication date: 01/01/2013

This paper proposes an intelligent navigation system for an unmanned underwater vehicle powered by renewable energy and designed for shallow-water inspection on long-duration missions. The system is composed of an underwater vehicle, which tows a surface vehicle. The surface vehicle is a small boat with photovoltaic panels, a methanol fuel cell and communication equipment, which provides energy and communication to the underwater vehicle. The underwater vehicle carries sensors to monitor the underwater environment, such as sidescan sonar and a video camera in a flexible configuration, and sensors to measure the physical and chemical parameters of water quality on predefined paths over long distances. The underwater vehicle implements a biologically inspired neural architecture for autonomous intelligent navigation. Navigation is carried out by integrating a kinematic adaptive neuro‐controller for trajectory tracking and an obstacle avoidance adaptive neuro‐controller. The autonomous underwater vehicle is capable of operating during long periods of observation and monitoring, which makes it a good tool for observing large areas of sea, since renewable energy extends its endurance. It correlates all sensor data with time and geodetic position. This vehicle has been used for monitoring the Mar Menor lagoon.

Supported by the Coastal Monitoring System for the Mar Menor (CMS‐463.01.08_CLUSTER) project funded by the Regional Government of Murcia, by the SICUVA project (Control and Navigation System for AUV Oceanographic Monitoring Missions. REF: 15357/PI/10) funded by the Seneca Foundation of the Regional Government of Murcia, and by the DIVISAMOS project (Design of an Autonomous Underwater Vehicle for Inspections and oceanographic mission‐UPCT: DPI‐2009‐14744‐C03‐02) funded by the Spanish Ministry of Science and Innovation.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

Repositorio Digital de la Universidad Politécnica de Cartagena

Docking control of an autonomous underwater vehicle using reinforcement learning

Author: Anderlini Enrico
Parker Gordon
Thomas Giles
Publication venue: Digital Commons @ Michigan Tech
Publication date: 21/08/2019

To achieve persistent systems in the future, autonomous underwater vehicles (AUVs) will need to autonomously dock onto a charging station. Here, reinforcement learning strategies were applied for the first time to control the docking of an AUV onto a fixed platform in a simulation environment. Two reinforcement learning schemes were investigated: one with continuous state and action spaces, deep deterministic policy gradient (DDPG), and one with continuous state but discrete action spaces, deep Q network (DQN). For DQN, the discrete actions were selected as step changes in the control input signals. The performance of the reinforcement learning strategies was compared with classical and optimal control techniques. The control actions selected by DDPG suffer from chattering effects due to a hyperbolic tangent layer in the actor. Conversely, DQN presents the best compromise between short docking time and low control effort, whilst meeting the docking requirements. Although the reinforcement learning algorithms have a very high computational cost at training time, they are five orders of magnitude faster than optimal control at deployment time, thus enabling an on-line implementation. Therefore, reinforcement learning achieves a performance similar to optimal control at a much lower computational cost at deployment, whilst also presenting a more general framework.
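
The abstract above says that, for DQN, each discrete action is a step change in a control input signal. The following is a minimal sketch of that idea, not the authors' implementation: the step sizes, the normalized control range, and the function names are all assumptions made for illustration, and the Q-values are passed in by hand rather than produced by a trained network.

```python
from typing import List

# Hypothetical values: the abstract gives neither step sizes nor actuator limits.
STEP_CHANGES: List[float] = [-0.1, 0.0, 0.1]  # decrease, hold, increase
U_MIN, U_MAX = -1.0, 1.0                      # assumed normalized control range


def apply_discrete_action(u_prev: float, action: int) -> float:
    """Turn a discrete action index into a step change of the control input."""
    u_next = u_prev + STEP_CHANGES[action]
    # Saturate so the command stays inside the assumed actuator range.
    return max(U_MIN, min(U_MAX, u_next))


def greedy_action(q_values: List[float]) -> int:
    """Index of the largest Q-value (ties resolve to the lowest index)."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def rollout(u0: float, q_sequence: List[List[float]]) -> List[float]:
    """Apply the greedy action for each Q-value vector in turn."""
    u = u0
    trace = [u]
    for q_values in q_sequence:
        u = apply_discrete_action(u, greedy_action(q_values))
        trace.append(u)
    return trace
```

Because each action only nudges the previous command by a bounded step, consecutive commands cannot jump across the full actuator range, which is one plausible reading of why this parameterization keeps control effort low.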

Michigan Technological University

Reinforcement learning-based multi-AUV adaptive trajectory planning for under-ice field estimation

Author: Mahmoudian Nina
Song Min
Wang Chaofeng
Wang Zhaohui
Wei Li
Publication venue: Digital Commons @ Michigan Tech
Publication date: 01/11/2018

This work studies online learning-based trajectory planning for multiple autonomous underwater vehicles (AUVs) to estimate a water parameter field of interest in the under-ice environment. A centralized system is considered, where several fixed access points on the ice layer are introduced as gateways for communications between the AUVs and a remote data fusion center. We model the water parameter field of interest as a Gaussian process with unknown hyper-parameters. The AUV trajectories for sampling are determined on an epoch-by-epoch basis. At the end of each epoch, the access points relay the observed field samples from all the AUVs to the fusion center, which computes the posterior distribution of the field based on Gaussian process regression and estimates the field hyper-parameters. The optimal trajectories of all the AUVs in the next epoch are determined to maximize a long-term reward that is defined based on the field uncertainty reduction and the AUV mobility cost, subject to the kinematics constraint, the communication constraint and the sensing area constraint. We formulate the adaptive trajectory planning problem as a Markov decision process (MDP). A reinforcement learning-based online learning algorithm is designed to determine the optimal AUV trajectories in a constrained continuous space. Simulation results show that the proposed learning-based trajectory planning algorithm has performance similar to a benchmark method that assumes perfect knowledge of the field hyper-parameters.
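
For readers unfamiliar with the Gaussian process regression step mentioned in the abstract above, the standard posterior equations are sketched below under a zero-mean prior. The notation is assumed here, since the abstract defines no symbols. The reward on the last line is only a hypothetical form consistent with the phrase "field uncertainty reduction and the AUV mobility cost". The weight \lambda, the evaluation grid \mathcal{G}, and the cost C_t are illustrative and are not taken from the paper.

```latex
% Assumed notation: x_1..x_n are sampled locations, y the observed field samples,
% k_\theta a kernel with hyper-parameters \theta, \sigma_n^2 the measurement-noise
% variance, and x_* a query location.
\begin{aligned}
K_{ij} &= k_\theta(x_i, x_j), \qquad (k_*)_i = k_\theta(x_i, x_*), \\
\mu(x_*) &= k_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} y, \\
\sigma^2(x_*) &= k_\theta(x_*, x_*) - k_*^{\top} \left( K + \sigma_n^2 I \right)^{-1} k_* .
\end{aligned}

% Hypothetical per-epoch reward: reduction in total posterior variance over a grid
% \mathcal{G}, minus a weighted mobility cost C_t for epoch t.
r_t = \left( U_{t-1} - U_t \right) - \lambda\, C_t,
\qquad
U_t = \sum_{x_* \in \mathcal{G}} \sigma_t^2(x_*) .
```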

Michigan Technological University

Directory of Open Access Journals

Rendezvous Planning for Multiple Autonomous Underwater Vehicles using a Markov Decision Process

Author: Griffiths H
Hailes S
Yordanova V
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 20/07/2017

Multiple Autonomous Underwater Vehicles (AUVs) are a potential alternative to conventional large manned vessels for mine countermeasure (MCM) operations. Online mission planning for a cooperative multi-AUV network often relies on predefined contingency or reactive methods and does not deliver an optimal end-goal performance. A Markov Decision Process (MDP) is a decision-making framework that allows an optimal solution, taking into account future decision estimates, rather than having a myopic view. However, most real-world problems are too complex to be represented by this framework. We deal with the complexity problem by abstracting the MCM scenario with a reduced state and action space, yet retaining the information that defines the goal and constraints coming from the application. Another critical part of the model is the ability of the vehicles to communicate and enable a cooperative mission. We use the Rendezvous Point (RP) method. The RP schedules meeting points for the vehicles throughout the mission. Our model provides an optimal action selection solution for the multi-AUV MCM problem. The computation of the mission plan is performed on the order of minutes. This quick execution demonstrates that the model is feasible for real-time applications.
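
The contrast the abstract draws between an MDP solution and a "myopic view" can be illustrated with value iteration on a toy problem. This is a generic sketch, not the authors' model: the three states, the action names, the rewards and the discount factor are all invented for illustration. A myopic planner would pick `continue` (immediate reward 1), whereas the MDP solution prefers `meet`, because it leads to the larger delayed reward.

```python
from typing import Dict, List, Tuple

# transitions[state][action] = list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

# Hypothetical toy MDP; states, actions and rewards are illustrative only.
TOY_MDP: Transitions = {
    "searching": {
        "continue": [(1.0, "searching", 1.0)],   # small immediate reward
        "meet": [(1.0, "rendezvous", 0.0)],      # no immediate reward
    },
    "rendezvous": {
        "share": [(1.0, "done", 15.0)],          # large delayed reward
    },
    "done": {},                                  # terminal state
}


def q_value(outcomes, values, gamma):
    """Expected return of one action given current state-value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9, tol: float = 1e-8):
    """Return (state values, greedy policy) for a finite MDP."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal states keep value 0
                continue
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }
    return values, policy


values, policy = value_iteration(TOY_MDP)
```

Reducing the state and action space, as the abstract describes, is what keeps the sweep over `transitions` small enough for this kind of exact solution to finish quickly.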

UCL Discovery

Deep Reinforcement Learning Controller for 3D Path-following and Collision Avoidance by Autonomous Underwater Vehicles

Author: Havenstrøm Simen Theie
Rasheed Adil
San Omer
Publication date: 01/01/2020

Control theory provides engineers with a multitude of tools to design controllers that manipulate the closed-loop behavior and stability of dynamical systems. These methods rely heavily on insights about the mathematical model governing the physical system. However, in complex systems, such as autonomous underwater vehicles performing the dual objective of path-following and collision avoidance, decision making becomes non-trivial. We propose a solution using state-of-the-art Deep Reinforcement Learning (DRL) techniques to develop autonomous agents capable of achieving this hybrid objective without a priori knowledge about the goal or the environment. Our results demonstrate the viability of DRL for path-following and collision avoidance, as a step toward human-level decision making in autonomous vehicle systems within extreme obstacle configurations.

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

A reinforcement learning path planning approach for range-only underwater target localization with autonomous vehicles

Author: Gomáriz Castro Spartacus
Katija Kakani
Martín Muñoz Mario
Masmitjà Rusiñol Ivan
Navarro Bernabé Joan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022

Underwater target localization using range-only and single-beacon (ROSB) techniques with autonomous vehicles has been used recently to overcome the limitations of more complex methods, such as long baseline and ultra-short baseline systems. Nonetheless, in ROSB target localization methods, the trajectory of the tracking vehicle near the localized target plays an important role in obtaining the best accuracy of the predicted target position. Here, we investigate a Reinforcement Learning (RL) approach to find the optimal path that an autonomous vehicle should follow in order to optimize the overall accuracy of the predicted target localization while reducing time and power consumption. To accomplish this objective, different experimental tests have been designed using state-of-the-art deep RL algorithms. Our study also compares the results obtained with the analytical Fisher information matrix approach used in previous studies. The results revealed that the policy learned by the RL agent outperforms trajectories based on these analytical solutions, e.g. the median predicted error at the beginning of the target’s localisation is 17% less. These findings suggest that deep RL for localizing acoustic targets could be successfully applied to in-water applications that include tracking of acoustically tagged marine animals by autonomous underwater vehicles. This is envisioned as a first necessary step to validate the use of RL to tackle such problems, which could be used later on in more complex scenarios.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 893089. This work also received financial support from the Spanish Ministerio de Economía y Competitividad (SASES: RTI2018-095112-B-I00; BITER-ECO: PID2020-114732RB C31). This work acknowledges the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S) and support from the Generalitat de Catalunya "Sistemas de Adquisicion Remota de datos y Tratamiento de la Informacion en el Medio Marino (SARTI-MAR)" 2017 SGR 376. We gratefully acknowledge the David and Lucile Packard Foundation.

Peer Reviewed. Postprint (author's final draft)
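
The abstract above cites an analytical Fisher information matrix (FIM) approach as the baseline. As background, the sketch below computes the textbook 2D FIM for a static target observed through range-only measurements with independent zero-mean Gaussian noise. Under that assumption the FIM is the sum of outer products of the unit vectors from target to vehicle, scaled by 1/sigma^2. The function names, the noise level and the two example paths are assumptions for illustration and are not taken from the paper.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Matrix2 = List[List[float]]


def range_only_fim(positions: List[Point], target: Point, sigma: float = 1.0) -> Matrix2:
    """2x2 FIM of a static 2D target from range-only measurements.

    Assumes independent zero-mean Gaussian range noise with std `sigma`.
    """
    fxx = fxy = fyy = 0.0
    for px, py in positions:
        dx, dy = px - target[0], py - target[1]
        r = math.hypot(dx, dy)
        if r == 0.0:
            continue  # direction is undefined when the vehicle is on the target
        ux, uy = dx / r, dy / r  # unit vector from target to vehicle
        fxx += ux * ux
        fxy += ux * uy
        fyy += uy * uy
    scale = 1.0 / (sigma * sigma)
    return [[scale * fxx, scale * fxy], [scale * fxy, scale * fyy]]


def determinant(m: Matrix2) -> float:
    """Determinant of a 2x2 matrix; larger means a smaller uncertainty ellipse."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def circle_positions(target: Point, radius: float, n: int) -> List[Point]:
    """n measurement positions evenly spaced on a circle around the target."""
    return [
        (target[0] + radius * math.cos(2.0 * math.pi * k / n),
         target[1] + radius * math.sin(2.0 * math.pi * k / n))
        for k in range(n)
    ]


# Hypothetical comparison of two measurement paths around a target at the origin.
target = (0.0, 0.0)
circle = circle_positions(target, radius=50.0, n=8)
radial = [(10.0 * k, 0.0) for k in range(1, 9)]  # straight line away from the target

det_circle = determinant(range_only_fim(circle, target))
det_radial = determinant(range_only_fim(radial, target))
```

With unit noise, the eight circular positions give a determinant of approximately 16, while the radial path gives zero because every measurement shares the same direction. This is why the vehicle's trajectory near the target matters for accuracy, as the abstract notes.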

UPCommons. Portal del coneixement obert de la UPC

Digital.CSIC