
    Adaptive low-level control of autonomous underwater vehicles using deep reinforcement learning

    Low-level control of autonomous underwater vehicles (AUVs) has been extensively addressed by classical control techniques. However, the variable operating conditions and hostile environments faced by AUVs have driven researchers towards the formulation of adaptive control approaches. The reinforcement learning (RL) paradigm is a powerful framework which has been applied in different formulations of adaptive control strategies for AUVs. The limitations of RL approaches have in turn led to the emergence of deep reinforcement learning, which has become an attractive and promising framework for developing truly adaptive control strategies to solve complex control problems for autonomous systems. Most existing applications of deep RL, however, use video images to train the decision-making agent, and obtaining camera images solely for AUV control could be costly in terms of energy consumption. Moreover, rewards are not easily obtained directly from the video frames. In this work we develop a deep RL framework for adaptive control applications of AUVs based on an actor-critic goal-oriented deep RL architecture, which takes the available raw sensory information as input and outputs the continuous control actions that serve as the low-level commands for the AUV's thrusters. Experiments on a real AUV demonstrate the applicability of the stated deep RL approach for an autonomous robot control problem.
    Affiliation: Carlucho, Ignacio. Universidad Nacional del Centro de la Provincia de Buenos Aires. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires; Argentina
    Affiliation: de Paula, Mariano. Universidad Nacional del Centro de la Provincia de Buenos Aires. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires; Argentina
    Affiliation: Wang, Sen. Heriot-Watt University; United Kingdom
    Affiliation: Petillot, Yvan. Heriot-Watt University; United Kingdom
    Affiliation: Acosta, Gerardo Gabriel. Universidad Nacional del Centro de la Provincia de Buenos Aires. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Tandil. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires. - Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Centro de Investigaciones en Física e Ingeniería del Centro de la Provincia de Buenos Aires; Argentina

    Saturated Output-Feedback Hybrid Reinforcement Learning Controller for Submersible Vehicles Guaranteeing Output Constraints

    In this brief, we propose a new neuro-fuzzy reinforcement learning-based control (NFRLC) structure that allows autonomous underwater vehicles (AUVs) to precisely follow a desired trajectory in large-scale complex environments. The accurate tracking control problem is solved by a unique online NFRLC method designed around an actor-critic (AC) structure. By integrating the NFRLC framework, which includes an adaptive multilayer neural network (MNN) and an interval type-2 fuzzy neural network (IT2FNN), with a high-gain observer (HGO), a robust smart observer-based system is set up to estimate the velocities of the AUVs and the unknown dynamic parameters, which contain unmodeled dynamics, nonlinearities, uncertainties, and external disturbances. By employing a saturation function in the design procedure and transforming the input limitations into input saturation nonlinearities, the risk of actuator saturation is effectively reduced, and the NFRLC strategy compensates for the nonlinear input saturation. A predefined funnel-shaped performance function is designed to attain a prescribed output performance. Finally, the stability analysis shows that all signals of the closed-loop system are semi-globally uniformly ultimately bounded (SGUUB) and that the tracking errors converge at a prescribed rate, approaching the origin while evolving inside the funnel-shaped performance bound within the prescribed time.

    A brief review of neural networks based learning and control and their applications for robots

    As an imitation of biological nervous systems, neural networks (NNs), which are characterized by a powerful learning ability, have been employed in a wide range of applications, such as control of complex nonlinear systems, optimization, system identification, and pattern recognition. This article gives a brief review of state-of-the-art NNs for complex nonlinear systems. Recent progress in NNs, in both theoretical developments and practical applications, is investigated and surveyed. Specifically, NN-based robot learning and control applications are reviewed, including NN-based robot manipulator control, NN-based human-robot interaction, and NN-based behavior recognition and generation.

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation can be achieved by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We then present a novel flocking controller for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy and each UAV acts on the local information it collects. In addition, to avoid collisions among UAVs and to guarantee flocking and navigation, the reward function combines a global flocking-maintenance term, a mutual reward, and a collision penalty.
    We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to use swarms aesthetically. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.

    Sliding mode control with disturbance estimation for underwater robot.

    This paper proposes a sliding mode control with disturbance estimation for an underwater robot. The mobility performance of an underwater robot is influenced by modeling error, observation noise, and several disturbances such as ocean and tidal currents. Therefore, a robust control system is needed for precise motion control of an underwater robot. This paper uses sliding mode control, which is one of the robust control methods. In sliding mode control, chattering tends to occur if the switching gain is set to a high value. On the other hand, it is desirable to set the switching gain high from the viewpoint of robustness. Choosing the switching gain therefore involves a trade-off between chattering suppression and robustness. In the proposed method, the disturbance is estimated in real time, and this estimated value is added to the control input. Most of the disturbance is compensated by this estimated value, and the sliding mode control handles the remainder. As a result, a robust control system is achieved with the proposed method even if the switching gain is set to a low value. The validity of the proposed method was confirmed by simulation and experimental results.
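The idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's controller: the unit-mass double-integrator plant, the disturbance signal, the first-order disturbance observer, and every gain below (`k`, `lam`, `phi`, `obs_gain`) are assumptions chosen only for illustration. The sketch compares a low-gain sliding mode controller with and without the disturbance estimate added to the control input.

```python
import math


def sat(x):
    """Saturation function used instead of sign() to limit chattering."""
    return max(-1.0, min(1.0, x))


def simulate(use_estimator, k=0.5, lam=2.0, phi=0.05, obs_gain=20.0,
             dt=1e-3, t_end=10.0, x_ref=1.0):
    """Simulate an assumed plant x'' = u + d under sliding mode control.

    Returns the final position error x - x_ref.
    All models and gains are illustrative assumptions, not taken from the paper.
    """
    x, v, z = 0.0, 0.0, 0.0          # position, velocity, observer state
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        d = 2.0 + 0.5 * math.sin(t)  # assumed disturbance, unknown to the controller

        # Disturbance estimate: d_hat = z + L*v obeys d_hat' = L*(d - d_hat).
        d_hat = z + obs_gain * v if use_estimator else 0.0

        e, e_dot = x - x_ref, v
        s = e_dot + lam * e          # sliding variable

        # Equivalent control, disturbance compensation, and low-gain switching term.
        u = -lam * e_dot - d_hat - k * sat(s / phi)

        if use_estimator:
            z += dt * (-obs_gain * (d_hat + u))
        x += dt * v
        v += dt * (u + d)
    return x - x_ref


if __name__ == "__main__":
    print("final error with estimator:   ", simulate(True))
    print("final error without estimator:", simulate(False))
```

In this sketch the switching gain `k` is smaller than the disturbance magnitude, so the controller without the estimate cannot hold the sliding surface, while the version with the estimate only has to reject the small estimation residual.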

    Development of Robust Control Strategies for Autonomous Underwater Vehicles

    The ocean's energy resources and chemical balance sustain mankind in many ways. Ocean exploration is therefore an essential task, and it is accomplished by deploying underwater vehicles. An underwater vehicle that navigates and controls itself autonomously is called an Autonomous Underwater Vehicle (AUV). Among the tasks handled by an AUV, accurately positioning itself at a desired position with respect to reference objects is called set-point control. Tracking a reference trajectory is another important task. Battery recharging of an AUV, positioning with respect to an underwater structure, cable, or seabed, and tracking a reference trajectory with the desired accuracy and speed to avoid collision with the guiding vehicle in the last phase of docking are some significant applications where an AUV needs to perform the above tasks. Parametric uncertainties in AUV dynamics and actuator torque limitations necessitate robust control algorithms that achieve the motion control objectives in the face of uncertainties. Sliding Mode Control (SMC), H∞/μ synthesis, and model-based PID group controllers are some of the robust controllers which have been applied to AUVs. However, SMC suffers from less efficient tuning of its switching gains due to the model parameters and noisy estimated acceleration states appearing in its control law. In addition, the demand for high control effort due to high-frequency chattering is another drawback of SMC. Furthermore, real-time implementation of an H∞/μ synthesis controller based on its stability study is restricted by its use of a linearly approximated dynamic model of the AUV, which hinders achieving robustness. Moreover, model-based PID group controllers suffer from implementation complexities and exhibit poor transient and steady-state performance under parametric uncertainties.
    On the other hand, the model-free Linear PID (LPID) controller has the inherent problem of a narrow convergence region, i.e., it cannot ensure convergence of a large initial error to zero. Additionally, it suffers from integrator wind-up and subsequent actuator saturation when a large initial error occurs. The LPID controller nevertheless has an inherent capability to cope with uncertainties. To address the above problems, this work proposes a wind-up-free Nonlinear PID with Bounded Integral (BI) and Bounded Derivative (BD) for set-point control, and a combination of continuous SMC with the Nonlinear PID with BI and BD, namely SM-N-PID with BI and BD, for trajectory tracking. Nonlinear functions are used for all P, I, and D terms (for both set-point and tracking control), in addition to a nonlinear hyperbolic tangent function in the SMC (for tracking only), so that the torque demand from the controller can be kept within a limit. A direct Lyapunov analysis is pursued to prove stable motion of the AUV. The efficacy of the proposed controllers is compared with that of two other controllers in each case: PD and N-PID without BI and BD for set-point control, and PD plus Feedforward Compensation (FC) and SM-N-PID without BI and BD for tracking control. Multiple AUVs cooperatively performing a mission offer several advantages over a single AUV operating non-cooperatively, such as reliability and increased work efficiency. Bandwidth limitation in the acoustic medium poses challenges in designing cooperative motion control algorithms for multiple AUVs, owing to the need to communicate sensor and actuator signals among the AUVs. In the literature, an undirected-graph-based approach is used for control design under communication constraints, and it is therefore not suitable for a large number of AUVs participating in a cooperative motion plan. Formation control is a popular cooperative motion control paradigm.
    This thesis models the formation as a minimally persistent directed graph and proposes control schemes for maintaining the distance constraints during the motion of the entire formation. For formation control, each AUV uses a Sliding Mode Nonlinear PID controller with Bounded Integrator and Bounded Derivative. Direct Lyapunov stability analysis in the framework of input-to-state stability ensures the stable motion of the formation while maintaining the desired distance constraints among the AUVs.
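The abstract above says that nonlinear functions on the P, I, and D terms keep the controller's torque demand within a limit. The following sketch shows one way this can work; it is not the thesis's control law. Passing each term through a hyperbolic tangent, the gains `kp`, `ki`, `kd`, and the scaling `a` are assumptions made here for illustration only.

```python
import math


def linear_pid(e, e_int, e_dot, kp=4.0, ki=1.0, kd=2.0):
    """Conventional linear PID: the output grows without bound with the error."""
    return -(kp * e + ki * e_int + kd * e_dot)


def bounded_npid(e, e_int, e_dot, kp=4.0, ki=1.0, kd=2.0, a=1.0):
    """Illustrative nonlinear PID with each term passed through tanh.

    Since |tanh(.)| < 1, the command magnitude cannot exceed kp + ki + kd,
    however large the error, its integral, or its derivative become.
    The tanh form and the gains are assumptions, not the thesis's design.
    """
    return -(kp * math.tanh(a * e)
             + ki * math.tanh(a * e_int)
             + kd * math.tanh(a * e_dot))


if __name__ == "__main__":
    # Large initial error with a wound-up integral state.
    print("linear PID command: ", linear_pid(100.0, 500.0, 30.0))
    print("bounded PID command:", bounded_npid(100.0, 500.0, 30.0))
```

For small errors the two controllers in this sketch behave almost identically, because tanh is close to the identity near zero. They differ only when the error is large, which is where a linear PID would saturate the actuator.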

    Haptic identification by ELM-controlled uncertain manipulator

    This paper presents an extreme learning machine (ELM) based control scheme for uncertain robot manipulators to perform haptic identification. ELM is used to compensate for the unknown nonlinearity in the manipulator dynamics. The ELM enhanced controller ensures that the closed-loop controlled manipulator follows a specified reference model, in which the reference point as well as the feedforward force is adjusted after each trial for haptic identification of geometry and stiffness of an unknown object. A neural learning law is designed to ensure finite-time convergence of the neural weight learning, such that exact matching with the reference model can be achieved after the initial iteration. The usefulness of the proposed method is tested and demonstrated by extensive simulation studies.
    Index Terms: Extreme learning machine; haptic identification; adaptive control; robot manipulator