8 research outputs found

    A State-Compensated Deep Deterministic Policy Gradient Algorithm for UAV Trajectory Tracking

    No full text
    The unmanned aerial vehicle (UAV) trajectory tracking control algorithm based on deep reinforcement learning is generally inefficient for training in an unknown environment, and the convergence is unstable. Aiming at this situation, a Markov decision process (MDP) model for UAV trajectory tracking is established, and a state-compensated deep deterministic policy gradient (CDDPG) algorithm is proposed. An additional neural network (C-Net) whose input is compensation state and output is compensation action is added to the network model of a deep deterministic policy gradient (DDPG) algorithm to assist in network exploration training. It combined the action output of the DDPG network with compensated output of the C-Net as the output action to interact with the environment, enabling the UAV to rapidly track dynamic targets in the most accurate continuous and smooth way possible. In addition, random noise is added on the basis of the generated behavior to realize a certain range of exploration and make the action value estimation more accurate. The OpenAI Gym tool is used to verify the proposed method, and the simulation results show that: (1) The proposed method can significantly improve the training efficiency by adding a compensation network and effectively improve the accuracy and convergence stability; (2) Under the same computer configuration, the computational cost of the proposed algorithm is basically the same as that of the QAC algorithm (Actor-critic algorithm based on behavioral value Q) and the DDPG algorithm; (3) During the training process, with the same tracking accuracy, the learning efficiency is about 70% higher than that of QAC and DDPG; (4) During the simulation tracking experiment, under the same training time, the tracking error of the proposed method after stabilization is about 50% lower than that of QAC and DDPG

    A State-Compensated Deep Deterministic Policy Gradient Algorithm for UAV Trajectory Tracking

    No full text
    The unmanned aerial vehicle (UAV) trajectory tracking control algorithm based on deep reinforcement learning is generally inefficient for training in an unknown environment, and the convergence is unstable. Aiming at this situation, a Markov decision process (MDP) model for UAV trajectory tracking is established, and a state-compensated deep deterministic policy gradient (CDDPG) algorithm is proposed. An additional neural network (C-Net) whose input is compensation state and output is compensation action is added to the network model of a deep deterministic policy gradient (DDPG) algorithm to assist in network exploration training. It combined the action output of the DDPG network with compensated output of the C-Net as the output action to interact with the environment, enabling the UAV to rapidly track dynamic targets in the most accurate continuous and smooth way possible. In addition, random noise is added on the basis of the generated behavior to realize a certain range of exploration and make the action value estimation more accurate. The OpenAI Gym tool is used to verify the proposed method, and the simulation results show that: (1) The proposed method can significantly improve the training efficiency by adding a compensation network and effectively improve the accuracy and convergence stability; (2) Under the same computer configuration, the computational cost of the proposed algorithm is basically the same as that of the QAC algorithm (Actor-critic algorithm based on behavioral value Q) and the DDPG algorithm; (3) During the training process, with the same tracking accuracy, the learning efficiency is about 70% higher than that of QAC and DDPG; (4) During the simulation tracking experiment, under the same training time, the tracking error of the proposed method after stabilization is about 50% lower than that of QAC and DDPG

    A Lightweight and Drift-Free Fusion Strategy for Drone Autonomous and Safe Navigation

    No full text
    Self-localization and state estimation are crucial capabilities for agile drone autonomous navigation. This article presents a lightweight and drift-free vision-IMU-GNSS tightly coupled multisensor fusion (LDMF) strategy for drones’ autonomous and safe navigation. The drone is carried out with a front-facing camera to create visual geometric constraints and generate a 3D environmental map. Ulteriorly, a GNSS receiver with multiple constellations support is used to continuously provide pseudo-range, Doppler frequency shift and UTC time pulse signals to the drone navigation system. The proposed multisensor fusion strategy leverages the Kanade–Lucas algorithm to track multiple visual features in each input image. The local graph solution is bounded in a restricted sliding window, which can immensely predigest the computational complexity in factor graph optimization procedures. The drone navigation system can achieve camera-rate performance on a small companion computer. We thoroughly experimented with the LDMF system in both simulated and real-world environments, and the results demonstrate dramatic advantages over the state-of-the-art sensor fusion strategies

    Factors Driving Microbial Community Dynamics and Potential Health Effects of Bacterial Pathogen on Landscape Lakes with Reclaimed Water Replenishment in Beijing, PR China

    No full text
    Assessing the bacteria pathogens in the lakes with reclaimed water as major influents are important for public health. This study investigated microbial communities of five landscape lakes replenished by reclaimed water, then analyzed driven factors and identified health effects of bacterial pathogens. 16S rRNA gene sequence analysis demonstrated that Proteobacteria, Actinobacteria, Cyanobacteria, Firmicutes, and Verrucomicrobia were the most dominant phyla in five landscape lakes. The microbial community diversities were higher in June and July than that in other months. Temperature, total nitrogen and phosphorus were the main drivers of the dominant microbial from the Redundancy analysis (RDA) results. Various potential bacterial pathogens were identified, including Pseudomonas, GKS98_freshwater_group, Sporosarcina, Pseudochrobactrum, Streptomyces and Bacillus, etc, some of which are easily infectious to human. The microbial network analysis showed that some potential pathogens were nodes that had significant health effects. The work provides a basis for understanding the microbial community dynamics and safety issues for health effects in landscape lakes replenished by reclaimed water

    Perceiving like a Bat: Hierarchical 3D Geometric–Semantic Scene Understanding Inspired by a Biomimetic Mechanism

    No full text
    Geometric–semantic scene understanding is a spatial intelligence capability that is essential for robots to perceive and navigate the world. However, understanding a natural scene remains challenging for robots because of restricted sensors and time-varying situations. In contrast, humans and animals are able to form a complex neuromorphic concept of the scene they move in. This neuromorphic concept captures geometric and semantic aspects of the scenario and reconstructs the scene at multiple levels of abstraction. This article seeks to reduce the gap between robot and animal perception by proposing an ingenious scene-understanding approach that seamlessly captures geometric and semantic aspects in an unexplored environment. We proposed two types of biologically inspired environment perception methods, i.e., a set of elaborate biomimetic sensors and a brain-inspired parsing algorithm related to scene understanding, that enable robots to perceive their surroundings like bats. Our evaluations show that the proposed scene-understanding system achieves competitive performance in image semantic segmentation and volumetric–semantic scene reconstruction. Moreover, to verify the practicability of our proposed scene-understanding method, we also conducted real-world geometric–semantic scene reconstruction in an indoor environment with our self-developed drone

    A Supervised Reinforcement Learning Algorithm for Controlling Drone Hovering

    No full text
    The application of drones carrying different devices for aerial hovering operations is becoming increasingly widespread, but currently there is very little research relying on reinforcement learning methods for hovering control, and it has not been implemented on physical machines. Drone’s behavior space regarding hover control is continuous and large-scale, making it difficult for basic algorithms and value-based reinforcement learning (RL) algorithms to have good results. In response to this issue, this article applies a watcher-actor-critic (WAC) algorithm to the drone’s hover control, which can quickly lock the exploration direction and achieve high robustness of the drone’s hover control while improving learning efficiency and reducing learning costs. This article first utilizes the actor-critic algorithm based on behavioral value Q (QAC) and the deep deterministic policy gradient algorithm (DDPG) for drone hover control learning. Subsequently, an actor-critic algorithm with an added watcher is proposed, in which the watcher uses a PID controller with parameters provided by a neural network as the dynamic monitor, transforming the learning process into supervised learning. Finally, this article uses a classic reinforcement learning environment library, Gym, and a current mainstream reinforcement learning framework, PARL, for simulation, and deploys the algorithm to a practical environment. A multi-sensor fusion strategy-based autonomous localization method for unmanned aerial vehicles is used for practical exercises. The simulation and experimental results show that the training episodes of WAC are reduced by 20% compared to the DDPG and 55% compared to the QAC, and the proposed algorithm has a higher learning efficiency, faster convergence speed, and smoother hovering effect compared to the QAC and DDPG

    RRVPE: A Robust and Real-Time Visual-Inertial-GNSS Pose Estimator for Aerial Robot Navigation

    No full text
    Self-localization and orientation estimation are the essential capabilities for mobile robot navigation. In this article, a robust and real-time visual-inertial-GNSS(Global Navigation Satellite System) tightly coupled pose estimation (RRVPE) method for aerial robot navigation is presented. The aerial robot carries a front-facing stereo camera for self-localization and an RGB-D camera to generate 3D voxel map. Ulteriorly, a GNSS receiver is used to continuously provide pseudorange, Doppler frequency shift and universal time coordinated (UTC) pulse signals to the pose estimator. The proposed system leverages the Kanade Lucas algorithm to track Shi-Tomasi features in each video frame, and the local factor graph solution process is bounded in a circumscribed container, which can immensely abandon the computational complexity in nonlinear optimization procedure. The proposed robot pose estimator can achieve camera-rate (30 Hz) performance on the aerial robot companion computer. We thoroughly experimented the RRVPE system in both simulated and practical circumstances, and the results demonstrate dramatic advantages over the state-of-the-art robot pose estimators
    corecore