Visual Navigation of Wheeled Mobile Robots Using Deep Reinforcement Learning

By Ezebuugo M Nwaonumah


This study presents visual navigation of wheeled mobile robots (WMR) using deep reinforcement learning (DRL) in unknown and dynamic environments. Two DRL algorithms are considered: the value-learning-based deep Q-network (DQN) and the policy-gradient-based asynchronous advantage actor-critic (A3C). Both algorithms were implemented using RGB and depth images as inputs to generate outputs for the WMR for autonomous navigation, in simulation and in real time. The initial DRL networks were generated and trained progressively in simulation environments using OpenAI Gym Gazebo within the Robot Operating System (ROS) framework for a popular experimental WMR, the Kobuki TurtleBot2 with an Asus Xtion depth camera. The real-time implementation of the trained DRL networks in the ROS framework ran on an onboard edge computing platform, the NVIDIA Jetson TX2, using the TensorFlow software framework. For object detection, classification, and target identification, a pre-trained deep neural network, ResNet50, was used after further training with reduced classification categories for target-driven, mapless visual navigation of the TurtleBot2 through DRL. The simulation-based training of the DQN and A3C networks was successfully transferred, with online learning, to real-time navigation of the TurtleBot2 in physical environments. The performance of A3C was simulated with multiple computation threads (4, 6, and 8) on a desktop. The simulated navigation performance, in terms of the minimum, average, and maximum rewards and the completion time, was compared for the DQN and A3C networks in three simulation environments. A3C with multiple threads (4, 6, and 8) performed better than DQN, as expected, and its performance improved with the number of threads. The real-time implementation results of A3C with 8 threads in unknown and dynamic environments with target objects were promising. Details of the methodology and the simulation and real-time implementation results are presented, and recommendations for future work are outlined.
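
The abstract contrasts a value-learning method (DQN) with a policy-gradient method (A3C) but does not state their update rules. As a rough illustration only, the sketch below shows the textbook learning signals usually associated with each: a one-step Q-learning target for DQN and n-step advantage estimates for A3C. The function names, the discount factor, and the sample numbers are assumptions made for this example and are not taken from the thesis.

```python
# Illustrative sketch only: textbook forms of the two learning signals,
# not the implementation used in the thesis.

def dqn_td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a').

    next_q_values holds the (hypothetical) network's Q estimates for
    every action in the next state.
    """
    if done:
        # No bootstrapping past the end of an episode.
        return reward
    return reward + gamma * max(next_q_values)


def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """A3C-style advantages: discounted n-step return minus the critic's value.

    rewards and values cover the same short rollout; bootstrap_value is
    the critic's estimate for the state after the last step (0 if terminal).
    """
    advantages = []
    ret = bootstrap_value
    # Walk the rollout backwards, accumulating the discounted return.
    for r, v in zip(reversed(rewards), reversed(values)):
        ret = r + gamma * ret
        advantages.append(ret - v)
    advantages.reverse()
    return advantages


if __name__ == "__main__":
    print(dqn_td_target(1.0, [0.5, 2.0], done=False, gamma=0.5))
    print(n_step_advantages([1.0, 1.0], [0.0, 0.0], 0.0, gamma=0.5))
```

In A3C, each worker thread would compute such advantages on its own rollout and apply gradients asynchronously to shared parameters, which is one common explanation for why more threads can help; the thread handling itself is omitted here.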

Topics: Reinforcement learning (RL), Deep reinforcement learning (DRL), Asynchronous advantage actor-critic (A3C), Deep neural network (DNN), Mapless navigation, Robot operating system (ROS), Electro-Mechanical Systems, Navigation, Guidance, Control, and Dynamics
Publisher: Digital Commons@Georgia Southern
Year: 2019