TY - GEN
T1 - Deep Reinforcement Learning for Visual Navigation of Wheeled Mobile Robots
AU - Nwaonumah, Ezebuugo
AU - Samanta, Biswanath
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/3/28
Y1 - 2020/3/28
AB - A study is presented on applying deep reinforcement learning (DRL) for visual navigation of wheeled mobile robots (WMRs) in dynamic and unknown environments. Two DRL algorithms have been considered: the value-based deep Q-network (DQN) and the policy-gradient-based asynchronous advantage actor-critic (A3C). RGB (red, green, and blue) and depth images have been used as inputs in the implementation of both DRL algorithms to generate control commands for autonomous navigation of a WMR in simulation environments. The initial DRL networks were generated and trained progressively in OpenAI Gym-Gazebo based simulation environments within the robot operating system (ROS) framework for a popular target WMR, the Kobuki TurtleBot2. A pre-trained deep neural network, ResNet50, was used after further training with regrouped objects commonly found in a laboratory setting for target-driven mapless visual navigation of the TurtleBot2 through DRL. The performance of A3C with multiple computation threads (4, 6, and 8) was simulated on a desktop. The navigation performance of the DQN and A3C networks, in terms of reward statistics and completion time, was compared in three simulation environments. As expected, A3C with multiple threads (4, 6, and 8) performed better than DQN, and the performance of A3C improved with the number of threads. Details of the methodology and simulation results are presented, and recommendations for future work towards real-time implementation through transfer learning of the DRL models are outlined.
KW - ResNet50
KW - asynchronous advantage actor-critic (A3C)
KW - convolutional neural network (CNN)
KW - deep neural network (DNN)
KW - deep reinforcement learning (DRL)
KW - machine learning (ML)
KW - mapless navigation
KW - reinforcement learning (RL)
KW - robot operating system (ROS)
KW - robotics
UR - http://www.scopus.com/inward/record.url?scp=85097811119&partnerID=8YFLogxK
U2 - 10.1109/SoutheastCon44009.2020.9249654
DO - 10.1109/SoutheastCon44009.2020.9249654
M3 - Conference contribution
AN - SCOPUS:85097811119
T3 - Conference Proceedings - IEEE SOUTHEASTCON
BT - IEEE SoutheastCon 2020, SoutheastCon 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE SoutheastCon, SoutheastCon 2020
Y2 - 28 March 2020 through 29 March 2020
ER -