TY - JOUR
T1 - Deep-reinforcement-learning-based UAV autonomous navigation and collision avoidance in unknown environments
AU - WANG, Fei
AU - ZHU, Xiaoping
AU - ZHOU, Zhou
AU - TANG, Yang
N1 - Publisher Copyright:
© 2024 Chinese Society of Aeronautics and Astronautics
PY - 2024/3
Y1 - 2024/3
AB - In some military application scenarios, Unmanned Aerial Vehicles (UAVs) need to perform missions with the assistance of on-board cameras when radar is not available and communication is interrupted, which poses challenges for UAV autonomous navigation and collision avoidance. In this paper, an improved deep-reinforcement-learning algorithm, Deep Q-Network with a Faster R-CNN model and a Data Deposit Mechanism (FRDDM-DQN), is proposed. A Faster R-CNN model (FR) is introduced and optimized to extract obstacle information from images, and a new replay memory Data Deposit Mechanism (DDM) is designed to train a better-performing agent. During training, a two-part training approach is used to reduce the time spent on training, as well as on retraining when the scenario changes. To verify the performance of the proposed method, a series of experiments, including training experiments, test experiments, and typical-episode experiments, is conducted in a 3D simulation environment. Experimental results show that the agent trained by the proposed FRDDM-DQN is able to navigate autonomously and avoid collisions, and performs better than the FR-DQN, FR-DDQN, FR-Dueling DQN, YOLO-based YDDM-DQN, and original FR output-based FR-ODQN.
KW - Faster R-CNN model
KW - Image-based Autonomous Navigation and Collision Avoidance (ANCA)
KW - Replay memory Data Deposit Mechanism (DDM)
KW - Two-part training approach
KW - Unmanned Aerial Vehicle (UAV)
UR - http://www.scopus.com/inward/record.url?scp=85184055357&partnerID=8YFLogxK
U2 - 10.1016/j.cja.2023.09.033
DO - 10.1016/j.cja.2023.09.033
M3 - Article
AN - SCOPUS:85184055357
SN - 1000-9361
VL - 37
SP - 237
EP - 257
JO - Chinese Journal of Aeronautics
JF - Chinese Journal of Aeronautics
IS - 3
ER -