Deep-reinforcement-learning-based UAV autonomous navigation and collision avoidance in unknown environments

Fei WANG, Xiaoping ZHU, Zhou ZHOU, Yang TANG

Research output: Contribution to journal › Article › peer-review

23 Scopus citations

Abstract

In some military application scenarios, Unmanned Aerial Vehicles (UAVs) must perform missions using on-board cameras when radar is unavailable and communication is interrupted, which makes autonomous navigation and collision avoidance challenging. This paper proposes an improved deep-reinforcement-learning algorithm, Deep Q-Network with a Faster R-CNN model and a Data Deposit Mechanism (FRDDM-DQN). A Faster R-CNN model (FR) is introduced and optimized to extract obstacle information from images, and a new replay memory Data Deposit Mechanism (DDM) is designed to train a better-performing agent. A two-part training approach reduces the time spent on training, as well as on retraining when the scenario changes. To verify the performance of the proposed method, a series of experiments, including training experiments, test experiments, and typical-episode experiments, is conducted in a 3D simulation environment. Experimental results show that the agent trained by the proposed FRDDM-DQN can navigate autonomously and avoid collisions, and that it outperforms FR-DQN, FR-DDQN, FR-Dueling DQN, the YOLO-based YDDM-DQN, and FR-ODQN, which is based on the original FR output.

Original language: English
Pages (from-to): 237-257
Number of pages: 21
Journal: Chinese Journal of Aeronautics
Volume: 37
Issue number: 3
State: Published - Mar 2024

Keywords

  • Faster R-CNN model
  • Image-based Autonomous Navigation and Collision Avoidance (ANCA)
  • Replay memory Data Deposit Mechanism (DDM)
  • Two-part training approach
  • Unmanned Aerial Vehicle (UAV)
