GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control

Jingyi Huang, Yujie Cui, Guipeng Xi, Shuangxia Bai, Bo Li, Geng Wang, Evgeny Neretin

Research output: Contribution to journal › Article › peer-review

Abstract

Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations: previous studies relied primarily on current perceptual inputs and neglected the continuity of the flight process, which results in low learning efficiency in the early stages of training. To address these issues, this paper integrates DRL with the Transformer architecture and proposes the GTrXL-SAC (gated Transformer-XL soft actor-critic) algorithm. The algorithm applies positional embedding to multimodal data that combines visual and sensor information. Using the self-attention mechanism of GTrXL, it attends to different segments of the multimodal data during encoding while capturing their sequential relationships, which significantly improves obstacle recognition accuracy and enhances both learning efficiency and sample efficiency. The algorithm also exploits GTrXL’s memory to generate the current drone control decision from a combined analysis of historical experience and the present state, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that, compared with the PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, controls drone velocity and attitude better for stabilized flight, and converges nearly 20% faster.
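
The encoding step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the fusion by concatenation and linear projection, the sinusoidal absolute positions (Transformer-XL itself uses relative positional encodings), the single attention head, and every name and dimension below are assumptions made for illustration. Layer normalization, the feed-forward sublayer, the segment-level memory, and the SAC actor and critic heads are omitted. The gate follows the GRU-type gating published for GTrXL, in which a positive gate bias keeps the layer close to an identity map early in training.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sinusoidal_positions(seq_len, dim):
    # Absolute sinusoidal positions (a simplification; see lead-in).
    pos = np.arange(seq_len)[:, None]
    idx = np.arange(dim)[None, :]
    angle = pos / np.power(10000.0, (2 * (idx // 2)) / dim)
    return np.where(idx % 2 == 0, np.sin(angle), np.cos(angle))

def causal_self_attention(x, wq, wk, wv):
    # Single-head attention; each time step sees only itself and the past.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def gru_gate(x, y, p, bias=2.0):
    # GRU-type gate: x is the residual stream, y the attention output.
    r = sigmoid(y @ p["wr"] + x @ p["ur"])
    z = sigmoid(y @ p["wz"] + x @ p["uz"] - bias)
    h = np.tanh(y @ p["wg"] + (r * x) @ p["ug"])
    return (1.0 - z) * x + z * h

def init_params(in_dim, dim, rng, scale=0.1):
    # Hypothetical small random weights, for illustration only.
    def mat(rows, cols):
        return rng.normal(scale=scale, size=(rows, cols))
    gate = {name: mat(dim, dim) for name in ("wr", "ur", "wz", "uz", "wg", "ug")}
    return {"w_in": mat(in_dim, dim), "wq": mat(dim, dim),
            "wk": mat(dim, dim), "wv": mat(dim, dim), "gate": gate}

def encode(visual, sensor, params):
    # Fuse per-step visual and sensor features (assumed: concat + projection),
    # add positions, attend causally, then gate the result into the stream.
    tokens = np.concatenate([visual, sensor], axis=-1) @ params["w_in"]
    tokens = tokens + sinusoidal_positions(*tokens.shape)
    attended = causal_self_attention(
        tokens, params["wq"], params["wk"], params["wv"])
    return gru_gate(tokens, attended, params["gate"])
```

Calling `encode` with a `(T, 8)` array of visual features and a `(T, 4)` array of sensor readings returns a `(T, dim)` array of gated encodings, one per time step; in a full agent, the encoding of the latest step would feed the policy and value networks.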

Original language: English
Article number: 275
Journal: Drones
Volume: 9
Issue number: 4
State: Published - Apr 2025

Keywords

  • multimodal data
  • SAC
  • self-attention mechanism
  • Transformer
  • UAV control decision-making
