TY - JOUR
T1 - GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control
AU - Huang, Jingyi
AU - Cui, Yujie
AU - Xi, Guipeng
AU - Bai, Shuangxia
AU - Li, Bo
AU - Wang, Geng
AU - Neretin, Evgeny
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/4
Y1 - 2025/4
N2 - Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations, as previous studies primarily utilized current perceptual inputs while neglecting the continuity of flight processes, resulting in low early-stage learning efficiency. To address these issues, this paper integrates DRL with the Transformer architecture to propose the GTrXL-SAC (gated Transformer-XL soft actor critic) algorithm. The algorithm performs positional embedding on multimodal data combining visual and sensor information. Leveraging the self-attention mechanism of GTrXL, it effectively focuses on different segments of multimodal data for encoding while capturing sequential relationships, significantly improving obstacle recognition accuracy and enhancing both learning efficiency and sample efficiency. Additionally, the algorithm capitalizes on GTrXL’s memory characteristics to generate current drone control decisions through the combined analysis of historical experiences and present states, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that compared to PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, enabling superior control of drone velocity and attitude for stabilized flight while accelerating convergence speed by nearly 20%.
KW - multimodal data
KW - SAC
KW - self-attention mechanism
KW - Transformer
KW - UAV control decision-making
UR - http://www.scopus.com/inward/record.url?scp=105003555723&partnerID=8YFLogxK
U2 - 10.3390/drones9040275
DO - 10.3390/drones9040275
M3 - Article
AN - SCOPUS:105003555723
SN - 2504-446X
VL - 9
JO - Drones
JF - Drones
IS - 4
M1 - 275
ER -