GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control

Jingyi Huang; Yujie Cui; Guipeng Xi; Shuangxia Bai; Bo Li; Geng Wang; Evgeny Neretin

doi:10.3390/drones9040275

GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control

Jingyi Huang, Yujie Cui, Guipeng Xi, Shuangxia Bai, Bo Li, Geng Wang, Evgeny Neretin

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

Abstract

Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations, as previous studies primarily utilized current perceptual inputs while neglecting the continuity of flight processes, resulting in low early-stage learning efficiency. To address these issues, this paper integrates DRL with the Transformer architecture to propose the GTrXL-SAC (gated Transformer-XL soft actor critic) algorithm. The algorithm performs positional embedding on multimodal data combining visual and sensor information. Leveraging the self-attention mechanism of GTrXL, it effectively focuses on different segments of multimodal data for encoding while capturing sequential relationships, significantly improving obstacle recognition accuracy and enhancing both learning efficiency and sample efficiency. Additionally, the algorithm capitalizes on GTrXL’s memory characteristics to generate current drone control decisions through the combined analysis of historical experiences and present states, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that compared to PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, enabling superior control of drone velocity and attitude for stabilized flight while accelerating convergence speed by nearly 20%.

Original language	English
Article number	275
Journal	Drones
Volume	9
Issue number	4
DOIs	https://doi.org/10.3390/drones9040275
State	Published - Apr 2025

Keywords

SAC
Transformer
UAV control decision-making
multimodal data
self-attention mechanism

Access to Document

10.3390/drones9040275

Cite this

@article{6895e0f921584faebd626d2558b2afbd,

title = "GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control",

abstract = "Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations, as previous studies primarily utilized current perceptual inputs while neglecting the continuity of flight processes, resulting in low early-stage learning efficiency. To address these issues, this paper integrates DRL with the Transformer architecture to propose the GTrXL-SAC (gated Transformer-XL soft actor critic) algorithm. The algorithm performs positional embedding on multimodal data combining visual and sensor information. Leveraging the self-attention mechanism of GTrXL, it effectively focuses on different segments of multimodal data for encoding while capturing sequential relationships, significantly improving obstacle recognition accuracy and enhancing both learning efficiency and sample efficiency. Additionally, the algorithm capitalizes on GTrXL{\textquoteright}s memory characteristics to generate current drone control decisions through the combined analysis of historical experiences and present states, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that compared to PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, enabling superior control of drone velocity and attitude for stabilized flight while accelerating convergence speed by nearly 20%.",

keywords = "SAC, Transformer, UAV control decision-making, multimodal data, self-attention mechanism",

author = "Jingyi Huang and Yujie Cui and Guipeng Xi and Shuangxia Bai and Bo Li and Geng Wang and Evgeny Neretin",

note = "Publisher Copyright: {\textcopyright} 2025 by the authors.",

year = "2025",

month = apr,

doi = "10.3390/drones9040275",

language = "英语",

volume = "9",

journal = "Drones",

issn = "2504-446X",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "4",

}

TY - JOUR

T1 - GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control

AU - Huang, Jingyi

AU - Cui, Yujie

AU - Xi, Guipeng

AU - Bai, Shuangxia

AU - Li, Bo

AU - Wang, Geng

AU - Neretin, Evgeny

PY - 2025/4

Y1 - 2025/4

N2 - Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations, as previous studies primarily utilized current perceptual inputs while neglecting the continuity of flight processes, resulting in low early-stage learning efficiency. To address these issues, this paper integrates DRL with the Transformer architecture to propose the GTrXL-SAC (gated Transformer-XL soft actor critic) algorithm. The algorithm performs positional embedding on multimodal data combining visual and sensor information. Leveraging the self-attention mechanism of GTrXL, it effectively focuses on different segments of multimodal data for encoding while capturing sequential relationships, significantly improving obstacle recognition accuracy and enhancing both learning efficiency and sample efficiency. Additionally, the algorithm capitalizes on GTrXL’s memory characteristics to generate current drone control decisions through the combined analysis of historical experiences and present states, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that compared to PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, enabling superior control of drone velocity and attitude for stabilized flight while accelerating convergence speed by nearly 20%.

AB - Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations, as previous studies primarily utilized current perceptual inputs while neglecting the continuity of flight processes, resulting in low early-stage learning efficiency. To address these issues, this paper integrates DRL with the Transformer architecture to propose the GTrXL-SAC (gated Transformer-XL soft actor critic) algorithm. The algorithm performs positional embedding on multimodal data combining visual and sensor information. Leveraging the self-attention mechanism of GTrXL, it effectively focuses on different segments of multimodal data for encoding while capturing sequential relationships, significantly improving obstacle recognition accuracy and enhancing both learning efficiency and sample efficiency. Additionally, the algorithm capitalizes on GTrXL’s memory characteristics to generate current drone control decisions through the combined analysis of historical experiences and present states, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that compared to PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, enabling superior control of drone velocity and attitude for stabilized flight while accelerating convergence speed by nearly 20%.

KW - SAC

KW - Transformer

KW - UAV control decision-making

KW - multimodal data

KW - self-attention mechanism

UR - http://www.scopus.com/inward/record.url?scp=105003555723&partnerID=8YFLogxK

U2 - 10.3390/drones9040275

DO - 10.3390/drones9040275

M3 - 文章

AN - SCOPUS:105003555723

SN - 2504-446X

VL - 9

JO - Drones

JF - Drones

IS - 4

M1 - 275

ER -

GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control

Abstract

Keywords

Access to Document

Other files and links

Fingerprint

Cite this