E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles

Yi Zhang; Yujie Cui; Geng Wang; Bo Li

doi:10.1109/ICCSI62669.2024.10799402

School of Electronics and Information

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

As the application of unmanned aerial vehicles (UAVs) in low-altitude airspace continues to broaden, higher requirements have been placed on their autonomous and intelligent manoeuvring and adaptive capabilities. To overcome this challenge, this paper proposes an end-to-end UAV flight decision-making method based on deep reinforcement learning, and provides a dynamic planning scheme for the mission of safely and stably avoiding the threat of environmental obstacles and tracking the target. The method is based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) framework and introduces the Gated Recurrent Unit. To further improve the exploration capability and sample efficiency of the algorithm, we integrate expert experience into reinforcement learning and thus propose the E-TD3 algorithm. We reconstructed the experience replay buffer and designed a mixed sample collection mechanism to dynamically adjust the proportion of demonstration data. Finally, we perform experimental validation on the AirSim platform.
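
The abstract does not spell out how the replay buffer mixes expert and agent data, so the following is only a minimal sketch of one way such a mechanism could look, not the authors' implementation. The class name `MixedReplayBuffer`, the linear annealing schedule, and all default values (`demo_ratio_start`, `demo_ratio_end`, `decay_steps`) are assumptions introduced for illustration.

```python
import random
from collections import deque


class MixedReplayBuffer:
    """Hypothetical two-pool buffer: expert demonstrations plus agent experience."""

    def __init__(self, capacity, demo_ratio_start=0.5, demo_ratio_end=0.0,
                 decay_steps=10_000, seed=None):
        self.demo = []                       # expert transitions, never evicted
        self.agent = deque(maxlen=capacity)  # agent transitions, oldest evicted first
        self.start = demo_ratio_start
        self.end = demo_ratio_end
        self.decay_steps = decay_steps
        self.rng = random.Random(seed)

    def add_demo(self, transition):
        self.demo.append(transition)

    def add(self, transition):
        self.agent.append(transition)

    def demo_ratio(self, step):
        """Share of each batch drawn from demonstrations (assumed linear decay)."""
        frac = min(max(step / self.decay_steps, 0.0), 1.0)
        return self.start + frac * (self.end - self.start)

    def sample(self, batch_size, step):
        """Draw a batch, with replacement, mixing both pools by the current ratio."""
        if not self.demo and not self.agent:
            raise ValueError("both pools are empty")
        n_demo = round(batch_size * self.demo_ratio(step))
        if not self.demo:
            n_demo = 0           # no demonstrations available
        elif not self.agent:
            n_demo = batch_size  # nothing collected by the agent yet
        batch = self.rng.choices(self.demo, k=n_demo) if n_demo else []
        n_agent = batch_size - n_demo
        if n_agent:
            batch += self.rng.choices(list(self.agent), k=n_agent)
        self.rng.shuffle(batch)
        return batch
```

Under these assumptions, early batches lean on demonstration data and later batches consist mostly of the agent's own transitions; a TD3 update would consume the returned batch as usual.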

Original language	English
Title of host publication	2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9798350376739
DOI	https://doi.org/10.1109/ICCSI62669.2024.10799402
Publication status	Published - 2024
Event	2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024 - Doha, Qatar. Duration: 8 Nov 2024 → 12 Nov 2024

Publication series

Name	2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024

Conference

Conference	2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024
Country/Territory	Qatar
City	Doha
Period	8/11/24 → 12/11/24

Cite this

Zhang, Y., Cui, Y., Wang, G., & Li, B. (2024). E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles. In 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024 (2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCSI62669.2024.10799402

Zhang, Yi ; Cui, Yujie ; Wang, Geng et al. / E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles. 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024. Institute of Electrical and Electronics Engineers Inc., 2024. (2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024).

@inproceedings{ad742e13ec224046948c5f8a268b2487,

title = "E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles",

abstract = "As the application of unmanned aerial vehicles (UAVs) in low-altitude airspace continues to broaden, higher requirements have been placed on their autonomous and intelligent manoeuvring and adaptive capabilities. To overcome this challenge, this paper proposes an end-to-end UAV flight decision-making method based on deep reinforcement learning, and provides a dynamic planning scheme for the mission of safely and stably avoiding the threat of environmental obstacles and tracking the target. The method is based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) framework and introduces the Gated Recurrent Unit. To further improve the exploration capability and sample efficiency of the algorithm, we integrate expert experience into reinforcement learning and thus propose the E-TD3 algorithm. We reconstructed the experience replay buffer and designed a mixed sample collection mechanism to dynamically adjust the proportion of demonstration data. Finally, we perform experimental validation on the AirSim platform.",

keywords = "deep reinforcement learning, expert experience, Gated Recurrent Unit, TD3 algorithm, UAV flight decision making",

author = "Yi Zhang and Yujie Cui and Geng Wang and Bo Li",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.; 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024 ; Conference date: 08-11-2024 Through 12-11-2024",

year = "2024",

doi = "10.1109/ICCSI62669.2024.10799402",

language = "English",

series = "2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024",

}

Zhang, Y, Cui, Y, Wang, G & Li, B 2024, E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles. in 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024. 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024, Institute of Electrical and Electronics Engineers Inc., 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024, Doha, Qatar, 8/11/24. https://doi.org/10.1109/ICCSI62669.2024.10799402

E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles. / Zhang, Yi; Cui, Yujie; Wang, Geng et al.
2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024. Institute of Electrical and Electronics Engineers Inc., 2024. (2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024).

TY - GEN

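T1 - E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles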

T2 - 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024

AU - Zhang, Yi

AU - Cui, Yujie

AU - Wang, Geng

AU - Li, Bo

PY - 2024

Y1 - 2024

AB - As the application of unmanned aerial vehicles (UAVs) in low-altitude airspace continues to broaden, higher requirements have been placed on their autonomous and intelligent manoeuvring and adaptive capabilities. To overcome this challenge, this paper proposes an end-to-end UAV flight decision-making method based on deep reinforcement learning, and provides a dynamic planning scheme for the mission of safely and stably avoiding the threat of environmental obstacles and tracking the target. The method is based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) framework and introduces the Gated Recurrent Unit. To further improve the exploration capability and sample efficiency of the algorithm, we integrate expert experience into reinforcement learning and thus propose the E-TD3 algorithm. We reconstructed the experience replay buffer and designed a mixed sample collection mechanism to dynamically adjust the proportion of demonstration data. Finally, we perform experimental validation on the AirSim platform.

KW - deep reinforcement learning

KW - expert experience

KW - Gated Recurrent Unit

KW - TD3 algorithm

KW - UAV flight decision making

UR - http://www.scopus.com/inward/record.url?scp=85216542121&partnerID=8YFLogxK

U2 - 10.1109/ICCSI62669.2024.10799402

DO - 10.1109/ICCSI62669.2024.10799402

M3 - Conference contribution

AN - SCOPUS:85216542121

T3 - 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024

BT - 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 8 November 2024 through 12 November 2024

ER -

Zhang Y, Cui Y, Wang G, Li B. E-TD3: A Deep Reinforcement Learning-based Autonomous Flight Decision-Making Method for Unmanned Aerial Vehicles. In 2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024. Institute of Electrical and Electronics Engineers Inc. 2024. (2024 International Conference on Cyber-Physical Social Intelligence, ICCSI 2024). doi: 10.1109/ICCSI62669.2024.10799402
