UAV Maneuvering Decision-Making Algorithm Based on Twin Delayed Deep Deterministic Policy Gradient Algorithm

Bai Shuangxia; Song Shaomei; Liang Shiyang; Wang Jianmei; Li Bo; Neretin Evgeny

doi:10.37965/jait.2021.12003

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

To enable intelligent decision-making by unmanned aerial vehicles (UAVs) based on situation information in air combat, this paper proposes a maneuvering decision method based on deep reinforcement learning. The UAV's autonomous maneuvering model is formulated as a Markov decision process. The model is trained with both the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm, and the experimental results of the two are analyzed and compared. The simulation results show that, compared with DDPG, TD3 delivers stronger decision-making performance and faster convergence and is better suited to combat problems. The proposed algorithm enables UAVs to autonomously make maneuvering decisions based on situation information such as position, speed, and relative azimuth, and to adjust their actions to approach and successfully strike the enemy, providing a new method for intelligent UAV maneuvering decisions during air combat.
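
For readers unfamiliar with how the two algorithms named in the abstract differ, the following is a minimal sketch of the standard TD3 modifications to DDPG: clipped double-Q targets, target policy smoothing, and delayed actor updates. It is not taken from the paper. The function names are hypothetical, and the numeric defaults (noise standard deviation 0.2, noise clip 0.5, policy delay 2, discount 0.99) are commonly used values, not the settings reported by the authors.

```python
import random


def ddpg_target(reward, done, q_next, gamma=0.99):
    """DDPG bootstrap target: relies on a single target critic's estimate."""
    return reward + gamma * (1.0 - done) * q_next


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3 bootstrap target: uses the smaller of two target-critic estimates
    (clipped double-Q), which counteracts value overestimation."""
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           low=-1.0, high=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise to the target
    action, then clip the result to the action bounds (assumed [-1, 1])."""
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    return max(low, min(high, action + noise))


def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: the actor and target networks are refreshed
    only once every `policy_delay` critic updates."""
    return step % policy_delay == 0


if __name__ == "__main__":
    # Illustrative numbers only: two critics disagree about the next state.
    print(ddpg_target(1.0, 0.0, 10.0, gamma=0.5))
    print(td3_target(1.0, 0.0, 10.0, 8.0, gamma=0.5))
```

With the same transition, the TD3 target is never larger than a DDPG target built from either critic alone, which is the mechanism usually credited for TD3's more stable training.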

Original language	English
Pages (from-to)	16-22
Number of pages	7
Journal	Journal of Artificial Intelligence and Technology
Volume	2
Issue number	1
DOIs	https://doi.org/10.37965/jait.2021.12003
State	Published - 25 Jan 2022

Keywords

DDPG
TD3
air combat
maneuvering decision-making

Cite this

@article{f7fb680787704c99b6dda4c5d221bf16,

title = "UAV Maneuvering Decision-Making Algorithm Based on Twin Delayed Deep Deterministic Policy Gradient Algorithm",

abstract = "Aiming at intelligent decision-making of unmanned aerial vehicle (UAV) based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of UAV is established by Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm in deep reinforcement learning are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation experiment results show that compared with the DDPG algorithm, the TD3 algorithm has stronger decision-making performance and faster convergence speed and is more suitable for solving combat problems. The algorithm proposed in this paper enables UAVs to autonomously make maneuvering decisions based on situation information such as position, speed, and relative azimuth, adjust their actions to approach, and successfully strike the enemy, providing a new method for UAVs to make intelligent maneuvering decisions during air combat.",

keywords = "DDPG, TD3, air combat, maneuvering decision-making",

author = "Bai, Shuangxia and Song, Shaomei and Liang, Shiyang and Wang, Jianmei and Li, Bo and Neretin, Evgeny",

note = "Publisher Copyright: {\textcopyright} The Author(s) 2022.",

year = "2022",

month = jan,

day = "25",

doi = "10.37965/jait.2021.12003",

language = "English",

volume = "2",

pages = "16--22",

journal = "Journal of Artificial Intelligence and Technology",

issn = "2766-8649",

publisher = "Intelligence Science and Technology Press Inc.",

number = "1",

}

TY - JOUR

T1 - UAV Maneuvering Decision-Making Algorithm Based on Twin Delayed Deep Deterministic Policy Gradient Algorithm

AU - Bai, Shuangxia

AU - Song, Shaomei

AU - Liang, Shiyang

AU - Wang, Jianmei

AU - Li, Bo

AU - Neretin, Evgeny

N1 - Publisher Copyright: © The Author(s) 2022.

PY - 2022/1/25

Y1 - 2022/1/25

AB - Aiming at intelligent decision-making of unmanned aerial vehicle (UAV) based on situation information in air combat, a novel maneuvering decision method based on deep reinforcement learning is proposed in this paper. The autonomous maneuvering model of UAV is established by Markov Decision Process. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and the Deep Deterministic Policy Gradient (DDPG) algorithm in deep reinforcement learning are used to train the model, and the experimental results of the two algorithms are analyzed and compared. The simulation experiment results show that compared with the DDPG algorithm, the TD3 algorithm has stronger decision-making performance and faster convergence speed and is more suitable for solving combat problems. The algorithm proposed in this paper enables UAVs to autonomously make maneuvering decisions based on situation information such as position, speed, and relative azimuth, adjust their actions to approach, and successfully strike the enemy, providing a new method for UAVs to make intelligent maneuvering decisions during air combat.

KW - DDPG

KW - TD3

KW - air combat

KW - maneuvering decision-making

UR - http://www.scopus.com/inward/record.url?scp=85129950714&partnerID=8YFLogxK

U2 - 10.37965/jait.2021.12003

DO - 10.37965/jait.2021.12003

M3 - Article

AN - SCOPUS:85129950714

SN - 2766-8649

VL - 2

SP - 16

EP - 22

JO - Journal of Artificial Intelligence and Technology

JF - Journal of Artificial Intelligence and Technology

IS - 1

ER -
