Multi-UAV Cooperative Task Decision-Making Based on MADDPG (基于MADDPG的多无人机协同任务决策)

Bo Li; Kai Qiang Yue; Zhi Gang Gan; Pei Xin Gao

doi:10.3873/j.issn.1000-1328.2021.06.009

School of Electronics and Information

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › Peer-reviewed

24 citations (Scopus)

Abstract

Traditional optimization algorithms struggle to produce the desired results in a short time when applied to multi-UAV (unmanned aerial vehicle) task decision-making. To address this problem, this paper proposes a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on deep reinforcement learning. The algorithm lets UAVs use global information during learning and only local information when making decisions in application. The model structure of the MADDPG algorithm is designed. Finally, simulation experiments and a comparison with the deep deterministic policy gradient (DDPG) algorithm verify that the proposed MADDPG algorithm greatly improves learning speed while maintaining accuracy, and that it makes up for the shortcomings of traditional reinforcement learning algorithms in multi-agent settings.
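The split the abstract describes, global information during learning and only local information at decision time, is commonly called centralized training with decentralized execution. The following is a minimal sketch of that input split only. The linear actors and critics, the dimensions (`N_AGENTS`, `OBS_DIM`, `ACT_DIM`), and all class and function names are hypothetical and chosen for illustration; this is not the paper's model structure, and it omits the replay buffer, target networks, and gradient updates.

```python
import math
import random

N_AGENTS = 3   # hypothetical number of UAVs
OBS_DIM = 4    # hypothetical size of each UAV's local observation
ACT_DIM = 2    # hypothetical size of each UAV's action

rng = random.Random(0)


def random_matrix(rows, cols):
    """Small random weights standing in for a trained network."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Actor:
    """Deterministic policy: maps ONE agent's local observation to its action."""

    def __init__(self):
        self.weights = random_matrix(ACT_DIM, OBS_DIM)

    def act(self, local_obs):
        # tanh keeps each action component within [-1, 1].
        return [math.tanh(v) for v in matvec(self.weights, local_obs)]


class CentralCritic:
    """Q-function used only during learning: sees all observations and actions."""

    def __init__(self):
        joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.weights = random_matrix(1, joint_dim)

    def q_value(self, all_obs, all_actions):
        # Concatenate every agent's observation, then every agent's action.
        joint = [x for obs in all_obs for x in obs]
        joint += [a for act in all_actions for a in act]
        return matvec(self.weights, joint)[0]


actors = [Actor() for _ in range(N_AGENTS)]
critics = [CentralCritic() for _ in range(N_AGENTS)]  # one critic per agent

all_obs = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

# Execution: each UAV decides from its own observation only.
all_actions = [actor.act(obs) for actor, obs in zip(actors, all_obs)]

# Learning: each critic scores the joint observation-action vector.
q_values = [critic.q_value(all_obs, all_actions) for critic in critics]
```

Each actor's input has `OBS_DIM` entries, while each critic's input has `N_AGENTS * (OBS_DIM + ACT_DIM)` entries. Because the critics are needed only to train the actors, they can be discarded once learning ends, leaving each UAV to act on local information alone.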

Translated title of the contribution	Multi-UAV Cooperative Autonomous Navigation Based on Multi-agent Deep Deterministic Policy Gradient
Original language	Chinese (Traditional)
Pages (from-to)	757-765
Number of pages	9
Journal	Yuhang Xuebao/Journal of Astronautics
Volume	42
Issue	6
DOI	https://doi.org/10.3873/j.issn.1000-1328.2021.06.009
Publication status	Published - 30 Jun 2021

Keywords

Deep reinforcement learning
Multi-agent
Policy gradient
Task decision-making
UAV

Cite this

@article{60cf1660821d4ed9a07b2fdb31b7f03c,
  title = "基于MADDPG的多无人机协同任务决策",
  abstract = "Traditional optimization algorithms struggle to produce the desired results in a short time when applied to multi-UAV (unmanned aerial vehicle) task decision-making. To address this problem, this paper proposes a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on deep reinforcement learning. The algorithm lets UAVs use global information during learning and only local information when making decisions in application. The model structure of the MADDPG algorithm is designed. Finally, simulation experiments and a comparison with the deep deterministic policy gradient (DDPG) algorithm verify that the proposed MADDPG algorithm greatly improves learning speed while maintaining accuracy, and that it makes up for the shortcomings of traditional reinforcement learning algorithms in multi-agent settings.",
  keywords = "Deep reinforcement learning, Multi-agent, Policy gradient, Task decision-making, UAV",
  author = "Li, Bo and Yue, {Kai Qiang} and Gan, {Zhi Gang} and Gao, {Pei Xin}",
  year = "2021",
  month = jun,
  day = "30",
  doi = "10.3873/j.issn.1000-1328.2021.06.009",
  language = "Chinese (Traditional)",
  volume = "42",
  pages = "757--765",
  journal = "Yuhang Xuebao/Journal of Astronautics",
  issn = "1000-1328",
  publisher = "Chinese Society of Astronautics",
  number = "6",
}

TY  - JOUR
T1  - 基于MADDPG的多无人机协同任务决策
AU  - Li, Bo
AU  - Yue, Kai Qiang
AU  - Gan, Zhi Gang
AU  - Gao, Pei Xin
PY  - 2021/6/30
Y1  - 2021/6/30
AB  - Traditional optimization algorithms struggle to produce the desired results in a short time when applied to multi-UAV (unmanned aerial vehicle) task decision-making. To address this problem, this paper proposes a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on deep reinforcement learning. The algorithm lets UAVs use global information during learning and only local information when making decisions in application. The model structure of the MADDPG algorithm is designed. Finally, simulation experiments and a comparison with the deep deterministic policy gradient (DDPG) algorithm verify that the proposed MADDPG algorithm greatly improves learning speed while maintaining accuracy, and that it makes up for the shortcomings of traditional reinforcement learning algorithms in multi-agent settings.
KW  - Deep reinforcement learning
KW  - Multi-agent
KW  - Policy gradient
KW  - Task decision-making
KW  - UAV
UR  - http://www.scopus.com/inward/record.url?scp=85112058682&partnerID=8YFLogxK
U2  - 10.3873/j.issn.1000-1328.2021.06.009
DO  - 10.3873/j.issn.1000-1328.2021.06.009
M3  - Article
AN  - SCOPUS:85112058682
SN  - 1000-1328
VL  - 42
SP  - 757
EP  - 765
JO  - Yuhang Xuebao/Journal of Astronautics
JF  - Yuhang Xuebao/Journal of Astronautics
IS  - 6
ER  -