An improved approach towards multi-agent pursuit–evasion game decision-making using deep reinforcement learning

Kaifang Wan; Dingwei Wu; Yiwei Zhai; Bo Li; Xiaoguang Gao; Zijian Hu

doi:10.3390/e23111433

School of Electronics and Information

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › Peer-reviewed

44 citations (Scopus)

Abstract

A pursuit–evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit–evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.

Original language	English
Article number	1433
Journal	Entropy
Volume	23
Issue	11
DOI	https://doi.org/10.3390/e23111433
Publication status	Published - Nov 2021

Cite this

@article{e93f098d4b4f4a4fb5571b3e32e0b48d,

title = "An improved approach towards multi-agent pursuit–evasion game decision-making using deep reinforcement learning",

abstract = "A pursuit–evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit–evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.",

keywords = "Adversarial learning, Decision-making, Deep reinforcement learning, MADDPG, Multi-agent, Pursuit–evasion",

author = "Kaifang Wan and Dingwei Wu and Yiwei Zhai and Bo Li and Xiaoguang Gao and Zijian Hu",

note = "Publisher Copyright: {\textcopyright} 2021 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2021",

month = nov,

doi = "10.3390/e23111433",

language = "English",

volume = "23",

journal = "Entropy",

issn = "1099-4300",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "11",

}

TY - JOUR

T1 - An improved approach towards multi-agent pursuit–evasion game decision-making using deep reinforcement learning

AU - Wan, Kaifang

AU - Wu, Dingwei

AU - Zhai, Yiwei

AU - Li, Bo

AU - Gao, Xiaoguang

AU - Hu, Zijian

PY - 2021/11

Y1 - 2021/11

N2 - A pursuit–evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit–evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.

AB - A pursuit–evasion game is a classical maneuver confrontation problem in the multi-agent systems (MASs) domain. An online decision technique based on deep reinforcement learning (DRL) was developed in this paper to address the problem of environment sensing and decision-making in pursuit–evasion games. A control-oriented framework developed from the DRL-based multi-agent deep deterministic policy gradient (MADDPG) algorithm was built to implement multi-agent cooperative decision-making to overcome the limitation of the tedious state variables required for the traditionally complicated modeling process. To address the effects of errors between a model and a real scenario, this paper introduces adversarial disturbances. It also proposes a novel adversarial attack trick and adversarial learning MADDPG (A2-MADDPG) algorithm. By introducing an adversarial attack trick for the agents themselves, uncertainties of the real world are modeled, thereby optimizing robust training. During the training process, adversarial learning was incorporated into our algorithm to preprocess the actions of multiple agents, which enabled them to properly respond to uncertain dynamic changes in MASs. Experimental results verified that the proposed approach provides superior performance and effectiveness for pursuers and evaders, and both can learn the corresponding confrontational strategy during training.

KW - Adversarial learning

KW - Decision-making

KW - Deep reinforcement learning

KW - MADDPG

KW - Multi-agent

KW - Pursuit–evasion

UR - http://www.scopus.com/inward/record.url?scp=85118379229&partnerID=8YFLogxK

U2 - 10.3390/e23111433

DO - 10.3390/e23111433

M3 - Article

AN - SCOPUS:85118379229

SN - 1099-4300

VL - 23

JO - Entropy

JF - Entropy

IS - 11

M1 - 1433

ER -
