TY - JOUR
T1 - Expert System-Based Multiagent Deep Deterministic Policy Gradient for Swarm Robot Decision Making
AU - Wang, Zhen
AU - Jin, Xiaoyue
AU - Zhang, Tao
AU - Li, Jiahao
AU - Yu, Dengxiu
AU - Cheong, Kang Hao
AU - Chen, C. L. Philip
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2024/3/1
Y1 - 2024/3/1
N2 - In this article, an expert system-based multiagent deep deterministic policy gradient (ESB-MADDPG) is proposed to realize decision making for swarm robots. Multiagent deep deterministic policy gradient (MADDPG) is a multiagent reinforcement learning algorithm that utilizes a centralized critic within the actor-critic learning framework, which can reduce policy gradient variance. However, traditional MADDPG is difficult to apply to swarm robots directly because its path planning is time-consuming, making a faster method for gathering trajectories necessary. Besides, the trajectories obtained by MADDPG are composed of straight-line segments, which are not smooth and are therefore difficult for the swarm robots to track. This article aims to solve these problems by closing the above gaps. First, the ESB-MADDPG method is proposed to improve the training speed, and smooth processing of the trajectory is designed within ESB-MADDPG. Furthermore, the expert system also provides many trained offline trajectories, which avoids retraining each time the swarm robots are used. Given the gathered trajectories, the model predictive control (MPC) algorithm is introduced to realize optimal tracking of the offline trajectories. Simulation results show that combining ESB-MADDPG and MPC can realize swarm robot decision making efficiently.
AB - In this article, an expert system-based multiagent deep deterministic policy gradient (ESB-MADDPG) is proposed to realize decision making for swarm robots. Multiagent deep deterministic policy gradient (MADDPG) is a multiagent reinforcement learning algorithm that utilizes a centralized critic within the actor-critic learning framework, which can reduce policy gradient variance. However, traditional MADDPG is difficult to apply to swarm robots directly because its path planning is time-consuming, making a faster method for gathering trajectories necessary. Besides, the trajectories obtained by MADDPG are composed of straight-line segments, which are not smooth and are therefore difficult for the swarm robots to track. This article aims to solve these problems by closing the above gaps. First, the ESB-MADDPG method is proposed to improve the training speed, and smooth processing of the trajectory is designed within ESB-MADDPG. Furthermore, the expert system also provides many trained offline trajectories, which avoids retraining each time the swarm robots are used. Given the gathered trajectories, the model predictive control (MPC) algorithm is introduced to realize optimal tracking of the offline trajectories. Simulation results show that combining ESB-MADDPG and MPC can realize swarm robot decision making efficiently.
KW - Model predictive control (MPC)
KW - multiagent deep deterministic policy gradient (MADDPG)
KW - swarm robot decision making
UR - http://www.scopus.com/inward/record.url?scp=85146249164&partnerID=8YFLogxK
U2 - 10.1109/TCYB.2022.3228578
DO - 10.1109/TCYB.2022.3228578
M3 - Article
C2 - 37015659
AN - SCOPUS:85146249164
SN - 2168-2267
VL - 54
SP - 1614
EP - 1624
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 3
ER -