TY - GEN
T1 - Dueling Network Architecture for Multi-Agent Deep Deterministic Policy Gradient
AU - Zhan, Mengying
AU - Chen, Jinchao
AU - Du, Chenglie
AU - Xu, Yongqiang
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/8/13
Y1 - 2021/8/13
N2 - Reinforcement learning has recently made remarkable achievements in natural science, engineering, medicine, and operations research. It addresses sequential decision problems and accounts for long-term returns, a perspective critical to finding optimal solutions to many problems. Existing multi-agent reinforcement learning methods typically update the state-action value function slowly, and the agents' rewards remain low. This paper presents a Dueling Multi-Agent Deep Deterministic Policy Gradient (Dueling MADDPG) method, based on MADDPG, which modifies the critic's network structure by adding two subnetworks behind the critic network of the traditional MADDPG method. This modification allows the critic network to update its parameters faster and the agents to receive higher rewards. Finally, to verify the validity of the network structure, the improved framework is compared with the traditional MADDPG, DQN, and DDPG methods in a simulation environment.
AB - Reinforcement learning has recently made remarkable achievements in natural science, engineering, medicine, and operations research. It addresses sequential decision problems and accounts for long-term returns, a perspective critical to finding optimal solutions to many problems. Existing multi-agent reinforcement learning methods typically update the state-action value function slowly, and the agents' rewards remain low. This paper presents a Dueling Multi-Agent Deep Deterministic Policy Gradient (Dueling MADDPG) method, based on MADDPG, which modifies the critic's network structure by adding two subnetworks behind the critic network of the traditional MADDPG method. This modification allows the critic network to update its parameters faster and the agents to receive higher rewards. Finally, to verify the validity of the network structure, the improved framework is compared with the traditional MADDPG, DQN, and DDPG methods in a simulation environment.
KW - Deep learning
KW - Dueling network
KW - multi-agent system
KW - neural networks
KW - Reinforcement learning
UR - https://www.scopus.com/pages/publications/85116720615
U2 - 10.1109/CCET52649.2021.9544385
DO - 10.1109/CCET52649.2021.9544385
M3 - Conference contribution
AN - SCOPUS:85116720615
T3 - 2021 IEEE 4th International Conference on Computer and Communication Engineering Technology, CCET 2021
SP - 163
EP - 168
BT - 2021 IEEE 4th International Conference on Computer and Communication Engineering Technology, CCET 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Computer and Communication Engineering Technology, CCET 2021
Y2 - 13 August 2021 through 15 August 2021
ER -