Twin Delayed Multi-Agent Deep Deterministic Policy Gradient

Mengying Zhan, Jinchao Chen, Chenglie Du, Yuxin Duan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

7 Scopus citations

Abstract

Reinforcement learning has recently achieved remarkable results in natural science, engineering, medicine and operations research. It addresses sequential decision problems and considers long-term returns, and this long-term view is critical to finding the optimal solution to many problems. Existing multi-agent reinforcement learning algorithms tend to overestimate the Q value. Overestimation in multi-agent reinforcement learning has received little study, even though it reduces learning efficiency. Building on the traditional multi-agent reinforcement learning algorithm, this paper improves the actor network and critic network, reduces the overestimation of the Q value, and adopts a delayed update method to make actor training more stable. To test the effectiveness of the algorithm structure, the modified method is compared with the traditional MADDPG, DDPG and DQN methods in a simulation environment.
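
The abstract does not state the update rules, so the following is only a minimal sketch of the two ideas it names, assuming they follow the usual "twin delayed" recipe suggested by the title: the target uses the smaller of two critic estimates to curb overestimation, and the actor is updated less often than the critics. The function names, the discount factor `gamma=0.95`, and the delay `policy_delay=2` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.95):
    """TD target built from the smaller of two target-critic estimates.

    Assumption: the paper's overestimation fix resembles clipped double
    Q-learning; the abstract does not give the exact formula.
    """
    min_q_next = np.minimum(q1_next, q2_next)  # pessimistic estimate
    return reward + gamma * (1.0 - done) * min_q_next


def should_update_actor(step, policy_delay=2):
    """Delayed update: the actor is trained only every `policy_delay` steps."""
    return step % policy_delay == 0


# Toy batch of two transitions; the second one is terminal.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q1_next = np.array([2.0, 5.0])  # hypothetical output of target critic 1
q2_next = np.array([3.0, 4.0])  # hypothetical output of target critic 2

targets = clipped_double_q_target(reward, done, q1_next, q2_next)
actor_steps = [t for t in range(1, 7) if should_update_actor(t)]
```

In this toy batch the first target uses the lower estimate (2.0 rather than 3.0), and the terminal transition contributes only its reward. In a multi-agent setting each agent's critics would typically take the joint observations and actions as input, which this sketch leaves out.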

Original language: English
Title of host publication: Proceedings of the 2021 IEEE International Conference on Progress in Informatics and Computing, PIC 2021
Editors: Yinglin Wang, Zheying Zhang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 48-52
Number of pages: 5
ISBN (Electronic): 9781665426558
DOIs
State: Published - 2021
Event: 8th IEEE International Conference on Progress in Informatics and Computing, PIC 2021 - Virtual, Online, China
Duration: 17 Dec 2021 - 19 Dec 2021

Publication series

Name: Proceedings of the 2021 IEEE International Conference on Progress in Informatics and Computing, PIC 2021

Conference

Conference: 8th IEEE International Conference on Progress in Informatics and Computing, PIC 2021
Country/Territory: China
City: Virtual, Online
Period: 17/12/21 - 19/12/21

Keywords

  • Deep learning
  • multi-agent system
  • neural networks
  • overestimation
  • Reinforcement learning
