Multi-Agent Reward-Iteration Fuzzy Q-Learning

Lixiong Leng, Jingchen Li, Jinhui Zhu, Kao Shing Hwang, Haobin Shi

Research output: Contribution to journal › Article › peer-review


Abstract

Fuzzy Q-learning extends Q-learning to continuous state spaces and has been applied to a wide range of tasks such as robot control. In a multi-agent system, however, the non-stationary environment makes it difficult for the joint policy to converge. To give agents more suitable rewards in a multi-agent environment, multi-agent reward-iteration fuzzy Q-learning (RIFQ) is proposed for multi-agent cooperative tasks. The state space is divided into three channels by the proposed state-divider, which uses fuzzy logic. The reward of each agent is reshaped iteratively according to its state, and an update sequence is constructed by computing the relations among the states of different agents; the value functions are then updated top-down. By replacing the reward given by the environment with the reshaped reward, agents avoid the most unreasonable punishments and receive rewards selectively. RIFQ provides a feasible reward relationship among agents, which makes multi-agent training more stable. Several simulation experiments show that RIFQ is not limited by the number of agents and converges faster than the baselines.
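The fuzzy Q-learning that RIFQ builds on can be sketched in a few lines: the continuous state is fuzzified by membership functions, Q(s, a) is the membership-weighted sum of per-rule action values, and the TD update is distributed over the rules by their firing strengths. The sketch below is a minimal single-agent illustration under stated assumptions; the triangular membership functions, all parameters, and the `reshaped_r` argument (standing in for RIFQ's iteratively reshaped reward) are illustrative choices, not the paper's specification of the state-divider or the reward-iteration procedure.

```python
import numpy as np

def triangular_memberships(s, centers, width=1.0):
    """Normalized degree of membership of state s in each triangular fuzzy set."""
    mu = np.maximum(0.0, 1.0 - np.abs(s - centers) / width)
    total = mu.sum()
    return mu / total if total > 0 else mu

class FuzzyQ:
    """Minimal fuzzy Q-learning on a 1-D continuous state (illustrative sketch)."""

    def __init__(self, centers, n_actions, alpha=0.1, gamma=0.9):
        self.centers = np.asarray(centers, dtype=float)
        self.q = np.zeros((len(centers), n_actions))  # per-rule action values
        self.alpha, self.gamma = alpha, gamma

    def value(self, s, a):
        # Q(s, a) = sum over rules of membership(s) * rule value for action a
        mu = triangular_memberships(s, self.centers)
        return float(mu @ self.q[:, a])

    def update(self, s, a, reshaped_r, s_next):
        # TD update distributed over rules by membership degree.
        # `reshaped_r` is a placeholder for a reward already reshaped
        # (e.g., by RIFQ's iteration) rather than the raw environment reward.
        mu = triangular_memberships(s, self.centers)
        mu_next = triangular_memberships(s_next, self.centers)
        target = reshaped_r + self.gamma * np.max(mu_next @ self.q)
        td = target - self.value(s, a)
        self.q[:, a] += self.alpha * td * mu

agent = FuzzyQ(centers=[0.0, 1.0, 2.0], n_actions=2)
agent.update(s=0.5, a=1, reshaped_r=1.0, s_next=1.5)
```

After one update from a zero-initialized table, only the rules whose fuzzy sets fire at s = 0.5 (the first two) are adjusted, so the learned value generalizes smoothly over the continuous state rather than over discrete cells.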

Original language: English
Pages (from-to): 1669-1679
Number of pages: 11
Journal: International Journal of Fuzzy Systems
Volume: 23
Issue number: 6
DOIs
State: Published - Sep 2021

Keywords

  • Fuzzy Q-learning
  • Multi-agent reinforcement learning
  • Multi-agent system
  • Reward shaping

