Multi-Agent Reward-Iteration Fuzzy Q-Learning

Lixiong Leng, Jingchen Li, Jinhui Zhu, Kao Shing Hwang, Haobin Shi

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Fuzzy Q-learning extends Q-learning to continuous state spaces and has been applied to a wide range of applications, such as robot control. In a multi-agent system, however, the non-stationary environment makes it difficult for the joint policy to converge. To give agents more suitable rewards in a multi-agent environment, multi-agent reward-iteration fuzzy Q-learning (RIFQ) is proposed for multi-agent cooperative tasks. The state space is divided into three channels by the proposed fuzzy-logic state-divider. Each agent's reward is reshaped iteratively according to its state, an update sequence is constructed by computing the relations among the agents' states, and the value functions are then updated top-down. By replacing the reward given by the environment with the reshaped reward, agents avoid the most unreasonable punishments and receive rewards selectively. RIFQ provides a feasible reward relationship among agents, which makes multi-agent training more stable. Several simulation experiments show that RIFQ is not limited by the number of agents and converges faster than the baselines.
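The abstract describes RIFQ only at a high level, so the sketch below illustrates the two ingredients it names: fuzzy Q-learning over a continuous state, and reward reshaping applied before the value update, with agents updated in a computed order. This is a minimal, hypothetical reconstruction in Python/NumPy, not the authors' implementation: the triangular membership functions, the distance-based reshaping rule, the nearest-to-goal update order, and the names `FuzzyQLearner` and `reshape_rewards` are all illustrative assumptions.

```python
import numpy as np


class FuzzyQLearner:
    """Fuzzy Q-learning over a 1-D continuous state (illustrative sketch)."""

    def __init__(self, centers, n_actions, alpha=0.1, gamma=0.95, width=1.0):
        self.centers = np.asarray(centers, dtype=float)     # rule centers on the state axis
        self.width = width                                  # shared triangular half-width
        self.q = np.zeros((len(self.centers), n_actions))   # Q-table: rules x actions
        self.alpha, self.gamma = alpha, gamma

    def firing(self, s):
        # Triangular memberships, normalized so the firing strengths sum to 1.
        mu = np.maximum(0.0, 1.0 - np.abs(s - self.centers) / self.width)
        total = mu.sum()
        return mu / total if total > 0 else np.full(len(mu), 1.0 / len(mu))

    def q_values(self, s):
        # Continuous-state action values: firing-strength-weighted rule Q-values.
        return self.firing(s) @ self.q

    def act(self, s, eps=0.1, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        if rng.random() < eps:
            return int(rng.integers(self.q.shape[1]))
        return int(np.argmax(self.q_values(s)))

    def update(self, s, a, r, s_next):
        # TD update with the (possibly reshaped) reward r; the TD error is
        # spread over the active rules in proportion to their firing strengths.
        phi = self.firing(s)
        target = r + self.gamma * self.q_values(s_next).max()
        td = target - self.q_values(s)[a]
        self.q[:, a] += self.alpha * td * phi


def reshape_rewards(states, env_rewards):
    # Hypothetical stand-in for RIFQ's reward iteration: punishments are scaled
    # by each agent's distance to the goal (state 0 here), so agents are
    # shielded from the most unreasonable shared punishments; positive rewards
    # pass through unchanged.
    d = np.abs(np.asarray(states, dtype=float))
    w = d / d.sum() if d.sum() > 0 else np.full(len(d), 1.0 / len(d))
    return [r if r >= 0 else r * wi for r, wi in zip(env_rewards, w)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = np.linspace(-2.0, 2.0, 9)
    learners = [FuzzyQLearner(centers, n_actions=2) for _ in range(3)]
    states = [1.5, -0.5, 2.0]
    for _ in range(100):
        actions = [ql.act(s, rng=rng) for ql, s in zip(learners, states)]
        # Toy dynamics: action 0 moves toward the goal at 0, action 1 away.
        next_states = [s - 0.1 if a == 0 else s + 0.1
                       for s, a in zip(states, actions)]
        env_rewards = [-abs(s) for s in next_states]     # punish distance to goal
        reshaped = reshape_rewards(states, env_rewards)  # selective punishment
        # "Top-down" updates in a hypothetical order: nearest-to-goal agent first.
        for i in np.argsort([abs(s) for s in states]):
            learners[i].update(states[i], actions[i], reshaped[i], next_states[i])
        states = next_states
```

The paper's actual state-divider, three-channel decomposition, and reward-iteration rule are not specified in the abstract; the distance-based weighting above merely shows where a reshaped reward would replace the environment reward in the update loop.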

Original language: English
Pages (from-to): 1669-1679
Number of pages: 11
Journal: International Journal of Fuzzy Systems
Volume: 23
Issue number: 6
Publication status: Published - Sep. 2021
