Abstract
Many works provide intrinsic rewards to deal with sparse rewards in reinforcement learning. Due to the non-stationarity of multi-agent systems, however, existing methods cannot be applied directly to multi-agent reinforcement learning. In this paper, a fuzzy curiosity-driven mechanism is proposed for multi-agent reinforcement learning, by which agents can explore more efficiently in scenarios with sparse extrinsic rewards. First, we improve the variational auto-encoder to predict the next state from the agents' joint state and joint action. Then, several fuzzy partitions are built over the next joint state in order to assign the prediction error to the individual agents. With the proposed method, each agent in the multi-agent environment receives its own intrinsic reward. We elaborate on the proposed method in partially observable and fully observable environments separately. Experimental results show that with the proposed fuzzy curiosity-driven mechanism agents learn joint policies more efficiently, and that it also helps agents find better policies during training.
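The sketch below illustrates the general idea described in the abstract, not the authors' implementation: a plain MLP forward model stands in for the paper's improved variational auto-encoder, and Gaussian fuzzy memberships over the next joint state split the prediction error into per-agent intrinsic rewards. All names, shapes, and the membership design are illustrative assumptions.

```python
# Minimal sketch (assumptions throughout): an MLP forward model replaces the
# paper's improved VAE, and Gaussian fuzzy memberships assign the prediction
# error to agents as individual intrinsic rewards.
import torch
import torch.nn as nn

class ForwardModel(nn.Module):
    """Predicts the next joint state from the joint state and joint action."""
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, joint_state, joint_action):
        return self.net(torch.cat([joint_state, joint_action], dim=-1))

def fuzzy_intrinsic_rewards(pred_next, true_next, centers, sigma=1.0):
    """Split the squared prediction error among agents via fuzzy memberships.

    centers: (n_agents, state_dim) tensor, one fuzzy partition centre per agent
             (a hypothetical choice; the paper builds its partitions from the
             next joint state).
    Returns a (batch, n_agents) tensor of per-agent intrinsic rewards.
    """
    error = (pred_next - true_next).pow(2).sum(dim=-1, keepdim=True)        # (batch, 1)
    # Membership of the next joint state in each agent's fuzzy partition.
    dist = (true_next.unsqueeze(1) - centers.unsqueeze(0)).pow(2).sum(-1)   # (batch, n_agents)
    membership = torch.exp(-dist / (2 * sigma ** 2))
    membership = membership / membership.sum(dim=-1, keepdim=True)          # normalise to 1
    return membership * error                                               # (batch, n_agents)

# Usage sketch: 3 agents, 6-dim joint state, 3-dim joint action.
model = ForwardModel(state_dim=6, action_dim=3)
s, a, s_next = torch.randn(4, 6), torch.randn(4, 3), torch.randn(4, 6)
centers = torch.randn(3, 6)
r_int = fuzzy_intrinsic_rewards(model(s, a), s_next, centers)  # per-agent intrinsic rewards
```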
| Original language | English |
| --- | --- |
| Pages (from-to) | 1222-1233 |
| Number of pages | 12 |
| Journal | International Journal of Fuzzy Systems |
| Volume | 23 |
| Issue number | 5 |
| DOIs | |
| State | Published - Jul 2021 |
| Externally published | Yes |
Keywords
- Curiosity-driven
- Multi-agent system
- Reinforcement learning