Abstract
One important issue in reinforcement learning (RL) systems is the tradeoff between exploration and exploitation. In this paper, we study this dilemma and propose a new approach to solving it based on multiple-attribute decision making (MADM). The applicability of the proposed method is extended by transfer learning. The method decomposes a task into several subtasks and uses the subtask policies trained by RL. The proposed visual MADM method (V-MADM) relies on the state-action values of each subtask to select the action with the maximal value. Meanwhile, this paper proposes a transfer learning method that uses a decay function with a decreasing probability, so that the prior experiences of the subtasks can be utilized to accelerate learning. Finally, robot confrontation and Maze walker experiments are performed to evaluate the learning performance of the proposed method. The experimental results show that less training cost is needed to obtain more effective learning performance.
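A minimal sketch of the two ideas the abstract describes, under stated assumptions: per-subtask state-action values are combined in an MADM-style weighted aggregation before taking the argmax, and prior subtask policies are reused with a probability that shrinks via a decay function. The function names, the weighted-sum aggregation, and the exponential decay schedule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def madm_select_action(state, subtask_q_tables, weights):
    """Treat each subtask's Q-values as one decision attribute and
    combine them with a weighted sum, then pick the action whose
    aggregated value is maximal (assumed aggregation rule)."""
    n_actions = subtask_q_tables[0].shape[1]
    scores = np.zeros(n_actions)
    for w, q in zip(weights, subtask_q_tables):
        scores += w * q[state]            # weighted contribution of each subtask
    return int(np.argmax(scores))

def transfer_or_own_policy(state, step, subtask_q_tables, weights,
                           target_q, p0=1.0, decay=0.995, rng=None):
    """With a probability that decreases over training steps (decay
    function), reuse the pretrained subtask policies (transfer);
    otherwise act greedily on the target task's own Q-table."""
    rng = rng or np.random.default_rng()
    p_transfer = p0 * (decay ** step)     # decreasing reuse probability
    if rng.random() < p_transfer:
        return madm_select_action(state, subtask_q_tables, weights)
    return int(np.argmax(target_q[state]))
```

As a usage note, `subtask_q_tables` would be a list of `(n_states, n_actions)` arrays learned on the subtasks, and `target_q` the Q-table being learned on the full task; early in training most actions come from the transferred subtask knowledge, and the agent gradually shifts to its own policy as the decayed probability vanishes.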
Original language | English |
---|---|
Article number | 8745507 |
Pages (from-to) | 695-708 |
Number of pages | 14 |
Journal | IEEE Transactions on Cognitive and Developmental Systems |
Volume | 12 |
Issue number | 4 |
DOIs | |
State | Published - Dec 2020 |
Keywords
- Decay function
- multiple-attribute decision making (MADM)
- reinforcement learning (RL)
- transfer learning