Abstract
One important issue in reinforcement learning (RL) systems is the tradeoff between exploration and exploitation. In this paper, we study this dilemma and propose a new approach to solving it based on multiple-attribute decision making (MADM). The applicability of the proposed method is extended by transfer learning. The method decomposes a task into several subtasks and uses the subtask policies trained by RL. The proposed visual MADM method (V-MADM) relies on the state-action values of each subtask to select the action with the maximal value. Meanwhile, this paper proposes a transfer learning method that uses a decay function with a decreasing probability, so that the prior experiences of the subtasks can be utilized to accelerate learning. Finally, robot confrontation and Maze walker experiments are performed to evaluate the learning performance of the proposed method. The experimental results show that less training cost is needed to obtain more effective learning performance.
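A minimal sketch of the two ideas the abstract describes, under stated assumptions: per-subtask state-action values are combined in an MADM-style weighted aggregation before taking the argmax, and prior subtask policies are reused with a probability that shrinks via a decay function. The function names, the weighted-sum aggregation, and the exponential decay schedule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def madm_select_action(state, subtask_q_tables, weights):
    """Treat each subtask's Q-values as one decision attribute and
    combine them with a weighted sum, then pick the action whose
    aggregated value is maximal (assumed aggregation rule)."""
    n_actions = subtask_q_tables[0].shape[1]
    scores = np.zeros(n_actions)
    for w, q in zip(weights, subtask_q_tables):
        scores += w * q[state]            # weighted contribution of each subtask
    return int(np.argmax(scores))

def transfer_or_own_policy(state, step, subtask_q_tables, weights,
                           target_q, p0=1.0, decay=0.995, rng=None):
    """With a probability that decreases over training steps (decay
    function), reuse the pretrained subtask policies (transfer);
    otherwise act greedily on the target task's own Q-table."""
    rng = rng or np.random.default_rng()
    p_transfer = p0 * (decay ** step)     # decreasing reuse probability
    if rng.random() < p_transfer:
        return madm_select_action(state, subtask_q_tables, weights)
    return int(np.argmax(target_q[state]))
```

As a usage note, `subtask_q_tables` would be a list of `(n_states, n_actions)` arrays learned on the subtasks, and `target_q` the Q-table being learned on the full task; early in training most actions come from the transferred subtask knowledge, and the agent gradually shifts to its own policy as the decayed probability vanishes.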
Original language | English |
---|---|
Article number | 8745507 |
Pages (from-to) | 695-708 |
Number of pages | 14 |
Journal | IEEE Transactions on Cognitive and Developmental Systems |
Volume | 12 |
Issue number | 4 |
DOIs | |
State | Published - Dec 2020 |
Keywords
- Decay function
- multiple-attribute decision making (MADM)
- reinforcement learning (RL)
- transfer learning