An improved reinforcement Q-learning method with BP neural networks in robot soccer

Shi Chao Wang; Zheng Xi Song; Hao Ding; Hao Bin Shi

doi:10.1109/ISCID.2011.53

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

The traditional reinforcement Q-learning method has two problems: the state information is difficult to partition, and extremely high-dimensional input is complex to handle. To solve these two problems, this paper proposes an improved reinforcement Q-learning method with a BP neural network. In this method, the large Q table is replaced by a BP neural network: continuous environmental information is the input, and the Q value is the output. The Q value and the weights of the network are adjusted by the action rewards. This paper presents an algorithm for a single agent's action selection. Simulation shows that the proposed method is more stable and applicable for the agent's strategy selection.
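The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes (a back-propagation network standing in for the Q table, with weights adjusted from the action rewards), not the authors' algorithm. The class name `BPQNetwork`, the single tanh hidden layer, the linear output layer with one Q value per action, the epsilon-greedy selection rule, and all hyperparameter values are assumptions made for illustration.

```python
import math
import random


class BPQNetwork:
    """Hypothetical one-hidden-layer network: continuous state in, one Q value per action out."""

    def __init__(self, n_inputs, n_hidden, n_actions, lr=0.05, gamma=0.9, seed=0):
        rng = random.Random(seed)
        self.lr = lr        # learning rate for the weight updates (assumed value)
        self.gamma = gamma  # discount factor of the Q-learning target (assumed value)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_actions)]

    def _forward(self, state):
        x = list(state) + [1.0]
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.w1]
        h = hidden + [1.0]
        q = [sum(w * v for w, v in zip(row, h)) for row in self.w2]
        return hidden, q

    def q_values(self, state):
        """Return the estimated Q value of every action in the given state."""
        return self._forward(state)[1]

    def select_action(self, state, epsilon, rng=random):
        """Epsilon-greedy choice: explore with probability epsilon, else take the best action."""
        if rng.random() < epsilon:
            return rng.randrange(len(self.w2))
        q = self.q_values(state)
        return max(range(len(q)), key=q.__getitem__)

    def update(self, state, action, reward, next_state, done):
        """One gradient step moving Q(state, action) toward the Q-learning target."""
        hidden, q = self._forward(state)
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(self.q_values(next_state))
        error = target - q[action]
        x = list(state) + [1.0]
        h = hidden + [1.0]
        # Hidden-layer deltas are computed before the output weights change.
        deltas = [error * self.w2[action][j] * (1.0 - hidden[j] ** 2)
                  for j in range(len(hidden))]
        for j, v in enumerate(h):
            self.w2[action][j] += self.lr * error * v
        for j, d in enumerate(deltas):
            for i, v in enumerate(x):
                self.w1[j][i] += self.lr * d * v
        return error
```

In use, an agent would call `select_action` on the current continuous state, apply the action in the simulator, and pass the observed reward and next state to `update`. Because the network generalizes across nearby states, no discretization of the state space into table cells is needed, which is the motivation the abstract states.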

Original language	English
Title of host publication	Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011
Pages	177-180
Number of pages	4
DOIs	https://doi.org/10.1109/ISCID.2011.53
State	Published - 2011
Event	2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011 - Hangzhou, China. Duration: 28 Oct 2011 → 30 Oct 2011

Publication series

Name	Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011
Volume	1

Conference

Conference	2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011
Country/Territory	China
City	Hangzhou
Period	28/10/11 → 30/10/11

Keywords

BP Neural Networks
Reinforcement Q-Learning
Robot Soccer

Access to Document

10.1109/ISCID.2011.53

Cite this

Wang, S. C., Song, Z. X., Ding, H., & Shi, H. B. (2011). An improved reinforcement Q-learning method with BP neural networks in robot soccer. In Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011 (pp. 177-180). Article 6079665 (Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011; Vol. 1). https://doi.org/10.1109/ISCID.2011.53

@inproceedings{5d150698b7ea44cc85fe5a0f9775593b,

title = "An improved reinforcement Q-learning method with BP neural networks in robot soccer",

abstract = "The traditional reinforcement Q-learning method has two problems: the state information is difficult to partition, and extremely high-dimensional input is complex to handle. To solve these two problems, this paper proposes an improved reinforcement Q-learning method with a BP neural network. In this method, the large Q table is replaced by a BP neural network: continuous environmental information is the input, and the Q value is the output. The Q value and the weights of the network are adjusted by the action rewards. This paper presents an algorithm for a single agent's action selection. Simulation shows that the proposed method is more stable and applicable for the agent's strategy selection.",

keywords = "BP Neural Networks, Reinforcement Q-Learning, Robot Soccer",

author = "Wang, {Shi Chao} and Song, {Zheng Xi} and Ding, Hao and Shi, {Hao Bin}",

year = "2011",

doi = "10.1109/ISCID.2011.53",

language = "English",

isbn = "9780769545004",

series = "Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011",

pages = "177--180",

booktitle = "Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011",

note = "2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011 ; Conference date: 28-10-2011 Through 30-10-2011",

}

Wang, SC, Song, ZX, Ding, H & Shi, HB 2011, An improved reinforcement Q-learning method with BP neural networks in robot soccer. in Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011., 6079665, Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011, vol. 1, pp. 177-180, 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011, Hangzhou, China, 28/10/11. https://doi.org/10.1109/ISCID.2011.53

An improved reinforcement Q-learning method with BP neural networks in robot soccer. / Wang, Shi Chao; Song, Zheng Xi; Ding, Hao et al.
Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011. 2011. p. 177-180 6079665 (Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011; Vol. 1).

TY - GEN

T1 - An improved reinforcement Q-learning method with BP neural networks in robot soccer

AU - Wang, Shi Chao

AU - Song, Zheng Xi

AU - Ding, Hao

AU - Shi, Hao Bin

PY - 2011

Y1 - 2011

N2 - The traditional reinforcement Q-learning method has two problems: the state information is difficult to partition, and extremely high-dimensional input is complex to handle. To solve these two problems, this paper proposes an improved reinforcement Q-learning method with a BP neural network. In this method, the large Q table is replaced by a BP neural network: continuous environmental information is the input, and the Q value is the output. The Q value and the weights of the network are adjusted by the action rewards. This paper presents an algorithm for a single agent's action selection. Simulation shows that the proposed method is more stable and applicable for the agent's strategy selection.

AB - The traditional reinforcement Q-learning method has two problems: the state information is difficult to partition, and extremely high-dimensional input is complex to handle. To solve these two problems, this paper proposes an improved reinforcement Q-learning method with a BP neural network. In this method, the large Q table is replaced by a BP neural network: continuous environmental information is the input, and the Q value is the output. The Q value and the weights of the network are adjusted by the action rewards. This paper presents an algorithm for a single agent's action selection. Simulation shows that the proposed method is more stable and applicable for the agent's strategy selection.

KW - BP Neural Networks

KW - Reinforcement Q-Learning

KW - Robot Soccer

UR - http://www.scopus.com/inward/record.url?scp=83455225460&partnerID=8YFLogxK

U2 - 10.1109/ISCID.2011.53

DO - 10.1109/ISCID.2011.53

M3 - Conference contribution

AN - SCOPUS:83455225460

SN - 9780769545004

T3 - Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011

SP - 177

EP - 180

BT - Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011

T2 - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011

Y2 - 28 October 2011 through 30 October 2011

ER -

Wang SC, Song ZX, Ding H, Shi HB. An improved reinforcement Q-learning method with BP neural networks in robot soccer. In Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011. 2011. p. 177-180. 6079665. (Proceedings - 2011 4th International Symposium on Computational Intelligence and Design, ISCID 2011). doi: 10.1109/ISCID.2011.53
