Research on online reinforcement learning method based on experience-replay

Ning Hu, Zhijun Ge, Xuanwen Chen, Chunguang Ding, Haobin Shi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In standard reinforcement learning, the agent's next action is guided by immediate and delayed feedback obtained through continual interaction with the environment and trial-and-error learning. In practice, however, this slows convergence, and inconsistent states can arise during the agent's learning process. The agent therefore needs to remember what it has learned within a specified time window in order to improve the convergence and robustness of its decision making. To address these issues, this paper proposes accelerating the convergence of reinforcement learning by using the function approximation ability of a neural network, and improving its robustness by using a memory-based Experience-Replay (ER) algorithm. The experimental results show the effectiveness of the proposed method.
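
The abstract does not give implementation details, so the following is only a minimal sketch of the general experience-replay idea, not the authors' method. It assumes a fixed-capacity memory with uniform sampling, and it uses a plain Q-table (`q`) as a stand-in for the neural-network function approximator described in the paper. The names `ReplayBuffer` and `replay_update`, and the parameters `alpha`, `gamma`, and `batch_size`, are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict, deque


class ReplayBuffer:
    """Fixed-capacity memory of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest transition when full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the current memory size.
        k = min(batch_size, len(self.memory))
        return self.rng.sample(list(self.memory), k)

    def __len__(self):
        return len(self.memory)


def replay_update(q, buffer, actions, batch_size=4, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update per replayed transition.

    Assumption: `q` is a table keyed by (state, action); the paper uses a
    neural network here instead.
    """
    for s, a, r, s2, done in buffer.sample(batch_size):
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])


# Toy usage: one stored transition that ends the episode with reward 1.0.
q = defaultdict(float)
buffer = ReplayBuffer(capacity=3, seed=0)
buffer.push(0, 1, 1.0, 1, True)
replay_update(q, buffer, actions=[0, 1])
```

Replaying stored transitions lets the agent reuse past experience several times rather than discarding each interaction after a single update, which is the intuition behind using a memory to stabilize learning.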

Original language: English
Title of host publication: 2018 IEEE International Conference on Information and Automation, ICIA 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1338-1343
Number of pages: 6
ISBN (Electronic): 9781538680698
DOIs
State: Published - Aug 2018
Event: 2018 IEEE International Conference on Information and Automation, ICIA 2018 - Wuyishan, Fujian, China
Duration: 11 Aug 2018 to 13 Aug 2018

Publication series

Name: 2018 IEEE International Conference on Information and Automation, ICIA 2018

Conference

Conference: 2018 IEEE International Conference on Information and Automation, ICIA 2018
Country/Territory: China
City: Wuyishan, Fujian
Period: 11/08/18 to 13/08/18

Keywords

  • Experience-Replay
  • Neural Network
  • Reinforcement Learning
