Abstract
Reinforcement learning (RL) is a machine learning method that requires no pre-collected training data; it solves sequential decision-making problems by searching for the optimal policy through continuous interaction between the agent and the environment. Combined with deep learning (DL), deep reinforcement learning (DRL) possesses both powerful perception and decision-making capabilities and is widely applied to complex decision-making problems in many fields. Off-policy reinforcement learning separates exploration from exploitation by storing and replaying interaction experience, making it easier to find the globally optimal solution. How to use experience reasonably and efficiently is therefore the key to improving the efficiency of off-policy reinforcement learning methods. This paper first introduces the basic theory of reinforcement learning. Then, on-policy and off-policy reinforcement learning algorithms are briefly reviewed. Next, two mainstream approaches to the experience replay (ER) problem are introduced: experience utilization and experience expansion. Finally, the relevant research work is summarized and future directions are discussed.
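The experience replay mechanism mentioned in the abstract can be illustrated by a minimal FIFO buffer with uniform random sampling. This is only an illustrative sketch of the general idea, not any specific algorithm surveyed in the paper; the class and method names are placeholders.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal FIFO experience replay buffer with uniform sampling."""

    def __init__(self, capacity):
        # When full, the oldest transitions are evicted first.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        # Store one interaction step; off-policy methods can reuse it many times.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the temporal correlation
        # between consecutive transitions in the stored trajectories.
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)
```

Because transitions are stored rather than discarded after a single update, the behavior policy that collected them may differ from the policy being improved, which is what makes the learning off-policy.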
Translated title of the contribution | Research on Experience Replay of Off-policy Deep Reinforcement Learning: A Review
---|---
Original language | Traditional Chinese
Pages (from-to) | 2237-2256
Number of pages | 20
Journal | Zidonghua Xuebao/Acta Automatica Sinica
Volume | 49
Issue | 11
DOI |
Publication status | Published - Nov. 2023
Keywords
- Artificial intelligence
- Deep reinforcement learning (DRL)
- Experience replay (ER)
- Off-policy