A Dynamic Adjusting Reward Function Method for Deep Reinforcement Learning with Adjustable Parameters

Zijian Hu, Kaifang Wan, Xiaoguang Gao, Yiwei Zhai

Research output: Contribution to journal › Article › peer-review

28 Citations (Scopus)

Abstract

In deep reinforcement learning, network convergence is often slow and training easily settles into local optima. For environments with reward saltation, we propose a magnify saltatory reward (MSR) algorithm with variable parameters, approached from the perspective of sample usage. MSR dynamically adjusts the rewards of experiences with reward saltation in the experience pool, thereby increasing the agent's utilization of these experiences. We conducted experiments in a simulated obstacle-avoidance search environment for an unmanned aerial vehicle and compared the results of deep Q-network (DQN), double DQN, and dueling DQN after adding MSR. The experimental results demonstrate that, with MSR, the algorithms converge faster and obtain the global optimal solution more easily.
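The abstract describes MSR only at a high level. As a rough illustration of the idea, the Python sketch below shows one way a replay buffer could magnify saltatory rewards before they are replayed. The class name, the saltation threshold, the magnification factor, and the annealing rule are all assumptions made for illustration, not the paper's exact formulation.

```python
import random
from collections import deque


class MSRReplayBuffer:
    """Sketch of an experience pool that magnifies 'saltatory' rewards.

    Transitions whose reward magnitude exceeds a threshold are treated as
    saltation experiences; their rewards are scaled by an adjustable factor
    so the agent makes greater use of them during replay (assumed scheme).
    """

    def __init__(self, capacity=100_000, saltation_threshold=1.0, magnify_factor=2.0):
        self.buffer = deque(maxlen=capacity)
        self.saltation_threshold = saltation_threshold  # assumed hyperparameter
        self.magnify_factor = magnify_factor            # adjustable parameter

    def push(self, state, action, reward, next_state, done):
        # Magnify rewards that jump past the saltation threshold.
        if abs(reward) >= self.saltation_threshold:
            reward = reward * self.magnify_factor
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling, as in a standard DQN replay buffer.
        return random.sample(self.buffer, batch_size)

    def anneal(self, decay=0.99, floor=1.0):
        # Optionally decay the magnification toward 1 as training stabilizes.
        self.magnify_factor = max(floor, self.magnify_factor * decay)
```

In this sketch the adjusted reward is stored directly in the buffer; an alternative design would keep raw rewards and apply the magnification at sampling time, which makes the adjustable parameter easier to change during training.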

Original language: English
Article number: 7619483
Journal: Mathematical Problems in Engineering
Volume: 2019
DOI
Publication status: Published - 2019
