An Improved Prioritized DDPG Based on Fractional-Order Learning Scheme

Quan Yong Fan, Meiying Cai, Bin Xu

Research output: Journal article, peer-reviewed

12 Citations (Scopus)

Abstract

Although the deep deterministic policy gradient (DDPG) algorithm has received widespread attention for its powerful functionality and applicability to large-scale continuous control, it suffers from problems such as low sample-utilization efficiency and insufficient exploration. Therefore, an improved DDPG is presented in this article to overcome these challenges. First, an optimizer based on the fractional gradient is introduced into the algorithm's networks, which is conducive to increasing the speed and accuracy of training convergence. On this basis, high-value experience replay based on a weight-changed priority is proposed to improve sample-utilization efficiency, and, to explore the environment more thoroughly, an optimized exploration strategy for the boundary of the action space is adopted. Finally, the proposed method is tested through experiments on the Gym and PyBullet platforms. According to the results, our method speeds up the learning process and obtains higher average rewards than the compared algorithms.
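The abstract does not give the paper's exact update rule, but a fractional-gradient optimizer is commonly built on a Caputo-style derivative that rescales the ordinary gradient by a power of the recent parameter displacement. The sketch below is a hypothetical one-parameter illustration of that idea (the function name, signature, and the short-memory approximation are assumptions, not the authors' method); note that setting the fractional order `alpha = 1` recovers plain gradient descent.

```python
import math

def fractional_grad_step(w, grad, w_prev, lr=0.01, alpha=0.9):
    """One fractional-order (Caputo-style) gradient step on a scalar parameter.

    Hypothetical sketch: the ordinary gradient is scaled by
    |w - w_prev|**(1 - alpha) / Gamma(2 - alpha), a common short-memory
    approximation of the Caputo fractional derivative. With alpha = 1 the
    scale factor is exactly 1, recovering standard gradient descent.
    """
    eps = 1e-8  # guard: 0**(1-alpha) would zero the step when w == w_prev
    scale = abs(w - w_prev) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    return w - lr * grad * max(scale, eps)
```

For orders `alpha < 1` the step shrinks when the parameter has barely moved and grows with larger recent displacement, which is one intuition for the faster, more accurate convergence the abstract attributes to the fractional scheme.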

Original language: English
Pages (from-to): 6873-6882
Number of pages: 10
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 36
Issue: 4
DOI
Publication status: Published - 2025

