TY - JOUR
T1 - Smooth Clip Advantage PPO in Reinforcement Learning
AU - Wang, Junwei
AU - Zeng, Zilin
AU - Shang, Peng
N1 - Publisher Copyright: © Published under licence by IOP Publishing Ltd.
PY - 2023
Y1 - 2023
N2 - Deep reinforcement learning outperforms traditional methods in some domains. In this paper, we propose a novel on-policy reinforcement learning (RL) algorithm, the Smoothing Clip Advantage Proximal Policy Optimization algorithm (SCAPPO), which extends the classical PPO algorithm by exploiting the smoothing properties of the sigmoid function to make full use of useful gradients. In addition, we provide more effective gradients for the policy network, aiming to address the overfitting problem caused by the coupling of the policy and value functions. SCAPPO outperforms currently popular reinforcement learning algorithms on tasks in OpenAI Gym.
AB - Deep reinforcement learning outperforms traditional methods in some domains. In this paper, we propose a novel on-policy reinforcement learning (RL) algorithm, the Smoothing Clip Advantage Proximal Policy Optimization algorithm (SCAPPO), which extends the classical PPO algorithm by exploiting the smoothing properties of the sigmoid function to make full use of useful gradients. In addition, we provide more effective gradients for the policy network, aiming to address the overfitting problem caused by the coupling of the policy and value functions. SCAPPO outperforms currently popular reinforcement learning algorithms on tasks in OpenAI Gym.
UR - http://www.scopus.com/inward/record.url?scp=85166672354&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/2513/1/012005
DO - 10.1088/1742-6596/2513/1/012005
M3 - Conference article
AN - SCOPUS:85166672354
SN - 1742-6588
VL - 2513
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
IS - 1
M1 - 012005
T2 - 2023 7th International Conference on Artificial Intelligence, Automation and Control Technologies, AIACT 2023
Y2 - 24 February 2023 through 26 February 2023
ER -