TY - GEN
T1 - An improved DDPG reinforcement learning control of underwater gliders for energy optimization
AU - Jing, Anyan
AU - Tang, Zuocheng
AU - Gao, Jian
AU - Pan, Guang
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/11/27
Y1 - 2020/11/27
N2 - As a novel class of underwater vehicle, underwater gliders are widely used in marine environment exploration. Because underwater gliders are designed for long-term and long-distance operation, adaptivity and energy optimization are critical requirements for controller design. In this paper, reinforcement learning control is studied for underwater gliders, and the problem of slow convergence and an unstable learning process in the deep deterministic policy gradient (DDPG) reinforcement learning algorithm is addressed. The proposed solution is based on the prioritized experience replay method, which effectively increases the convergence speed and stability of the algorithm. A method is proposed for optimizing the gliding control parameters to reduce energy consumption, using the improved DDPG algorithm and an energy consumption model. In simulation experiments with an underwater glider, a set of glide parameters is obtained for a given gliding depth.
AB - As a novel class of underwater vehicle, underwater gliders are widely used in marine environment exploration. Because underwater gliders are designed for long-term and long-distance operation, adaptivity and energy optimization are critical requirements for controller design. In this paper, reinforcement learning control is studied for underwater gliders, and the problem of slow convergence and an unstable learning process in the deep deterministic policy gradient (DDPG) reinforcement learning algorithm is addressed. The proposed solution is based on the prioritized experience replay method, which effectively increases the convergence speed and stability of the algorithm. A method is proposed for optimizing the gliding control parameters to reduce energy consumption, using the improved DDPG algorithm and an energy consumption model. In simulation experiments with an underwater glider, a set of glide parameters is obtained for a given gliding depth.
KW - Deep deterministic policy gradient
KW - Glide parameters optimization
KW - Prioritized experience replay
KW - Reinforcement learning
KW - Underwater glider
UR - http://www.scopus.com/inward/record.url?scp=85098997917&partnerID=8YFLogxK
U2 - 10.1109/ICUS50048.2020.9274883
DO - 10.1109/ICUS50048.2020.9274883
M3 - Conference contribution
AN - SCOPUS:85098997917
T3 - Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS 2020
SP - 621
EP - 626
BT - Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Unmanned Systems, ICUS 2020
Y2 - 27 November 2020 through 28 November 2020
ER -