An improved DDPG reinforcement learning control of underwater gliders for energy optimization

Anyan Jing, Zuocheng Tang, Jian Gao, Guang Pan

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

5 Scopus citations

Abstract

As a novel underwater vehicle, underwater gliders are widely used in marine environment exploration. Because underwater gliders are designed for long-term and long-distance operation, adaptivity and energy optimization are critical requirements for controller design. In this paper, reinforcement learning control is studied for underwater gliders, and the problems of slow convergence and an unstable learning process in the DDPG reinforcement learning algorithm are addressed. The proposed solution is based on the prioritized experience replay method, which effectively increases the convergence speed and stability of the algorithm. A method is proposed that optimizes the gliding control parameters to reduce energy consumption, using the improved DDPG algorithm and the energy consumption model. In simulation experiments with an underwater glider, a set of glide parameters is obtained at a given gliding depth.
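The abstract does not say which form of prioritized experience replay the authors use. As an illustration only, the sketch below assumes the common proportional variant, in which a transition is sampled with probability proportional to its absolute TD error raised to a power `alpha`, and importance-sampling weights controlled by `beta` correct the resulting bias. The class name, method names, and default values are hypothetical and are not taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay buffer (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed: 0 gives uniform sampling, 1 gives fully prioritized
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions take the current maximum priority so they are likely to be replayed.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.data[i] for i in idxs]
        return idxs, batch, weights

    def update_priorities(self, idxs, td_errors):
        # After a critic update, the new priority is |TD error| plus a small constant.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DDPG training loop, the returned weights would typically scale each sample's critic loss, and `update_priorities` would be called with the TD errors computed for that batch. A production implementation would usually replace the linear scan in `sample` with a sum tree.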

Original language: English
Title of host publication: Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 621-626
Number of pages: 6
ISBN (Electronic): 9781728180250
State: Published - 27 Nov 2020
Event: 3rd International Conference on Unmanned Systems, ICUS 2020 - Harbin, China
Duration: 27 Nov 2020 to 28 Nov 2020

Publication series

Name: Proceedings of 2020 3rd International Conference on Unmanned Systems, ICUS 2020

Conference

Conference: 3rd International Conference on Unmanned Systems, ICUS 2020
Country/Territory: China
City: Harbin
Period: 27/11/20 to 28/11/20

Keywords

  • Deep deterministic policy gradient
  • Glide parameters optimization
  • Prioritized experience replay
  • Reinforcement learning
  • Underwater glider
