A Reinforcement Learning Scheme for Active Multi-Debris Removal Mission Planning with Modified Upper Confidence Bound Tree Search

Jianan Yang, Xiaolei Hou, Yu Hen Hu, Yong Liu, Quan Pan

Research output: Contribution to journal, Article, peer-reviewed

12 Citations (Scopus)

Abstract

The growing amount of space debris poses a critical threat to the space environment. Mission planning for active multi-debris removal (ADR) with a maximal-reward objective is therefore attracting increasing attention. Because the goal of Reinforcement Learning (RL) matches the maximal-reward optimization model of ADR, an appropriate RL scheme and algorithm can make the planning more efficient. In this paper, first, an RL formulation is presented for the ADR mission planning problem, in which all the basic components of the maximal-reward optimization model are recast in the RL scheme. Second, a modified Upper Confidence bound Tree (UCT) search algorithm is developed for the ADR planning task. It leverages neural-network-assisted selection and expansion procedures to facilitate exploration, and it incorporates roll-out simulation in the backup procedure to achieve robust value estimation. The algorithm fits the RL scheme of ADR mission planning and better balances exploration and exploitation. Experimental comparison on three subsets of the Iridium 33 debris cloud data shows that the modified UCT outperforms previously reported results and closely related UCT variants.
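For readers unfamiliar with UCT, the selection and backup steps mentioned in the abstract can be illustrated with the textbook UCB1 rule. This is a minimal sketch of standard UCT only, not the modified algorithm of the paper: the neural-network-assisted selection and expansion are not reproduced, and the names `ucb1_score`, `select_child`, `backup`, the dictionary layout, and the debris labels are all hypothetical choices made for illustration.

```python
import math


def ucb1_score(total_value, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCB1: mean observed value plus an exploration bonus."""
    if visits == 0:
        # Unvisited children get infinite priority so each is tried once.
        return math.inf
    mean_value = total_value / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean_value + bonus


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the key of the child with the highest UCB1 score.

    `children` maps an action label (here: the next debris to visit,
    a hypothetical encoding) to a dict with 'value' and 'visits'.
    """
    return max(
        children,
        key=lambda k: ucb1_score(
            children[k]["value"], children[k]["visits"], parent_visits, c
        ),
    )


def backup(path, reward):
    """Propagate a simulated (roll-out) reward to every node on the path."""
    for node in path:
        node["visits"] += 1
        node["value"] += reward


if __name__ == "__main__":
    # Hypothetical statistics for three candidate next targets.
    children = {
        "debris_A": {"value": 6.0, "visits": 10},
        "debris_B": {"value": 2.0, "visits": 2},
        "debris_C": {"value": 0.0, "visits": 0},
    }
    chosen = select_child(children, parent_visits=12)
    print("selected:", chosen)
    backup([children[chosen]], reward=0.1)
    print("updated stats:", children[chosen])
```

The constant `c` controls the exploration-exploitation balance that the abstract refers to: with `c = 0` the rule is purely greedy on the mean value, while larger values favor rarely visited children.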

Original language: English
Article number: 9113461
Pages (from-to): 108461-108473
Number of pages: 13
Journal: IEEE Access
Volume: 8
DOI
Publication status: Published - 2020
