TY - JOUR
T1 - Large-scale ADR mission optimization method based on heuristic reinforcement learning
AU - Yang, Jianan
AU - Hou, Xiaolei
AU - Hu, Yu Hen
AU - Liu, Yong
AU - Pan, Quan
AU - Feng, Qian
N1 - Publisher Copyright:
© 2021, Beihang University Aerospace Knowledge Press. All rights reserved.
PY - 2021/4/25
Y1 - 2021/4/25
N2 - The vigorous development of the space industry has made space debris a non-negligible threat to future space activities, and Active multi-Debris Removal (ADR) technology has become an indispensable means of alleviating it. To address the large-scale multi-debris active removal mission planning problem, a Reinforcement Learning (RL) planning scheme is first proposed based on the maximal-reward optimization model of the ADR problem, with the state, action, and reward function of the problem defined according to the RL framework. A specialized Monte Carlo Tree Search (MCTS) algorithm is then presented, which uses MCTS as its core structure and incorporates efficient heuristic operators and an RL iteration process. Finally, the method is tested on the complete, large-scale Iridium 33 debris cloud. The results show that it outperforms both the original MCTS algorithm and the heuristic greedy algorithm.
AB - The vigorous development of the space industry has made space debris a non-negligible threat to future space activities, and Active multi-Debris Removal (ADR) technology has become an indispensable means of alleviating it. To address the large-scale multi-debris active removal mission planning problem, a Reinforcement Learning (RL) planning scheme is first proposed based on the maximal-reward optimization model of the ADR problem, with the state, action, and reward function of the problem defined according to the RL framework. A specialized Monte Carlo Tree Search (MCTS) algorithm is then presented, which uses MCTS as its core structure and incorporates efficient heuristic operators and an RL iteration process. Finally, the method is tested on the complete, large-scale Iridium 33 debris cloud. The results show that it outperforms both the original MCTS algorithm and the heuristic greedy algorithm.
KW - Active debris removal
KW - Heuristic operator
KW - Mission planning
KW - Monte Carlo tree search
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85105728624&partnerID=8YFLogxK
U2 - 10.7527/S1000-6893.2020.24354
DO - 10.7527/S1000-6893.2020.24354
M3 - Article
AN - SCOPUS:85105728624
SN - 1000-6893
VL - 42
JO - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
JF - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
IS - 4
M1 - 524354
ER -