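Heuristic enhanced reinforcement learning method for large-scale multi-debris active removal mission planning (基于启发强化学习的大规模ADR任务优化方法)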

Jianan Yang; Xiaolei Hou; Yu Hen Hu; Yong Liu; Quan Pan; Qian Feng

doi:10.7527/S1000-6893.2020.24354

School of Automation

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

The rapid growth of the space industry has made space debris a non-negligible threat to future space activities, and Active multi-Debris Removal (ADR) has become an indispensable means of alleviating it. To address the large-scale ADR mission planning problem, a Reinforcement Learning (RL) planning scheme is first proposed based on a maximal-reward optimization model of the ADR problem, with the state, action, and reward function defined according to the RL framework. A specialized Monte Carlo Tree Search (MCTS) algorithm is then presented that uses MCTS as its core structure and combines it with efficient heuristic operators and an RL iteration process. Finally, the method is tested on the complete, large-scale Iridium 33 debris cloud. The results show that it outperforms both the original MCTS algorithm and the heuristic greedy algorithm.
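The abstract names the ingredients (an RL-style state, action, and reward; MCTS as the core; heuristic operators) without giving details. The following is a minimal sketch of how those pieces can fit together in general, not the authors' algorithm. Everything in it is an assumption made for illustration: the toy 2-D `DEBRIS` positions, the Euclidean `cost` standing in for a real transfer-cost model, the unit `REWARD` values, the `BUDGET`, and the epsilon-greedy nearest-neighbour rollout used as the "heuristic operator".

```python
import math
import random

# Hypothetical toy instance (not from the paper): 2-D debris positions,
# unit reward per removal, and a total transfer budget.
DEBRIS = [(0, 0), (1, 0), (2, 1), (5, 5), (1, 2), (6, 5), (3, 3), (0, 4)]
REWARD = [1.0] * len(DEBRIS)
BUDGET = 10.0

def cost(i, j):
    """Assumed transfer cost: Euclidean distance between two debris."""
    return math.dist(DEBRIS[i], DEBRIS[j])

def legal_actions(current, visited, budget):
    """Action set: unvisited debris still reachable within the budget."""
    return [j for j in range(len(DEBRIS))
            if j not in visited and cost(current, j) <= budget]

def heuristic_rollout(current, visited, budget, rng, eps=0.1):
    """Playout that mostly takes the cheapest next transfer (greedy heuristic)."""
    visited, sequence = set(visited), []
    while True:
        actions = legal_actions(current, visited, budget)
        if not actions:
            return sequence
        if rng.random() < eps:
            nxt = rng.choice(actions)
        else:
            nxt = min(actions, key=lambda j: cost(current, j))
        budget -= cost(current, nxt)
        visited.add(nxt)
        sequence.append(nxt)
        current = nxt

class Node:
    """Search-tree node; the state is (current debris, visited set, budget)."""
    def __init__(self, current, visited, budget, parent=None):
        self.current, self.visited, self.budget = current, visited, budget
        self.parent = parent
        self.children = []
        self.untried = legal_actions(current, visited, budget)
        self.visits, self.value = 0, 0.0

def uct_child(node, c=1.4):
    """UCB1 selection among fully expanded children."""
    return max(node.children,
               key=lambda ch: ch.value / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))

def mcts_plan(start=0, iterations=500, seed=0):
    """Return the best removal sequence found; `start` is treated as already visited."""
    rng = random.Random(seed)
    root = Node(start, frozenset([start]), BUDGET)
    best_plan, best_score = [], -1.0
    for _ in range(iterations):
        node, prefix = root, []
        while not node.untried and node.children:      # selection
            node = uct_child(node)
            prefix.append(node.current)
        if node.untried:                                # expansion
            nxt = node.untried.pop()
            child = Node(nxt, node.visited | {nxt},
                         node.budget - cost(node.current, nxt), node)
            node.children.append(child)
            node = child
            prefix.append(nxt)
        # simulation: finish the plan with the heuristic playout
        plan = prefix + heuristic_rollout(node.current, node.visited,
                                          node.budget, rng)
        score = sum(REWARD[j] for j in plan) / sum(REWARD)  # normalized reward
        if score > best_score:
            best_plan, best_score = plan, score
        while node is not None:                         # backpropagation
            node.visits += 1
            node.value += score
            node = node.parent
    return best_plan
```

Calling `mcts_plan(start=0, iterations=300, seed=1)` returns a list of debris indices in visiting order. Keeping the best complete plan seen during search, rather than reading it off the tree afterwards, is a common choice for single-agent sequencing problems and is used here only for brevity.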

Translated title of the contribution	Heuristic enhanced reinforcement learning method for large-scale multi-debris active removal mission planning
Original language	Chinese (Traditional)
Article number	524354
Journal	Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
Volume	42
Issue number	4
DOI	https://doi.org/10.7527/S1000-6893.2020.24354
Publication status	Published - 25 Apr 2021

Keywords

Active debris removal
Heuristic operator
Mission planning
Monte Carlo tree search
Reinforcement learning

Cite this

@article{505495a5f4a54dfcb2c5cd6b3e827945,

title = "基于启发强化学习的大规模ADR任务优化方法",

abstract = "The rapid growth of the space industry has made space debris a non-negligible threat to future space activities, and Active multi-Debris Removal (ADR) has become an indispensable means of alleviating it. To address the large-scale ADR mission planning problem, a Reinforcement Learning (RL) planning scheme is first proposed based on a maximal-reward optimization model of the ADR problem, with the state, action, and reward function defined according to the RL framework. A specialized Monte Carlo Tree Search (MCTS) algorithm is then presented that uses MCTS as its core structure and combines it with efficient heuristic operators and an RL iteration process. Finally, the method is tested on the complete, large-scale Iridium 33 debris cloud. The results show that it outperforms both the original MCTS algorithm and the heuristic greedy algorithm.",

keywords = "Active debris removal, Heuristic operator, Mission planning, Monte Carlo tree search, Reinforcement learning",

author = "Jianan Yang and Xiaolei Hou and Hu, {Yu Hen} and Yong Liu and Quan Pan and Qian Feng",

year = "2021",

month = apr,

day = "25",

doi = "10.7527/S1000-6893.2020.24354",

language = "Chinese (Traditional)",

volume = "42",

journal = "Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica",

issn = "1000-6893",

publisher = "AAAS Press of Chinese Society of Aeronautics and Astronautics",

number = "4",

}

TY - JOUR

T1 - 基于启发强化学习的大规模ADR任务优化方法

AU - Yang, Jianan

AU - Hou, Xiaolei

AU - Hu, Yu Hen

AU - Liu, Yong

AU - Pan, Quan

AU - Feng, Qian

PY - 2021/4/25

Y1 - 2021/4/25

AB - The rapid growth of the space industry has made space debris a non-negligible threat to future space activities, and Active multi-Debris Removal (ADR) has become an indispensable means of alleviating it. To address the large-scale ADR mission planning problem, a Reinforcement Learning (RL) planning scheme is first proposed based on a maximal-reward optimization model of the ADR problem, with the state, action, and reward function defined according to the RL framework. A specialized Monte Carlo Tree Search (MCTS) algorithm is then presented that uses MCTS as its core structure and combines it with efficient heuristic operators and an RL iteration process. Finally, the method is tested on the complete, large-scale Iridium 33 debris cloud. The results show that it outperforms both the original MCTS algorithm and the heuristic greedy algorithm.

KW - Active debris removal

KW - Heuristic operator

KW - Mission planning

KW - Monte Carlo tree search

KW - Reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85105728624&partnerID=8YFLogxK

U2 - 10.7527/S1000-6893.2020.24354

DO - 10.7527/S1000-6893.2020.24354

M3 - Article

AN - SCOPUS:85105728624

SN - 1000-6893

VL - 42

JO - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica

JF - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica

IS - 4

M1 - 524354

ER -
