TY - JOUR
T1 - Spacecraft rendezvous and docking method based on inverse reinforcement learning
AU - Yue, Chenglei
AU - Wang, Xuechuan
AU - Yue, Xiaokui
AU - Song, Ting
N1 - Publisher Copyright:
© 2023 AAAS Press of Chinese Society of Aeronautics and Astronautics. All rights reserved.
PY - 2023/10/15
Y1 - 2023/10/15
N2 - For spacecraft proximity maneuvering and rendezvous, a method for training neural networks based on generative adversarial inverse reinforcement learning is proposed, using model predictive control to provide the expert dataset. Firstly, considering the maximum velocity constraint, the control input saturation constraint and the space cone constraint, the dynamics of the chaser spacecraft approaching a static target are established. Then, the chaser spacecraft is driven to reach the target using model predictive control. Secondly, disturbances are added to the nominal trajectory, and the trajectories from each starting position to the target are calculated using the aforementioned method. The states and commands of the trajectories at each time step are collected to form a training set. Finally, the network structure and parameters are set, and the hyperparameters are tuned. Driven by the training set, the adversarial inverse reinforcement learning method is used to train the network. The simulation results show that adversarial inverse reinforcement learning can imitate the behavior of the expert trajectories and successfully train the neural network to drive the spacecraft from the starting point to the static target.
KW - generative adversarial inverse reinforcement learning
KW - imitation learning
KW - model predictive control
KW - network training
KW - neural network
UR - http://www.scopus.com/inward/record.url?scp=85180410507&partnerID=8YFLogxK
U2 - 10.7527/S1000-6893.2023.28420
DO - 10.7527/S1000-6893.2023.28420
M3 - Article
AN - SCOPUS:85180410507
SN - 1000-6893
VL - 44
JO - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
JF - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
IS - 19
M1 - 328420
ER -