TY - JOUR
T1 - Spacecraft rendezvous and docking method based on inverse reinforcement learning
AU - Yue, Chenglei
AU - Wang, Xuechuan
AU - Yue, Xiaokui
AU - Song, Ting
N1 - Publisher Copyright:
© 2023 AAAS Press of Chinese Society of Aeronautics and Astronautics. All rights reserved.
PY - 2023/10/15
Y1 - 2023/10/15
N2 - For spacecraft proximity maneuvering and rendezvous, a method for training neural networks based on generative adversarial inverse reinforcement learning is proposed, using model predictive control to provide the expert dataset. Firstly, considering the maximum velocity constraint, the control input saturation constraint and the space cone constraint, the dynamics of the chaser spacecraft approaching a static target are established. Then, the chaser spacecraft is driven to reach the target using model predictive control. Secondly, disturbances are added to the nominal trajectory, and the trajectories from each starting position to the target are calculated using the aforementioned method. The states and commands of the trajectories at each time step are collected to form a training set. Finally, the network structure and parameters are set, and the hyperparameters are tuned. Driven by the training set, the adversarial inverse reinforcement learning method is used to train the network. The simulation results show that adversarial inverse reinforcement learning can imitate the behavior of the expert trajectories and successfully train the neural network to drive the spacecraft from the starting point to the static target.
KW - generative adversarial inverse reinforcement learning
KW - imitation learning
KW - model predictive control
KW - network training
KW - neural network
UR - http://www.scopus.com/inward/record.url?scp=85180410507&partnerID=8YFLogxK
U2 - 10.7527/S1000-6893.2023.28420
DO - 10.7527/S1000-6893.2023.28420
M3 - Article
AN - SCOPUS:85180410507
SN - 1000-6893
VL - 44
JO - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
JF - Hangkong Xuebao/Acta Aeronautica et Astronautica Sinica
IS - 19
M1 - 328420
ER -