TY - JOUR
T1 - Reinforcement learning-based decision-making for spacecraft pursuit-evasion game in elliptical orbits
AU - Yu, Weizhuo
AU - Liu, Chuang
AU - Yue, Xiaokui
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/12
Y1 - 2024/12
AB - Orbital game theory is a fundamental technology for cleaning up space debris and improving the safety of useful spacecraft in the future. This work therefore develops a reinforcement learning-based decision-making method for the pursuit-evasion game in elliptical orbits. The linearized Tschauner-Hempel equation describes the spacecraft's motion, and the problem is formulated using game theory. An impulsive maneuvering model in a complete three-dimensional elliptical orbit is then established, and an algorithm based on the deep deterministic policy gradient is designed to solve for the optimal strategy of the pursuit-evasion game. To support successful decisions by the pursuer, an extensive reward function is designed and improved to account for shortest time, optimal fuel, and collision avoidance. Finally, numerical simulations of a pursuit-evasion mission demonstrate the effectiveness and superiority of the proposed decision-making algorithm. The game success rate of the algorithm is verified against targets with different maneuvering abilities, which implies that the algorithm can be applied in extended scenarios.
KW - Decision making
KW - Deep deterministic policy gradient
KW - Elliptical orbit
KW - Impulsive maneuver
KW - Pursuit-evasion game
UR - http://www.scopus.com/inward/record.url?scp=85203021983&partnerID=8YFLogxK
U2 - 10.1016/j.conengprac.2024.106072
DO - 10.1016/j.conengprac.2024.106072
M3 - Article
AN - SCOPUS:85203021983
SN - 0967-0661
VL - 153
JO - Control Engineering Practice
JF - Control Engineering Practice
M1 - 106072
ER -