TY - JOUR
T1 - Optimal safe control of spacecraft attitude in multi-constrained regions based on non-zero-sum game
AU - Tan, Longyu
AU - Luo, Jianjun
AU - Liu, Jingxi
AU - Shi, Mingqian
N1 - Publisher Copyright: © 2025
PY - 2025/8/15
Y1 - 2025/8/15
N2 - In tackling the optimal safety control problem of spacecraft attitude in the presence of multiple constraint regions, this study proposes a reinforcement learning-based control algorithm rooted in non-zero-sum game theory. First, a multi-forbidden region model is devised to capture the spacecraft's attitude dynamics, which is subsequently mapped onto a multi-input non-zero-sum game framework. Next, a reinforcement learning strategy using an actor-critic architecture is employed to approximate the Nash equilibrium solution of the non-zero-sum game. The evaluation network continuously assesses the spacecraft's current control state across the multiple forbidden regions, thereby guiding the action network to minimize the evaluation network's output. This process facilitates the coordinated management of repulsive forces imposed by the various forbidden regions on the spacecraft's attitude, ultimately ensuring that the system achieves the Nash equilibrium and optimizes attitude control despite the imposed constraints. Additionally, the stability of the proposed control strategy is rigorously validated using Lyapunov stability theory. Finally, simulation results substantiate the effectiveness and robustness of the approach, underscoring its potential for real-world applications in complex constrained environments.
AB - In tackling the optimal safety control problem of spacecraft attitude in the presence of multiple constraint regions, this study proposes a reinforcement learning-based control algorithm rooted in non-zero-sum game theory. First, a multi-forbidden region model is devised to capture the spacecraft's attitude dynamics, which is subsequently mapped onto a multi-input non-zero-sum game framework. Next, a reinforcement learning strategy using an actor-critic architecture is employed to approximate the Nash equilibrium solution of the non-zero-sum game. The evaluation network continuously assesses the spacecraft's current control state across the multiple forbidden regions, thereby guiding the action network to minimize the evaluation network's output. This process facilitates the coordinated management of repulsive forces imposed by the various forbidden regions on the spacecraft's attitude, ultimately ensuring that the system achieves the Nash equilibrium and optimizes attitude control despite the imposed constraints. Additionally, the stability of the proposed control strategy is rigorously validated using Lyapunov stability theory. Finally, simulation results substantiate the effectiveness and robustness of the approach, underscoring its potential for real-world applications in complex constrained environments.
KW - Multiple constraint regions
KW - Neural network
KW - Non-zero-sum game
KW - Reinforcement learning
KW - Spacecraft
UR - https://www.scopus.com/pages/publications/105011582978
U2 - 10.1016/j.jfranklin.2025.107929
DO - 10.1016/j.jfranklin.2025.107929
M3 - Article
AN - SCOPUS:105011582978
SN - 0016-0032
VL - 362
JO - Journal of the Franklin Institute
JF - Journal of the Franklin Institute
IS - 13
M1 - 107929
ER -