TY - GEN
T1 - Consistency Intelligence Control Method for Coordinated Pursuit of Non-Cooperative Targets in Multi-Spacecraft Systems
AU - Liu, Suyi
AU - Cao, Xuyang
AU - Zhang, Tongshu
AU - Ning, Xin
AU - Lian, Xiaobin
N1 - Publisher Copyright: Copyright © 2025 by the International Astronautical Federation (IAF). All rights reserved.
PY - 2025
Y1 - 2025
N2 - With the rapid development of space technology, space missions have evolved from globally planned single-spacecraft operations to collaborative multi-spacecraft formations, clusters, and constellations, providing a robust foundation for tasks of higher complexity and larger spatial scale. On the cutting-edge topic of pursuing and encircling groups of non-cooperative targets, existing research has extensively explored cooperative interception by spacecraft clusters in orbit, yet conventional planning algorithms still face two major bottlenecks: (1) strongly coupled constraints make it difficult to handle dynamic allocation in concurrent pursuit-evasion scenarios involving multiple non-cooperative targets; (2) the traditional reliance on ground-based measurement and control systems for target information acquisition and command upload introduces significant time delays, making it difficult to adapt to the real-time evolution of the dynamic space environment. This study therefore proposes a multi-objective, multi-round pursuit-evasion method based on Relative Reachable Domains and the Multi-Agent Deep Deterministic Policy Gradient algorithm (RRD-MADDPG), which provides a theoretical framework for multi-agent pursuit-evasion problems in space. First, based on high-precision orbital dynamics models and spacecraft relative motion models, a multi-round impulse-based pursuit-evasion model is established between two clusters. Second, by combining the real-time orbital states of both clusters, the reachable domain of the trajectory within impulsive maneuver intervals is predicted, and a dynamic database of the Relative Reachable Domain (RRD) of the non-cooperative targets is constructed. Furthermore, an innovative "chase fill" reward mechanism is designed for the multi-agent pursuit-evasion model, which effectively mitigates the convergence problem caused by manual reward function design.
Finally, the RRD-MADDPG method is numerically validated in a typical multi-agent pursuit and evasion problem. Preliminary experimental results demonstrate that, compared with the traditional MADDPG method, the proposed approach achieves faster convergence and a higher target capture completion rate, highlighting its potential advantages in dynamic space pursuit-evasion scenarios.
AB - With the rapid development of space technology, space missions have evolved from globally planned single-spacecraft operations to collaborative multi-spacecraft formations, clusters, and constellations, providing a robust foundation for tasks of higher complexity and larger spatial scale. On the cutting-edge topic of pursuing and encircling groups of non-cooperative targets, existing research has extensively explored cooperative interception by spacecraft clusters in orbit, yet conventional planning algorithms still face two major bottlenecks: (1) strongly coupled constraints make it difficult to handle dynamic allocation in concurrent pursuit-evasion scenarios involving multiple non-cooperative targets; (2) the traditional reliance on ground-based measurement and control systems for target information acquisition and command upload introduces significant time delays, making it difficult to adapt to the real-time evolution of the dynamic space environment. This study therefore proposes a multi-objective, multi-round pursuit-evasion method based on Relative Reachable Domains and the Multi-Agent Deep Deterministic Policy Gradient algorithm (RRD-MADDPG), which provides a theoretical framework for multi-agent pursuit-evasion problems in space. First, based on high-precision orbital dynamics models and spacecraft relative motion models, a multi-round impulse-based pursuit-evasion model is established between two clusters. Second, by combining the real-time orbital states of both clusters, the reachable domain of the trajectory within impulsive maneuver intervals is predicted, and a dynamic database of the Relative Reachable Domain (RRD) of the non-cooperative targets is constructed. Furthermore, an innovative "chase fill" reward mechanism is designed for the multi-agent pursuit-evasion model, which effectively mitigates the convergence problem caused by manual reward function design.
Finally, the RRD-MADDPG method is numerically validated in a typical multi-agent pursuit and evasion problem. Preliminary experimental results demonstrate that, compared with the traditional MADDPG method, the proposed approach achieves faster convergence and a higher target capture completion rate, highlighting its potential advantages in dynamic space pursuit-evasion scenarios.
KW - Multi-agent deep deterministic policy gradient
KW - Multi-agent reinforcement learning
KW - Pursuit-Evasion Game
KW - Relative Reachable Domain
KW - Spacecraft Cluster Planning
UR - https://www.scopus.com/pages/publications/105036349852
U2 - 10.52202/083087-0130
DO - 10.52202/083087-0130
M3 - Conference contribution
AN - SCOPUS:105036349852
T3 - Proceedings of the International Astronautical Congress, IAC
SP - 1482
EP - 1493
BT - IAF Astrodynamics Symposium - Held at the 76th International Astronautical Congress, IAC 2025
PB - International Astronautical Federation, IAF
T2 - 2025 IAF Astrodynamics Symposium at the 76th International Astronautical Congress, IAC 2025
Y2 - 29 September 2025 through 3 October 2025
ER -