Reinforcement learning-based decision-making for spacecraft pursuit-evasion game in elliptical orbits

Weizhuo Yu; Chuang Liu; Xiaokui Yue

doi:10.1016/j.conengprac.2024.106072

School of Astronautics

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Orbital game theory is a fundamental technology for cleaning up space debris and improving the safety of operational spacecraft. This work therefore develops a reinforcement learning-based decision-making method for the pursuit-evasion game in elliptical orbits. The linearized Tschauner-Hempel equation describes the spacecraft's motion, and the problem is formulated using game theory. An impulsive maneuvering model in a complete three-dimensional elliptical orbit is then established, and an algorithm based on deep deterministic policy gradient is designed to solve for the optimal strategy of the pursuit-evasion game. To support successful decisions by the pursuer, an extensive reward function is designed and improved, considering the shortest time, optimal fuel, and collision avoidance. Finally, numerical simulations of a pursuit-evasion mission are performed to demonstrate the effectiveness and superiority of the proposed decision-making algorithm. The game success rate of the algorithm against targets with different maneuvering abilities is verified, which implies that the algorithm can be applied in extended scenarios.

Original language	English
Article number	106072
Journal	Control Engineering Practice
Volume	153
DOI	https://doi.org/10.1016/j.conengprac.2024.106072
Publication status	Published - Dec 2024

Cite this

@article{b0144ce95989474d9ad2d30779bb4ff1,

title = "Reinforcement learning-based decision-making for spacecraft pursuit-evasion game in elliptical orbits",

abstract = "The orbital game theory is a fundamental technology for the cleanup of space debris to improve the safety of useful spacecraft in future, thus, this work develops a decision-making method by reinforcement learning technology to implement the pursuit-evasion game in elliptical orbits. The linearized Tschauner-Hempel equation describes the spacecraft's motion and the problem is formulated by game theory. Subsequently, an impulsive maneuvering model in a complete three-dimensional elliptical orbit is established. Then an algorithm based on deep deterministic policy gradient is designed to solve the optimal strategy for the pursuit-evasion game. For the successful decision of the pursuer, an extensive reward function is designed and improved considering the shortest time, optimal fuel, and collision avoidance. Finally, numerical simulations of a pursuit-evasion mission are performed to demonstrate the effectiveness and superiority of the proposed decision-making algorithm. The game success rate of the algorithm against targets with different maneuvering abilities is verified, which implies that the algorithm can be applied in extended scenarios.",

keywords = "Decision making, Deep deterministic policy gradient, Elliptical orbit, Impulsive maneuver, Pursuit-evasion game",

author = "Weizhuo Yu and Chuang Liu and Xiaokui Yue",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Ltd",

year = "2024",

month = dec,

doi = "10.1016/j.conengprac.2024.106072",

language = "English",

volume = "153",

journal = "Control Engineering Practice",

issn = "0967-0661",

publisher = "Elsevier Ltd",

}

TY - JOUR

T1 - Reinforcement learning-based decision-making for spacecraft pursuit-evasion game in elliptical orbits

AU - Yu, Weizhuo

AU - Liu, Chuang

AU - Yue, Xiaokui

PY - 2024/12

Y1 - 2024/12

N2 - The orbital game theory is a fundamental technology for the cleanup of space debris to improve the safety of useful spacecraft in future, thus, this work develops a decision-making method by reinforcement learning technology to implement the pursuit-evasion game in elliptical orbits. The linearized Tschauner-Hempel equation describes the spacecraft's motion and the problem is formulated by game theory. Subsequently, an impulsive maneuvering model in a complete three-dimensional elliptical orbit is established. Then an algorithm based on deep deterministic policy gradient is designed to solve the optimal strategy for the pursuit-evasion game. For the successful decision of the pursuer, an extensive reward function is designed and improved considering the shortest time, optimal fuel, and collision avoidance. Finally, numerical simulations of a pursuit-evasion mission are performed to demonstrate the effectiveness and superiority of the proposed decision-making algorithm. The game success rate of the algorithm against targets with different maneuvering abilities is verified, which implies that the algorithm can be applied in extended scenarios.

AB - The orbital game theory is a fundamental technology for the cleanup of space debris to improve the safety of useful spacecraft in future, thus, this work develops a decision-making method by reinforcement learning technology to implement the pursuit-evasion game in elliptical orbits. The linearized Tschauner-Hempel equation describes the spacecraft's motion and the problem is formulated by game theory. Subsequently, an impulsive maneuvering model in a complete three-dimensional elliptical orbit is established. Then an algorithm based on deep deterministic policy gradient is designed to solve the optimal strategy for the pursuit-evasion game. For the successful decision of the pursuer, an extensive reward function is designed and improved considering the shortest time, optimal fuel, and collision avoidance. Finally, numerical simulations of a pursuit-evasion mission are performed to demonstrate the effectiveness and superiority of the proposed decision-making algorithm. The game success rate of the algorithm against targets with different maneuvering abilities is verified, which implies that the algorithm can be applied in extended scenarios.

KW - Decision making

KW - Deep deterministic policy gradient

KW - Elliptical orbit

KW - Impulsive maneuver

KW - Pursuit-evasion game

UR - http://www.scopus.com/inward/record.url?scp=85203021983&partnerID=8YFLogxK

U2 - 10.1016/j.conengprac.2024.106072

DO - 10.1016/j.conengprac.2024.106072

M3 - Article

AN - SCOPUS:85203021983

SN - 0967-0661

VL - 153

JO - Control Engineering Practice

JF - Control Engineering Practice

M1 - 106072

ER -
