Memory-extraction-based DRL cooperative guidance against the maneuvering target protected by interceptors

Hao Sun; Shi Yan; Yan Liang; Chaoxiong Ma; Tao Zhang; Liuyu Pei

doi:10.1016/j.ast.2024.109575

School of Automation

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

This paper presents an open and interesting issue for missiles, i.e., achieving collaborative parameters constrained cooperative guidance, despite the interference of pursuing interceptors (INTs) and the maneuvering target, by the fact that the target-missile-interceptor (TMI) engagement induces their complex and time-varying relationships. The Memory-Extraction-based Soft-Actor-Critic (ME-SAC) approach is proposed, which enhances the collaborative performance of missiles by implicitly extracting coupling motion characteristics among TMI from historical state, achieving the joint optimization of situation awareness and strategy. Firstly, the cooperative guidance task is formulated as a multi-order Markov decision process (MOMDP) to better represent the dynamic evolution of engagement, and a memory-extraction process is introduced to alleviate the curse of dimensionality. Secondly, a memory-decision-oriented maximum entropy framework combined with memory update modules is designed for enhancing strategy search ability. Then, a domain-knowledge-based pre-training is implemented to improve convergence speed. Finally, in simulation evaluation with various scenarios, the proposed ME-SAC proves more promising than the typical DRL-based and model-based algorithms in task success rate, learning efficiency, and adaptability.

Original language	English
Article number	109575
Journal	Aerospace Science and Technology
Volume	155
DOI	https://doi.org/10.1016/j.ast.2024.109575
Publication status	Published - Dec 2024

Access to document

10.1016/j.ast.2024.109575

Other files and links

Link to publication in Scopus

Cite this

@article{b74659549efa46c591770bf84a8f5256,

title = "Memory-extraction-based DRL cooperative guidance against the maneuvering target protected by interceptors",

abstract = "This paper presents an open and interesting issue for missiles, i.e., achieving collaborative parameters constrained cooperative guidance, despite the interference of pursuing interceptors (INTs) and the maneuvering target, by the fact that the target-missile-interceptor (TMI) engagement induces their complex and time-varying relationships. The Memory-Extraction-based Soft-Actor-Critic (ME-SAC) approach is proposed, which enhances the collaborative performance of missiles by implicitly extracting coupling motion characteristics among TMI from historical state, achieving the joint optimization of situation awareness and strategy. Firstly, the cooperative guidance task is formulated as a multi-order Markov decision process (MOMDP) to better represent the dynamic evolution of engagement, and a memory-extraction process is introduced to alleviate the curse of dimensionality. Secondly, a memory-decision-oriented maximum entropy framework combined with memory update modules is designed for enhancing strategy search ability. Then, a domain-knowledge-based pre-training is implemented to improve convergence speed. Finally, in simulation evaluation with various scenarios, the proposed ME-SAC proves more promising than the typical DRL-based and model-based algorithms in task success rate, learning efficiency, and adaptability.",

keywords = "Cooperative guidance, Deep reinforcement learning, Maneuvering target, Missiles, Multi-order Markov decision process, Spatio-temporal memory extraction",

author = "Hao Sun and Shi Yan and Yan Liang and Chaoxiong Ma and Tao Zhang and Liuyu Pei",

note = "Publisher Copyright: {\textcopyright} 2024",

year = "2024",

month = dec,

doi = "10.1016/j.ast.2024.109575",

language = "English",

volume = "155",

journal = "Aerospace Science and Technology",

issn = "1270-9638",

publisher = "Elsevier Masson s.r.l.",

}

TY - JOUR

T1 - Memory-extraction-based DRL cooperative guidance against the maneuvering target protected by interceptors

AU - Sun, Hao

AU - Yan, Shi

AU - Liang, Yan

AU - Ma, Chaoxiong

AU - Zhang, Tao

AU - Pei, Liuyu

PY - 2024/12

Y1 - 2024/12

N2 - This paper presents an open and interesting issue for missiles, i.e., achieving collaborative parameters constrained cooperative guidance, despite the interference of pursuing interceptors (INTs) and the maneuvering target, by the fact that the target-missile-interceptor (TMI) engagement induces their complex and time-varying relationships. The Memory-Extraction-based Soft-Actor-Critic (ME-SAC) approach is proposed, which enhances the collaborative performance of missiles by implicitly extracting coupling motion characteristics among TMI from historical state, achieving the joint optimization of situation awareness and strategy. Firstly, the cooperative guidance task is formulated as a multi-order Markov decision process (MOMDP) to better represent the dynamic evolution of engagement, and a memory-extraction process is introduced to alleviate the curse of dimensionality. Secondly, a memory-decision-oriented maximum entropy framework combined with memory update modules is designed for enhancing strategy search ability. Then, a domain-knowledge-based pre-training is implemented to improve convergence speed. Finally, in simulation evaluation with various scenarios, the proposed ME-SAC proves more promising than the typical DRL-based and model-based algorithms in task success rate, learning efficiency, and adaptability.

AB - This paper presents an open and interesting issue for missiles, i.e., achieving collaborative parameters constrained cooperative guidance, despite the interference of pursuing interceptors (INTs) and the maneuvering target, by the fact that the target-missile-interceptor (TMI) engagement induces their complex and time-varying relationships. The Memory-Extraction-based Soft-Actor-Critic (ME-SAC) approach is proposed, which enhances the collaborative performance of missiles by implicitly extracting coupling motion characteristics among TMI from historical state, achieving the joint optimization of situation awareness and strategy. Firstly, the cooperative guidance task is formulated as a multi-order Markov decision process (MOMDP) to better represent the dynamic evolution of engagement, and a memory-extraction process is introduced to alleviate the curse of dimensionality. Secondly, a memory-decision-oriented maximum entropy framework combined with memory update modules is designed for enhancing strategy search ability. Then, a domain-knowledge-based pre-training is implemented to improve convergence speed. Finally, in simulation evaluation with various scenarios, the proposed ME-SAC proves more promising than the typical DRL-based and model-based algorithms in task success rate, learning efficiency, and adaptability.

KW - Cooperative guidance

KW - Deep reinforcement learning

KW - Maneuvering target

KW - Missiles

KW - Multi-order Markov decision process

KW - Spatio-temporal memory extraction

UR - http://www.scopus.com/inward/record.url?scp=85204774792&partnerID=8YFLogxK

U2 - 10.1016/j.ast.2024.109575

DO - 10.1016/j.ast.2024.109575

M3 - Article

AN - SCOPUS:85204774792

SN - 1270-9638

VL - 155

JO - Aerospace Science and Technology

JF - Aerospace Science and Technology

M1 - 109575

ER -
