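Reinforcement Learning-Based Terminal Constrained Guidance Law for Air-to-Ground Missiles Based on Dimensionless Models (基于无量纲模型的空地导弹强化学习制导律)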

Xiaoyang Huang; Jun Zhou; Bin Zhao; Xinpeng Xu; Yuheng Shen

doi:10.3873/j.issn.1000-1328.2024.09.010

School of Astronautics

Research output: Contribution to journal › Article › peer-review

Abstract

To address the terminal-angle guidance problem in air-to-ground missile strikes, a reinforcement learning approach based on dimensionless modeling and terminal rewards is presented. By establishing a dimensionless model from the missile flight dynamics, the method reduces the size of the state and observation spaces in the reinforcement learning environment, improving training efficiency for angle-constrained guidance. It adopts a terminal-reward strategy that accounts for both hit accuracy and attack angle, circumventing the reward sparsity problem in conventional reinforcement learning. Using the deep deterministic policy gradient algorithm, it trains a guidance law optimized for inputs in typical scenarios. Simulation results indicate that the method outperforms existing ones in hit accuracy and attack-angle accuracy, demands less overload, and effectively addresses the high computational requirements and low efficiency of current reinforcement learning guidance techniques, demonstrating its potential for practical application.

Translated title of the contribution	Reinforcement Learning-Based Terminal Constrained Guidance Law for Air-to-Ground Missiles Based on Dimensionless Models
Original language	Chinese (Traditional)
Pages (from-to)	1445-1455
Number of pages	11
Journal	Yuhang Xuebao/Journal of Astronautics
Volume	45
Issue number	9
DOI	https://doi.org/10.3873/j.issn.1000-1328.2024.09.010
Publication status	Published - Sep 2024

Keywords

Attack angle constraint
Deep deterministic policy gradient algorithm (DDPG)
Deep reinforcement learning (DRL)
Dimensionless model
Terminal reward function

Access to document

10.3873/j.issn.1000-1328.2024.09.010

Other files and links

Link to publication in Scopus

Cite this

@article{f9871752bedf4832969daccc43f09601,

title = "基于无量纲模型的空地导弹强化学习制导律",

abstract = "To address the terminal-angle guidance problem in air-to-ground missile strikes, a reinforcement learning approach based on dimensionless modeling and terminal rewards is presented. By establishing a dimensionless model from the missile flight dynamics, the method reduces the size of the state and observation spaces in the reinforcement learning environment, improving training efficiency for angle-constrained guidance. It adopts a terminal-reward strategy that accounts for both hit accuracy and attack angle, circumventing the reward sparsity problem in conventional reinforcement learning. Using the deep deterministic policy gradient algorithm, it trains a guidance law optimized for inputs in typical scenarios. Simulation results indicate that the method outperforms existing ones in hit accuracy and attack-angle accuracy, demands less overload, and effectively addresses the high computational requirements and low efficiency of current reinforcement learning guidance techniques, demonstrating its potential for practical application.",

keywords = "Attack angle constraint, Deep deterministic policy gradient algorithm (DDPG), Deep reinforcement learning (DRL), Dimensionless model, Terminal reward function",

author = "Xiaoyang Huang and Jun Zhou and Bin Zhao and Xinpeng Xu and Yuheng Shen",

year = "2024",

month = sep,

doi = "10.3873/j.issn.1000-1328.2024.09.010",

language = "Chinese (Traditional)",

volume = "45",

pages = "1445--1455",

journal = "Yuhang Xuebao/Journal of Astronautics",

issn = "1000-1328",

publisher = "Chinese Society of Astronautics",

number = "9",

}

TY - JOUR

T1 - 基于无量纲模型的空地导弹强化学习制导律

AU - Huang, Xiaoyang

AU - Zhou, Jun

AU - Zhao, Bin

AU - Xu, Xinpeng

AU - Shen, Yuheng

PY - 2024/9

Y1 - 2024/9

N2 - To address the terminal-angle guidance problem in air-to-ground missile strikes, a reinforcement learning approach based on dimensionless modeling and terminal rewards is presented. By establishing a dimensionless model from the missile flight dynamics, the method reduces the size of the state and observation spaces in the reinforcement learning environment, improving training efficiency for angle-constrained guidance. It adopts a terminal-reward strategy that accounts for both hit accuracy and attack angle, circumventing the reward sparsity problem in conventional reinforcement learning. Using the deep deterministic policy gradient algorithm, it trains a guidance law optimized for inputs in typical scenarios. Simulation results indicate that the method outperforms existing ones in hit accuracy and attack-angle accuracy, demands less overload, and effectively addresses the high computational requirements and low efficiency of current reinforcement learning guidance techniques, demonstrating its potential for practical application.

AB - To address the terminal-angle guidance problem in air-to-ground missile strikes, a reinforcement learning approach based on dimensionless modeling and terminal rewards is presented. By establishing a dimensionless model from the missile flight dynamics, the method reduces the size of the state and observation spaces in the reinforcement learning environment, improving training efficiency for angle-constrained guidance. It adopts a terminal-reward strategy that accounts for both hit accuracy and attack angle, circumventing the reward sparsity problem in conventional reinforcement learning. Using the deep deterministic policy gradient algorithm, it trains a guidance law optimized for inputs in typical scenarios. Simulation results indicate that the method outperforms existing ones in hit accuracy and attack-angle accuracy, demands less overload, and effectively addresses the high computational requirements and low efficiency of current reinforcement learning guidance techniques, demonstrating its potential for practical application.

KW - Attack angle constraint

KW - Deep deterministic policy gradient algorithm (DDPG)

KW - Deep reinforcement learning (DRL)

KW - Dimensionless model

KW - Terminal reward function

UR - http://www.scopus.com/inward/record.url?scp=85214465053&partnerID=8YFLogxK

U2 - 10.3873/j.issn.1000-1328.2024.09.010

DO - 10.3873/j.issn.1000-1328.2024.09.010

M3 - Article

AN - SCOPUS:85214465053

SN - 1000-1328

VL - 45

SP - 1445

EP - 1455

JO - Yuhang Xuebao/Journal of Astronautics

JF - Yuhang Xuebao/Journal of Astronautics

IS - 9

ER -
