Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning

Zhenrui Lv; Yifan Hu; Zijing Tian; Bin Fu; Hongguang Ren; Wenxing Fu

doi:10.1007/978-981-96-3568-9_42

Unmanned System Technology Research Institute

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract

Existing collaborative guidance laws commonly assume small-angle relationships and neglect high-order terms in the expansion of the remaining time (time-to-go). In response, this paper proposes a guidance law structure that combines a traditional guidance law with a collaborative correction term, and trains the correction term with reinforcement learning. The paper also constructs a guided pre-training algorithm based on offline reinforcement learning, combined with the twin delayed deep deterministic policy gradient (TD3) algorithm. Through delayed updates and a comparison between critics, training iterations are fast and efficient, and the overestimation problem in the reinforcement learning process is effectively addressed. Simulation results show that the reinforcement learning collaborative guidance law trained with the designed framework offers wider applicability and higher time collaboration accuracy.
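The two ideas named in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the abstract does not say which traditional guidance law is used, so proportional navigation is assumed here as the baseline, and every function name, parameter and numeric value below is hypothetical. The first half shows a baseline command plus a learned correction term; the second half shows the two standard TD3 mechanisms, a target built from the smaller of two critic estimates and an actor that is updated less often than the critics.

```python
def pn_command(nav_gain, closing_speed, los_rate):
    """Baseline law (assumed here): proportional navigation, a = N * Vc * lambda_dot."""
    return nav_gain * closing_speed * los_rate


def composite_command(nav_gain, closing_speed, los_rate, correction, a_max):
    """Baseline command plus a learned collaborative correction, saturated at +/- a_max.

    `correction` stands in for the output of a trained policy; it is a plain
    number here so that the sketch stays self-contained.
    """
    a = pn_command(nav_gain, closing_speed, los_rate) + correction
    return max(-a_max, min(a_max, a))


def td3_target(reward, done, gamma, q1_next, q2_next):
    """Clipped double-Q target: bootstrap from the smaller of the two critic estimates.

    Taking the minimum counteracts the upward bias that a single critic
    tends to accumulate.
    """
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * min(q1_next, q2_next)


def should_update_actor(step, policy_delay=2):
    """Delayed policy update: refresh the actor once every `policy_delay` critic updates."""
    return step % policy_delay == 0


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    print(composite_command(3.0, 300.0, 0.01, correction=2.0, a_max=50.0))
    print(td3_target(reward=1.0, done=False, gamma=0.99, q1_next=10.0, q2_next=8.0))
    print([should_update_actor(s) for s in range(4)])
```

In a full training loop the critics would be updated at every step against `td3_target`, while the actor and the target networks would change only when `should_update_actor` returns true.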

Original language	English
Title of host publication	Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024
Editors	Lianqing Liu, Yifeng Niu, Wenxing Fu, Yi Qu
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	443-453
Number of pages	11
ISBN (Print)	9789819635672
DOI	https://doi.org/10.1007/978-981-96-3568-9_42
Publication status	Published - 2025
Event	4th International Conference on Autonomous Unmanned Systems, ICAUS 2024 - Shenyang, China. Duration: 19 Sept 2024 → 21 Sept 2024

Publication series

Name	Lecture Notes in Electrical Engineering
Volume	1377 LNEE
ISSN (Print)	1876-1100
ISSN (Electronic)	1876-1119

Conference

Conference	4th International Conference on Autonomous Unmanned Systems, ICAUS 2024
Country/Territory	China
City	Shenyang
Period	19/09/24 → 21/09/24

Cite this

Lv, Z., Hu, Y., Tian, Z., Fu, B., Ren, H., & Fu, W. (2025). Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning. In L. Liu, Y. Niu, W. Fu, & Y. Qu (Eds.), Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024 (pp. 443-453). (Lecture Notes in Electrical Engineering; Vol. 1377 LNEE). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-96-3568-9_42

Lv, Zhenrui ; Hu, Yifan ; Tian, Zijing et al. / Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning. Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024. editor / Lianqing Liu ; Yifeng Niu ; Wenxing Fu ; Yi Qu. Springer Science and Business Media Deutschland GmbH, 2025. pp. 443-453 (Lecture Notes in Electrical Engineering).

@inproceedings{df0f1a5fbe324abe9fc28acbfa01e268,

title = "Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning",

abstract = "Existing collaborative guidance laws commonly assume small-angle relationships and neglect high-order terms in the expansion of the remaining time (time-to-go). In response, this paper proposes a guidance law structure that combines a traditional guidance law with a collaborative correction term, and trains the correction term with reinforcement learning. The paper also constructs a guided pre-training algorithm based on offline reinforcement learning, combined with the twin delayed deep deterministic policy gradient (TD3) algorithm. Through delayed updates and a comparison between critics, training iterations are fast and efficient, and the overestimation problem in the reinforcement learning process is effectively addressed. Simulation results show that the reinforcement learning collaborative guidance law trained with the designed framework offers wider applicability and higher time collaboration accuracy.",

keywords = "Collaborative guidance, Reinforcement learning, Time collaboration",

author = "Zhenrui Lv and Yifan Hu and Zijing Tian and Bin Fu and Hongguang Ren and Wenxing Fu",

note = "Publisher Copyright: {\textcopyright} Beijing HIWING Scientific and Technological Information Institute 2025.; 4th International Conference on Autonomous Unmanned Systems, ICAUS 2024 ; Conference date: 19-09-2024 Through 21-09-2024",

year = "2025",

doi = "10.1007/978-981-96-3568-9_42",

language = "English",

isbn = "9789819635672",

series = "Lecture Notes in Electrical Engineering",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "443--453",

editor = "Lianqing Liu and Yifeng Niu and Wenxing Fu and Yi Qu",

booktitle = "Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024",

}

Lv, Z, Hu, Y, Tian, Z, Fu, B, Ren, H & Fu, W 2025, Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning. in L Liu, Y Niu, W Fu & Y Qu (eds), Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024. Lecture Notes in Electrical Engineering, vol. 1377 LNEE, Springer Science and Business Media Deutschland GmbH, pp. 443-453, 4th International Conference on Autonomous Unmanned Systems, ICAUS 2024, Shenyang, China, 19/09/24. https://doi.org/10.1007/978-981-96-3568-9_42

Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning. / Lv, Zhenrui; Hu, Yifan; Tian, Zijing et al.
Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024. editor / Lianqing Liu; Yifeng Niu; Wenxing Fu; Yi Qu. Springer Science and Business Media Deutschland GmbH, 2025. pp. 443-453 (Lecture Notes in Electrical Engineering; Vol. 1377 LNEE).

TY - GEN

T1 - Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning

AU - Lv, Zhenrui

AU - Hu, Yifan

AU - Tian, Zijing

AU - Fu, Bin

AU - Ren, Hongguang

AU - Fu, Wenxing

N1 - Publisher Copyright: © Beijing HIWING Scientific and Technological Information Institute 2025.

PY - 2025

Y1 - 2025

AB - Existing collaborative guidance laws commonly assume small-angle relationships and neglect high-order terms in the expansion of the remaining time (time-to-go). In response, this paper proposes a guidance law structure that combines a traditional guidance law with a collaborative correction term, and trains the correction term with reinforcement learning. The paper also constructs a guided pre-training algorithm based on offline reinforcement learning, combined with the twin delayed deep deterministic policy gradient (TD3) algorithm. Through delayed updates and a comparison between critics, training iterations are fast and efficient, and the overestimation problem in the reinforcement learning process is effectively addressed. Simulation results show that the reinforcement learning collaborative guidance law trained with the designed framework offers wider applicability and higher time collaboration accuracy.

KW - Collaborative guidance

KW - Reinforcement learning

KW - Time collaboration

UR - http://www.scopus.com/inward/record.url?scp=105003303687&partnerID=8YFLogxK

U2 - 10.1007/978-981-96-3568-9_42

DO - 10.1007/978-981-96-3568-9_42

M3 - Conference contribution

AN - SCOPUS:105003303687

SN - 9789819635672

T3 - Lecture Notes in Electrical Engineering

SP - 443

EP - 453

BT - Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024

A2 - Liu, Lianqing

A2 - Niu, Yifeng

A2 - Fu, Wenxing

A2 - Qu, Yi

PB - Springer Science and Business Media Deutschland GmbH

T2 - 4th International Conference on Autonomous Unmanned Systems, ICAUS 2024

Y2 - 19 September 2024 through 21 September 2024

ER -

Lv Z, Hu Y, Tian Z, Fu B, Ren H, Fu W. Collaborative Guidance Algorithm Based on Offline Pre-training and Online Reinforcement Learning. In Liu L, Niu Y, Fu W, Qu Y, editors, Proceedings of 4th 2024 International Conference on Autonomous Unmanned Systems, 4th ICAUS 2024. Springer Science and Business Media Deutschland GmbH. 2025. p. 443-453. (Lecture Notes in Electrical Engineering). doi: 10.1007/978-981-96-3568-9_42
