Reinforcement-Learning-Based Counter Deception for Nonlinear Pursuit-Evasion Game With Incomplete and Asymmetric Information

Yongkang Wang; Rongxin Cui; Weisheng Yan; Xinxin Guo; Shouxu Zhang; Zhuo Zhang; Zhexuan Zhao

doi:10.1109/TSMC.2025.3541105

School of Marine Science and Technology

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

In this article, we investigate the problem of capturing a noncooperative target with deception behavior using reinforcement learning (RL) under incomplete information. The pursuer copes not only with its maneuverability constraint but also with the target's deception behavior, in which the target deliberately conceals its private preference information. The target capture game involving deception behavior is formulated as a nonlinear differential game framework where the information structure is incomplete and asymmetric. The solution to this differential game is proposed based on an RL policy that incorporates critic, actor, and virtual actor neural networks (NNs), when taking into consideration the maneuverability constraint and information structure of the pursuer. Moreover, the states of the constrained adversarial system and the weight errors are proven to be ultimately uniformly bounded (UUB). To counter the deception of the target, we adopt an unscented Kalman filter (UKF) to obtain the target intention on energy preference, and integrate it into the pursuer strategy. The feasibility of the proposed strategy and its superiority are verified through comparisons with recent works.

Original language	English
Journal	IEEE Transactions on Systems, Man, and Cybernetics: Systems
DOI	https://doi.org/10.1109/TSMC.2025.3541105
Publication status	Accepted/In press - 2025

Cite this

@article{876ead8f04d24e968e18462b0ebbbf6f,

title = "Reinforcement-Learning-Based Counter Deception for Nonlinear Pursuit-Evasion Game With Incomplete and Asymmetric Information",

abstract = "In this article, we investigate the problem of capturing a noncooperative target with deception behavior using reinforcement learning (RL) under incomplete information. The pursuer copes not only with its maneuverability constraint but also with the target's deception behavior, in which the target deliberately conceals its private preference information. The target capture game involving deception behavior is formulated as a nonlinear differential game framework where the information structure is incomplete and asymmetric. The solution to this differential game is proposed based on an RL policy that incorporates critic, actor, and virtual actor neural networks (NNs), when taking into consideration the maneuverability constraint and information structure of the pursuer. Moreover, the states of the constrained adversarial system and the weight errors are proven to be ultimately uniformly bounded (UUB). To counter the deception of the target, we adopt unscented Kalman filter (UKF) to obtain the target intention on energy preference, and integrate it into the pursuer strategy. The feasibility of the proposed strategy and its superiority are verified through comparisons with recent works.",

keywords = "Deception behavior, differential game, incomplete and asymmetric information, maneuverability constraint, pursuit-evasion, reinforcement learning (RL)",

author = "Yongkang Wang and Rongxin Cui and Weisheng Yan and Xinxin Guo and Shouxu Zhang and Zhuo Zhang and Zhexuan Zhao",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2025",

doi = "10.1109/TSMC.2025.3541105",

language = "English",

journal = "IEEE Transactions on Systems, Man, and Cybernetics: Systems",

issn = "2168-2216",

publisher = "IEEE Advancing Technology for Humanity",

}

TY - JOUR

T1 - Reinforcement-Learning-Based Counter Deception for Nonlinear Pursuit-Evasion Game With Incomplete and Asymmetric Information

AU - Wang, Yongkang

AU - Cui, Rongxin

AU - Yan, Weisheng

AU - Guo, Xinxin

AU - Zhang, Shouxu

AU - Zhang, Zhuo

AU - Zhao, Zhexuan

PY - 2025

Y1 - 2025

N2 - In this article, we investigate the problem of capturing a noncooperative target with deception behavior using reinforcement learning (RL) under incomplete information. The pursuer copes not only with its maneuverability constraint but also with the target's deception behavior, in which the target deliberately conceals its private preference information. The target capture game involving deception behavior is formulated as a nonlinear differential game framework where the information structure is incomplete and asymmetric. The solution to this differential game is proposed based on an RL policy that incorporates critic, actor, and virtual actor neural networks (NNs), when taking into consideration the maneuverability constraint and information structure of the pursuer. Moreover, the states of the constrained adversarial system and the weight errors are proven to be ultimately uniformly bounded (UUB). To counter the deception of the target, we adopt unscented Kalman filter (UKF) to obtain the target intention on energy preference, and integrate it into the pursuer strategy. The feasibility of the proposed strategy and its superiority are verified through comparisons with recent works.

KW - Deception behavior

KW - differential game

KW - incomplete and asymmetric information

KW - maneuverability constraint

KW - pursuit-evasion

KW - reinforcement learning (RL)

UR - http://www.scopus.com/inward/record.url?scp=85219542085&partnerID=8YFLogxK

U2 - 10.1109/TSMC.2025.3541105

DO - 10.1109/TSMC.2025.3541105

M3 - Article

AN - SCOPUS:85219542085

SN - 2168-2216

JO - IEEE Transactions on Systems, Man, and Cybernetics: Systems

JF - IEEE Transactions on Systems, Man, and Cybernetics: Systems

ER -
