TY - JOUR
T1 - Reinforcement-Learning-Based Counter Deception for Nonlinear Pursuit-Evasion Game With Incomplete and Asymmetric Information
AU - Wang, Yongkang
AU - Cui, Rongxin
AU - Yan, Weisheng
AU - Guo, Xinxin
AU - Zhang, Shouxu
AU - Zhang, Zhuo
AU - Zhao, Zhexuan
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - In this article, we investigate the problem of capturing a noncooperative target with deception behavior using reinforcement learning (RL) under incomplete information. The pursuer must cope not only with its own maneuverability constraint but also with the target's deception behavior, in which the target deliberately conceals its private preference information. The target capture game involving deception behavior is formulated as a nonlinear differential game in which the information structure is incomplete and asymmetric. A solution to this differential game is proposed based on an RL policy that incorporates critic, actor, and virtual actor neural networks (NNs) and takes into account the pursuer's maneuverability constraint and information structure. Moreover, the states of the constrained adversarial system and the weight errors are proven to be uniformly ultimately bounded (UUB). To counter the target's deception, we adopt an unscented Kalman filter (UKF) to estimate the target's intention regarding its energy preference and integrate this estimate into the pursuer's strategy. The feasibility and superiority of the proposed strategy are verified through comparisons with recent works.
KW - Deception behavior
KW - differential game
KW - incomplete and asymmetric information
KW - maneuverability constraint
KW - pursuit-evasion
KW - reinforcement learning (RL)
UR - http://www.scopus.com/inward/record.url?scp=85219542085&partnerID=8YFLogxK
U2 - 10.1109/TSMC.2025.3541105
DO - 10.1109/TSMC.2025.3541105
M3 - Article
AN - SCOPUS:85219542085
SN - 2168-2216
JO - IEEE Transactions on Systems, Man, and Cybernetics: Systems
JF - IEEE Transactions on Systems, Man, and Cybernetics: Systems
ER -