CBPE-Based Assignment Policy for Multiple Player Pursuit-Evasion Game by D3QN Algorithm Applied to Uncrewed Vehicles

Research output: Contribution to journal › Article › peer-review

Abstract

This article proposes novel assignment policies based on constant-bearing and pure-evading control policies (CBPE-based assignment policies) for a multiple player pursuit-evasion (MP-PE) game, trained by a dueling double deep Q-learning (D3QN) algorithm and applied to uncrewed vehicles (UVs). In the MP-PE game, the D3QN algorithm, which incorporates deep, dueling, and double Q-learning methods, is used to train CBPE-based assignment policies for both the pursuer and evader teams. The goal of this article is to derive approximate Nash equilibrium (NE) CBPE-based assignment policies, trained by the D3QN algorithm, that are equivalent to NE control policies for both teams in the MP-PE game. To this end, sufficient conditions are provided for analyzing the equivalence between NE CBPE-based assignment policies and NE control policies. Furthermore, under the contraction mapping condition, the convergence and optimality of the D3QN algorithm are established for NE CBPE-based assignment policies of the MP-PE game. Finally, illustrative examples and experiments demonstrate the feasibility and usefulness of the proposed NE CBPE-based assignment policies trained by the D3QN algorithm.
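As a rough illustration of the two ingredients that D3QN adds on top of deep Q-learning, the sketch below shows the standard dueling aggregation and the standard double Q-learning target in plain Python. It is not the article's implementation: the function names, the three-action example, and all numeric values are hypothetical, and treating each action index as one candidate assignment is an assumption made only for this example.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    """Double Q-learning target.

    The online network selects the greedy next action and the target
    network evaluates it, which reduces overestimation bias.
    """
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    return reward + gamma * q_target_next[best]


# Hypothetical numbers: each index stands for one candidate assignment
# (an assumption for illustration, not taken from the article).
q_online_next = dueling_q(1.0, [0.5, -0.5, 0.0])
q_target_next = dueling_q(0.8, [0.1, 0.3, -0.4])
y = double_q_target(reward=-1.0, gamma=0.9,
                    q_online_next=q_online_next,
                    q_target_next=q_target_next,
                    done=False)
```

In a full training loop, `y` would serve as the regression target for the online network's Q-value of the action actually taken, with the target network's weights synchronized periodically.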

Original language: English
Journal: IEEE Transactions on Industrial Electronics
State: Accepted/In press - 2026

Keywords

  • Dueling double deep Q-learning (D3QN) algorithm
  • multiple player pursuit-evasion (MP-PE) game
  • Nash equilibrium (NE) solutions
  • uncrewed vehicles (UVs)
