Game Maneuver Decision-Making for Multi-UAV via PPO-A3C-PER Learning Method

Beibei Qiao, Zhenshuai Jia, Bing Xiao, Hanyu Qian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

To address long training times, poor flexibility of unmanned aerial vehicles (UAVs), and inefficient use of experience pool samples in deep reinforcement learning for multi-UAV systems, a multi-UAV maneuver decision-making method based on continuous strategic action sets is proposed. The PPO-A3C-PER algorithm is proposed to shorten the long training time of the PPO algorithm. Four intelligent maneuvering strategies are proposed to address sluggish UAV performance in the multi-UAV game, and corresponding reward functions are designed for the four strategic behaviors: reconnaissance, pursuit, encirclement, and expulsion. With these, UAVs can complete roundup tasks in different scenarios, and the reinforcement learning algorithm based on Prioritized Experience Replay and the Asynchronous Advantage Actor-Critic method effectively improves the utilization efficiency of samples in the experience pool. Simulation results show that, in the same environment, the algorithm converges faster than the PPO algorithm in the training phase: training time is shortened by 39.71% and the targeting rate is improved by 26.32%.
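
The abstract does not give implementation details, so the following is only a minimal sketch of how Prioritized Experience Replay (PER) generally improves sample utilization: transitions with larger temporal-difference error are replayed more often, and importance-sampling weights correct the resulting bias. This is the standard proportional variant, not the authors' code. The class name, the `alpha`, `beta`, and `eps` parameters, and the list-based storage are all assumptions made for illustration.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized experience replay (illustrative sketch only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha    # assumed exponent: 0 gives uniform sampling
        self.eps = eps        # assumed offset keeping every priority positive
        self.items = []
        self.priorities = []
        self.next_slot = 0    # ring-buffer write position

    def add(self, transition):
        # New transitions take the current maximum priority so that
        # they are likely to be replayed soon after being stored.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        indices = random.choices(range(len(self.items)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the batch maximum.
        n = len(self.items)
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        batch = [self.items[i] for i in indices]
        return indices, batch, weights

    def update_priorities(self, indices, td_errors):
        # After a learning step, priorities track the new absolute TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In an asynchronous actor-critic setup, several workers would typically call `add` while the learner calls `sample` and `update_priorities`. A real implementation would need locking or per-worker buffers, and usually a sum-tree for faster sampling; both are omitted here.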

Original language: English
Title of host publication: Advances in Guidance, Navigation and Control - Proceedings of 2024 International Conference on Guidance, Navigation and Control Volume 9
Editors: Liang Yan, Haibin Duan, Yimin Deng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 72-81
Number of pages: 10
ISBN (Print): 9789819622313
State: Published - 2025
Event: International Conference on Guidance, Navigation and Control, ICGNC 2024 - Changsha, China
Duration: 9 Aug 2024 → 11 Aug 2024

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 1345 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: International Conference on Guidance, Navigation and Control, ICGNC 2024
Country/Territory: China
City: Changsha
Period: 9/08/24 → 11/08/24

Keywords

  • A3C
  • making
  • maneuver decision
  • maneuvering strategies
  • multiple
  • PER
  • PPO
  • reinforcement learning
  • UAV
