Air combat autonomous maneuver decision for one-on-one within visual range engagement base on robust multi-agent reinforcement learning

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

27 Scopus citations

Abstract

An autonomous maneuver decision-making algorithm for one-on-one, within-visual-range UCAV air combat is designed and implemented on the basis of a robust multi-agent reinforcement learning (MARL) framework. The algorithm addresses the failure of single-agent reinforcement learning to converge during training, which is caused by the unstable environment. It also addresses a shortcoming of the MADDPG algorithm: in a strongly competitive environment, MADDPG tends to learn a very fragile strategy that is tailored only to a specific equilibrium strategy. To counter this, a minimax module is introduced to obtain the expected perturbation, which locally approximates the worst-case perturbation by means of the gradient. Simulation tests of convergence and policy quality indicate that the algorithm is effective.
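
The gradient-based approximation of the worst-case perturbation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_critic` is a hypothetical quadratic stand-in for a learned centralized critic, the state is omitted, the gradient is estimated by finite differences rather than backpropagation, and the step size `alpha` is an arbitrary assumption. The sketch only shows the general idea of moving the opponent's action one small step in the direction that lowers the own agent's value.

```python
def toy_critic(own_action, opp_action):
    """Hypothetical stand-in for a learned critic Q(s, a_own, a_opp); state omitted."""
    # Higher values are better for the own agent.
    return -(own_action - 1.0) ** 2 + 0.5 * own_action * opp_action - 0.2 * opp_action ** 2


def grad_wrt_opponent(critic, own_action, opp_action, h=1e-5):
    """Central finite-difference estimate of dQ/d(a_opp)."""
    return (critic(own_action, opp_action + h) - critic(own_action, opp_action - h)) / (2 * h)


def worst_case_opponent_action(critic, own_action, opp_action, alpha=0.1):
    """One gradient step that locally approximates the worst-case perturbation."""
    g = grad_wrt_opponent(critic, own_action, opp_action)
    epsilon = -alpha * g  # step the opponent's action in the direction that lowers Q
    return opp_action + epsilon


# Assumed example actions (scalars for simplicity).
own, opp = 0.8, 0.3
perturbed = worst_case_opponent_action(toy_critic, own, opp)
q_nominal = toy_critic(own, opp)       # value against the opponent's nominal action
q_robust = toy_critic(own, perturbed)  # value against the locally worst-case action
```

In a robust training loop of this kind, the critic target would be evaluated at the perturbed opponent action rather than the nominal one, so that the learned policy is judged against a locally adversarial opponent.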

Original language: English
Title of host publication: 2020 IEEE 16th International Conference on Control and Automation, ICCA 2020
Publisher: IEEE Computer Society
Pages: 506-512
Number of pages: 7
ISBN (Electronic): 9781728190938
State: Published - 9 Oct 2020
Event: 16th IEEE International Conference on Control and Automation, ICCA 2020 - Virtual, Sapporo, Hokkaido, Japan
Duration: 9 Oct 2020 to 11 Oct 2020

Publication series

Name: IEEE International Conference on Control and Automation, ICCA
Volume: 2020-October
ISSN (Print): 1948-3449
ISSN (Electronic): 1948-3457

Conference

Conference: 16th IEEE International Conference on Control and Automation, ICCA 2020
Country/Territory: Japan
City: Virtual, Sapporo, Hokkaido
Period: 9/10/20 to 11/10/20

Keywords

  • Air combat
  • Maneuver strategy
  • Reinforcement learning
  • Robust MADDPG
