Transfer reinforcement learning for multi-agent pursuit-evasion differential game with obstacles in a continuous environment

Penglin Hu, Quan Pan, Chunhui Zhao, Yaning Guo

Research output: Contribution to journal › Article › peer-review


Abstract

In this paper, we study the multi-pursuer single-evader pursuit-evasion (MSPE) differential game in a continuous environment with obstacles. We propose a novel pursuit-evasion algorithm based on reinforcement learning and transfer learning. In the source-task learning stage, we employ Q-learning with value function approximation to overcome the large storage requirements of the conventional Q-table solution method: approximating the value function extends the discrete state space to a continuous one and substantially reduces the demand for storage. In the target-task learning stage, we use a Gaussian mixture model (GMM) to classify the source tasks, and the source policies whose corresponding state-value sets have the highest probability densities are assigned to the agent in the target task for learning. This not only avoids negative transfer but also improves the algorithm's generalization ability and convergence speed. Simulations and experiments demonstrate the algorithm's effectiveness.
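The abstract's first idea, replacing a Q-table with a parameterized value function so that a continuous state space needs only a small weight vector, can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the paper's implementation: a quadratic feature basis, a 2-D relative-position state, four discrete pursuit actions, and a plain gradient TD update rather than the recursive-least-squares variant the keywords suggest.

```python
import numpy as np

N_ACTIONS = 4   # e.g. four pursuit headings (illustrative choice)
STATE_DIM = 2   # e.g. pursuer-evader relative position (illustrative choice)

def phi(state, action):
    """Per-action block of a quadratic basis over the continuous state."""
    s = np.asarray(state, dtype=float)
    basis = np.concatenate(([1.0], s, s ** 2))   # bias + linear + quadratic
    out = np.zeros(N_ACTIONS * basis.size)
    out[action * basis.size:(action + 1) * basis.size] = basis
    return out

def q(w, state, action):
    """Approximate action value Q(s, a) ~= w . phi(s, a)."""
    return w @ phi(state, action)

def td_update(w, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning step applied to the weights instead of a table entry."""
    target = r + gamma * max(q(w, s_next, b) for b in range(N_ACTIONS))
    w += alpha * (target - q(w, s, a)) * phi(s, a)
    return w

# One illustrative update from a hypothetical transition.
w = np.zeros(N_ACTIONS * (1 + 2 * STATE_DIM))
w = td_update(w, s=[0.5, -0.2], a=1, r=1.0, s_next=[0.4, -0.1])
```

The storage point from the abstract is visible directly: the table-free representation needs only `N_ACTIONS * (1 + 2 * STATE_DIM)` weights, however finely the continuous state varies.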
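The target-task step, choosing the source policy whose state-value set scores highest under a fitted GMM, can also be sketched. This is a toy selection rule under stated assumptions: the mixture parameters are placeholders standing in for a GMM fitted elsewhere, diagonal covariances are used for brevity, and scoring each task by the mixture density of its mean value vector is one simple reading of "highest probability density", not the authors' exact criterion.

```python
import numpy as np

def gmm_log_density(x, weights, means, variances):
    """Log-density of x under a diagonal-covariance Gaussian mixture."""
    x = np.asarray(x, dtype=float)
    log_comp = []
    for w, mu, var in zip(weights, means, variances):
        mu, var = np.asarray(mu, float), np.asarray(var, float)
        ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        log_comp.append(np.log(w) + ll)
    m = max(log_comp)
    return m + np.log(sum(np.exp(c - m) for c in log_comp))  # log-sum-exp

def select_source_policy(source_value_sets, weights, means, variances):
    """Pick the source task whose mean state-value vector has the highest
    mixture density (illustrative stand-in for the paper's selection)."""
    scores = [gmm_log_density(np.mean(v, axis=0), weights, means, variances)
              for v in source_value_sets]
    return int(np.argmax(scores))

# Placeholder mixture parameters and two hypothetical source tasks.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
tasks = [np.array([[2.9, 3.1], [3.1, 2.9]]),  # near a mixture mode
         np.array([[10.0, 10.0]])]            # far from both modes
best = select_source_policy(tasks, weights, means, variances)
```

Picking the highest-density source is what guards against negative transfer in the abstract's account: a source task whose values fall far from every mixture mode (like the second task above) scores low and is never handed to the target-task agent.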

Original language: English
Pages (from-to): 2125-2140
Number of pages: 16
Journal: Asian Journal of Control
Volume: 26
Issue number: 4
DOIs
State: Published - Jul 2024

Keywords

  • Gaussian mixture model
  • Q-learning
  • pursuit-evasion game
  • recursive least squares
  • transfer reinforcement learning
  • value function approximation
