TY - GEN
T1 - Scene Adaptive Persistent Target Tracking and Attack Method Based on Deep Reinforcement Learning
AU - Hao, Zhaotie
AU - Guo, Bin
AU - Li, Mengyuan
AU - Wu, Lie
AU - Yu, Zhiwen
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2023
Y1 - 2023
N2 - As intelligent devices integrating a range of advanced technologies, mobile robots have been widely used in defense and military applications because of their high degree of autonomy and flexibility, and they can independently track and attack dynamic targets. However, traditional tracking-and-attack algorithms are sensitive to changes in the external environment and lack portability and extensibility, whereas deep reinforcement learning can adapt to different environments owing to its strong learning and exploration ability. To pursue targets accurately and robustly, this paper proposes a solution based on a deep reinforcement learning algorithm. Addressing the low accuracy and low robustness of traditional dynamic target pursuit, this paper models the dynamic target tracking and attack problem of mobile robots as a Partially Observable Markov Decision Process (POMDP) and proposes a general-purpose, end-to-end deep reinforcement learning framework based on dual agents to track and attack targets accurately in different scenarios. To address the difficulty mobile robots face in accurately tracking targets while evading obstacles, this paper uses a partial zero-sum game to improve the reward function, providing implicit guidance for the attacker's pursuit of the target, and uses the asynchronous advantage actor-critic (A3C) algorithm to train models in parallel. Experiments show that the model can be transferred to different scenarios and generalizes well. Compared with the baseline method, the attacker's time to successfully destroy the target is reduced by up to 44.7% in the maze scene and up to 40.5% in the block scene, verifying the effectiveness of the proposed method. In addition, this paper analyzes the contribution of each component of the model through ablation experiments, demonstrating the effectiveness and necessity of each module and providing a theoretical basis for subsequent research.
AB - As intelligent devices integrating a range of advanced technologies, mobile robots have been widely used in defense and military applications because of their high degree of autonomy and flexibility, and they can independently track and attack dynamic targets. However, traditional tracking-and-attack algorithms are sensitive to changes in the external environment and lack portability and extensibility, whereas deep reinforcement learning can adapt to different environments owing to its strong learning and exploration ability. To pursue targets accurately and robustly, this paper proposes a solution based on a deep reinforcement learning algorithm. Addressing the low accuracy and low robustness of traditional dynamic target pursuit, this paper models the dynamic target tracking and attack problem of mobile robots as a Partially Observable Markov Decision Process (POMDP) and proposes a general-purpose, end-to-end deep reinforcement learning framework based on dual agents to track and attack targets accurately in different scenarios. To address the difficulty mobile robots face in accurately tracking targets while evading obstacles, this paper uses a partial zero-sum game to improve the reward function, providing implicit guidance for the attacker's pursuit of the target, and uses the asynchronous advantage actor-critic (A3C) algorithm to train models in parallel. Experiments show that the model can be transferred to different scenarios and generalizes well. Compared with the baseline method, the attacker's time to successfully destroy the target is reduced by up to 44.7% in the maze scene and up to 40.5% in the block scene, verifying the effectiveness of the proposed method. In addition, this paper analyzes the contribution of each component of the model through ablation experiments, demonstrating the effectiveness and necessity of each module and providing a theoretical basis for subsequent research.
KW - Deep Reinforcement Learning
KW - Dual agent
KW - Partial zero-sum game
KW - Target pursuit
UR - http://www.scopus.com/inward/record.url?scp=85161153255&partnerID=8YFLogxK
U2 - 10.1007/978-981-99-2385-4_10
DO - 10.1007/978-981-99-2385-4_10
M3 - Conference contribution
AN - SCOPUS:85161153255
SN - 9789819923847
T3 - Communications in Computer and Information Science
SP - 133
EP - 147
BT - Computer Supported Cooperative Work and Social Computing - 17th CCF Conference, ChineseCSCW 2022, Revised Selected Papers
A2 - Sun, Yuqing
A2 - Lu, Tun
A2 - Guo, Yinzhang
A2 - Song, Xiaoxia
A2 - Fan, Hongfei
A2 - Liu, Dongning
A2 - Gao, Liping
A2 - Du, Bowen
PB - Springer Science and Business Media Deutschland GmbH
T2 - 17th CCF Conference on Computer Supported Cooperative Work and Social Computing, ChineseCSCW 2022
Y2 - 25 November 2022 through 27 November 2022
ER -