UAV maneuvering target tracking in uncertain environments based on deep reinforcement learning and meta-learning

Bo Li; Zhigang Gan; Daqing Chen; Dyachenko Sergey Aleksandrovich

doi:10.3390/rs12223789

School of Electronics and Information

Research output: Contribution to journal › Article › peer-review

77 Scopus citations

Abstract

This paper combines deep reinforcement learning (DRL) with meta-learning and proposes a novel approach, named meta twin delayed deep deterministic policy gradient (Meta-TD3), for controlling an unmanned aerial vehicle (UAV) so that it can quickly track a target whose motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for the multi-task learning of the DRL algorithm, and we incorporate meta-learning to develop a multi-task reinforcement learning update method to ensure the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms, namely the deep deterministic policy gradient (DDPG) and twin delayed deep deterministic policy gradient (TD3), experimental results show that the Meta-TD3 algorithm achieves a substantial improvement in both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps for a UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.
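The abstract mentions a multi-task experience replay buffer but does not describe its layout. The following is a minimal sketch of one plausible design, not the authors' implementation: it assumes one bounded FIFO buffer per task (for example, per target movement mode) with uniform sampling. The class name `MultiTaskReplayBuffer`, its methods, and the task labels are hypothetical.

```python
import random
from collections import deque


class MultiTaskReplayBuffer:
    """Illustrative sketch: one bounded FIFO buffer per task.

    Assumption (not from the paper): each task corresponds to one target
    movement mode, and transitions are sampled uniformly within a task.
    """

    def __init__(self, task_ids, capacity_per_task=10_000, seed=None):
        # deque(maxlen=...) silently drops the oldest transition when full.
        self._buffers = {t: deque(maxlen=capacity_per_task) for t in task_ids}
        self._rng = random.Random(seed)

    def add(self, task_id, state, action, reward, next_state, done):
        """Store one transition under the task it was collected in."""
        self._buffers[task_id].append((state, action, reward, next_state, done))

    def size(self, task_id):
        """Number of transitions currently stored for a task."""
        return len(self._buffers[task_id])

    def sample(self, task_id, batch_size):
        """Uniformly sample up to batch_size transitions from one task."""
        buf = self._buffers[task_id]
        k = min(batch_size, len(buf))
        return self._rng.sample(list(buf), k)

    def sample_all(self, batch_size):
        """Sample one batch per non-empty task, keyed by task id."""
        return {
            t: self.sample(t, batch_size)
            for t, buf in self._buffers.items()
            if buf
        }
```

Under these assumptions, a multi-task update would call `sample_all` once per training step and compute a loss for each task's batch, so that no single movement mode dominates the gradient.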

Original language	English
Article number	3789
Pages (from-to)	1-20
Number of pages	20
Journal	Remote Sensing
Volume	12
Issue number	22
DOIs	https://doi.org/10.3390/rs12223789
State	Published - 2 Nov 2020

Keywords

Deep reinforcement learning
Maneuvering target tracking
Meta-learning
Multi-tasks
UAV

Cite this

@article{cc76692b62cb4dcfb7a87ddcfe495eee,

title = "{UAV} maneuvering target tracking in uncertain environments based on deep reinforcement learning and meta-learning",

abstract = "This paper combines deep reinforcement learning (DRL) with meta-learning and proposes a novel approach, named meta twin delayed deep deterministic policy gradient (Meta-TD3), for controlling an unmanned aerial vehicle (UAV) so that it can quickly track a target whose motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for the multi-task learning of the DRL algorithm, and we incorporate meta-learning to develop a multi-task reinforcement learning update method to ensure the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms, namely the deep deterministic policy gradient (DDPG) and twin delayed deep deterministic policy gradient (TD3), experimental results show that the Meta-TD3 algorithm achieves a substantial improvement in both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps for a UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.",

keywords = "Deep reinforcement learning, Maneuvering target tracking, Meta-learning, Multi-tasks, UAV",

author = "Bo Li and Zhigang Gan and Daqing Chen and Aleksandrovich, {Dyachenko Sergey}",

note = "Publisher Copyright: {\textcopyright} 2020 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2020",

month = nov,

day = "2",

doi = "10.3390/rs12223789",

language = "English",

volume = "12",

pages = "1--20",

journal = "Remote Sensing",

issn = "2072-4292",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "22",

}

TY - JOUR

T1 - UAV maneuvering target tracking in uncertain environments based on deep reinforcement learning and meta-learning

AU - Li, Bo

AU - Gan, Zhigang

AU - Chen, Daqing

AU - Aleksandrovich, Dyachenko Sergey

PY - 2020/11/2

Y1 - 2020/11/2

N2 - This paper combines deep reinforcement learning (DRL) with meta-learning and proposes a novel approach, named meta twin delayed deep deterministic policy gradient (Meta-TD3), for controlling an unmanned aerial vehicle (UAV) so that it can quickly track a target whose motion is uncertain. This approach can be applied to a variety of scenarios, such as wildlife protection, emergency aid, and remote sensing. We use a multi-task experience replay buffer to provide data for the multi-task learning of the DRL algorithm, and we incorporate meta-learning to develop a multi-task reinforcement learning update method to ensure the generalization capability of reinforcement learning. Compared with the state-of-the-art algorithms, namely the deep deterministic policy gradient (DDPG) and twin delayed deep deterministic policy gradient (TD3), experimental results show that the Meta-TD3 algorithm achieves a substantial improvement in both convergence value and convergence rate. In a UAV target tracking problem, Meta-TD3 requires only a few training steps for a UAV to adapt quickly to a new target movement mode and maintain better tracking effectiveness.

KW - Deep reinforcement learning

KW - Maneuvering target tracking

KW - Meta-learning

KW - Multi-tasks

KW - UAV

UR - http://www.scopus.com/inward/record.url?scp=85096213529&partnerID=8YFLogxK

U2 - 10.3390/rs12223789

DO - 10.3390/rs12223789

M3 - Article

AN - SCOPUS:85096213529

SN - 2072-4292

VL - 12

SP - 1

EP - 20

JO - Remote Sensing

JF - Remote Sensing

IS - 22

M1 - 3789

ER -
