Robust motion control for UAV in dynamic uncertain environments using deep reinforcement learning

Kaifang Wan, Xiaoguang Gao, Zijian Hu, Gaofeng Wu

Research output: Contribution to journal › Article › peer-review

80 Scopus citations

Abstract

In this paper, a novel deep reinforcement learning (DRL) method, Robust deep deterministic policy gradient (Robust-DDPG), is proposed for developing a controller that allows robust flight of an unmanned aerial vehicle (UAV) in dynamic, uncertain environments. This technique is applicable in many fields, such as penetration and remote surveillance. The learning-based controller is constructed with an actor-critic framework and performs dual-channel continuous control (roll and speed) of the UAV. To overcome the fragility and volatility of the original DDPG, three critical learning tricks are introduced in Robust-DDPG: (1) a delayed-learning trick, providing stable learning in dynamic environments; (2) an adversarial-attack trick, improving the policy's adaptability to uncertain environments; and (3) a mixed-exploration trick, enabling faster convergence of the model. The training experiments show great improvement in convergence speed, convergence quality, and stability. The exploitation experiments demonstrate high efficiency in providing the UAV with shorter and smoother paths. The generalization experiments verify better adaptability to complicated, dynamic, and uncertain environments compared to the Deep Q Network (DQN) and DDPG algorithms.
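The three tricks named in the abstract can be illustrated with a minimal NumPy sketch. This is an assumed rendering, not the paper's exact formulation: the delayed-learning trick is shown as a Polyak (soft) target-network update, the adversarial-attack trick as an FGSM-style state perturbation, and the mixed-exploration trick as a blend of uniform random actions and Gaussian policy noise. All function names and hyperparameter values here are hypothetical.

```python
import numpy as np

def soft_update(target_params, online_params, tau=0.005):
    """Delayed-learning trick (assumed Polyak form): target-network
    parameters trail the online network, which stabilizes the
    bootstrapped critic targets in a changing environment."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]

def fgsm_perturb(state, loss_grad, eps=0.01):
    """Adversarial-attack trick (FGSM-style sketch): shift the observed
    state along the sign of the loss gradient so the policy is trained
    against worst-case observation noise."""
    return state + eps * np.sign(loss_grad)

def mixed_exploration(policy_action, action_low, action_high,
                      p_random=0.1, sigma=0.1, rng=None):
    """Mixed-exploration trick: with probability p_random take a uniform
    random action (coarse global exploration); otherwise add Gaussian
    noise to the deterministic policy action (fine local exploration)."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < p_random:
        return rng.uniform(action_low, action_high,
                           size=np.shape(policy_action))
    noisy = policy_action + rng.normal(0.0, sigma,
                                       size=np.shape(policy_action))
    return np.clip(noisy, action_low, action_high)
```

In a full Robust-DDPG loop these pieces would be applied per training step: perturb a fraction of sampled states, act with mixed exploration, and soft-update the actor/critic targets after each gradient step.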

Original language: English
Article number: 640
Journal: Remote Sensing
Volume: 12
Issue number: 4
DOIs
State: Published - 1 Feb 2020

Keywords

  • Adversarial attack
  • Deep reinforcement learning
  • Delayed learning
  • Mixed exploration
  • Robust motion control
  • UAV
