Deep Reinforcement Learning of UAV Tracking Control under Wind Disturbances Environments

Bodi Ma; Zhenbao Liu; Qingqing Dang; Wen Zhao; Jingyan Wang; Yao Cheng; Zhirong Yuan

doi:10.1109/TIM.2023.3265741

School of Civil Aviation

Research output: Contribution to journal › Article › peer-review

80 Scopus citations

Abstract

To address the strong nonlinearity, strong coupling, and unknown interference encountered in the flight control of unmanned aerial vehicles (UAVs) in complex dynamic environments, as well as the generalization problem of reinforcement-learning-based algorithms, this study presents an incremental reinforcement-learning-based algorithm for UAV tracking control in a dynamic environment. The main goal is to enable a UAV to adjust its control policy as the environment changes. The UAV tracking control task is formulated as a Markov decision process (MDP) and investigated using an incremental reinforcement-learning-based method. First, a policy relief (PR) method is used so that the UAV can explore appropriately in a new environment. In this way, the UAV controller can mitigate the conflict between the new environment and its current knowledge, which improves adaptability to a dynamic environment. In addition, a significance weighting (SW) method is developed to improve the utilization of episodes with higher importance and richer information: learning episodes that contain more useful information are assigned higher importance weights. Numerical simulations, hardware-in-the-loop (HITL) experiments, and real-world flight experiments are conducted to evaluate the performance of the proposed method. The results demonstrate the high accuracy, effectiveness, and good robustness of the proposed control algorithm in a dynamic flight environment.

Original language	English
Article number	2510913
Journal	IEEE Transactions on Instrumentation and Measurement
Volume	72
DOIs	https://doi.org/10.1109/TIM.2023.3265741
State	Published - 2023

Keywords

Dynamic environment
reinforcement learning
tracking control
unmanned aerial vehicles (UAVs)
wind disturbances

Cite this

@article{8e38db9b422344e5a79070a61caabdae,

title = "Deep Reinforcement Learning of UAV Tracking Control under Wind Disturbances Environments",

abstract = "Aiming at the problems of strong nonlinearity, strong coupling, and unknown interference encountered in the flight control process of unmanned aerial vehicles (UAVs) in a complex dynamic environment and reinforcement-learning-based algorithm generalization, this study presents an innovative incremental reinforcement-learning-based algorithm for UAV tracking control in a dynamic environment. The main goal is to make a UAV able to adjust its control policy in a dynamic environment. The UAV tracking control task is transformed into a Markov decision process (MDP) and further investigated using an incremental reinforcement-learning-based method. First, a policy relief (PR) method is used to make UAVs capable of performing an appropriate exploration in a new environment. In this way, a UAV controller can mitigate the conflict between a new environment and the current knowledge to ensure better adaptability to a dynamic environment. In addition, a significance weighting (SW) method is developed to improve the utilization of episodes with higher importance and richer information. In the proposed method, learning episodes that include more useful information are assigned with higher importance weights. The numerical simulation, hardware-in-the-loop (HITL) experiments, and real-world flight experiments are conducted to evaluate the performance of the proposed method. The results demonstrate high accuracy and effectiveness and good robustness of the proposed control algorithm in a dynamic flight environment.",

keywords = "Dynamic environment, reinforcement learning, tracking control, unmanned aerial vehicles (UAVs), wind disturbances",

author = "Bodi Ma and Zhenbao Liu and Qingqing Dang and Wen Zhao and Jingyan Wang and Yao Cheng and Zhirong Yuan",

note = "Publisher Copyright: {\textcopyright} 1963-2012 IEEE.",

year = "2023",

doi = "10.1109/TIM.2023.3265741",

language = "English",

volume = "72",

journal = "IEEE Transactions on Instrumentation and Measurement",

issn = "0018-9456",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Deep Reinforcement Learning of UAV Tracking Control under Wind Disturbances Environments

AU - Ma, Bodi

AU - Liu, Zhenbao

AU - Dang, Qingqing

AU - Zhao, Wen

AU - Wang, Jingyan

AU - Cheng, Yao

AU - Yuan, Zhirong

PY - 2023

Y1 - 2023

N2 - Aiming at the problems of strong nonlinearity, strong coupling, and unknown interference encountered in the flight control process of unmanned aerial vehicles (UAVs) in a complex dynamic environment and reinforcement-learning-based algorithm generalization, this study presents an innovative incremental reinforcement-learning-based algorithm for UAV tracking control in a dynamic environment. The main goal is to make a UAV able to adjust its control policy in a dynamic environment. The UAV tracking control task is transformed into a Markov decision process (MDP) and further investigated using an incremental reinforcement-learning-based method. First, a policy relief (PR) method is used to make UAVs capable of performing an appropriate exploration in a new environment. In this way, a UAV controller can mitigate the conflict between a new environment and the current knowledge to ensure better adaptability to a dynamic environment. In addition, a significance weighting (SW) method is developed to improve the utilization of episodes with higher importance and richer information. In the proposed method, learning episodes that include more useful information are assigned with higher importance weights. The numerical simulation, hardware-in-the-loop (HITL) experiments, and real-world flight experiments are conducted to evaluate the performance of the proposed method. The results demonstrate high accuracy and effectiveness and good robustness of the proposed control algorithm in a dynamic flight environment.

AB - Aiming at the problems of strong nonlinearity, strong coupling, and unknown interference encountered in the flight control process of unmanned aerial vehicles (UAVs) in a complex dynamic environment and reinforcement-learning-based algorithm generalization, this study presents an innovative incremental reinforcement-learning-based algorithm for UAV tracking control in a dynamic environment. The main goal is to make a UAV able to adjust its control policy in a dynamic environment. The UAV tracking control task is transformed into a Markov decision process (MDP) and further investigated using an incremental reinforcement-learning-based method. First, a policy relief (PR) method is used to make UAVs capable of performing an appropriate exploration in a new environment. In this way, a UAV controller can mitigate the conflict between a new environment and the current knowledge to ensure better adaptability to a dynamic environment. In addition, a significance weighting (SW) method is developed to improve the utilization of episodes with higher importance and richer information. In the proposed method, learning episodes that include more useful information are assigned with higher importance weights. The numerical simulation, hardware-in-the-loop (HITL) experiments, and real-world flight experiments are conducted to evaluate the performance of the proposed method. The results demonstrate high accuracy and effectiveness and good robustness of the proposed control algorithm in a dynamic flight environment.

KW - Dynamic environment

KW - reinforcement learning

KW - tracking control

KW - unmanned aerial vehicles (UAVs)

KW - wind disturbances

UR - http://www.scopus.com/inward/record.url?scp=85153334461&partnerID=8YFLogxK

U2 - 10.1109/TIM.2023.3265741

DO - 10.1109/TIM.2023.3265741

M3 - Article

AN - SCOPUS:85153334461

SN - 0018-9456

VL - 72

JO - IEEE Transactions on Instrumentation and Measurement

JF - IEEE Transactions on Instrumentation and Measurement

M1 - 2510913

ER -
