TY - JOUR
T1 - Deep Reinforcement Learning of UAV Tracking Control under Wind Disturbances Environments
AU - Ma, Bodi
AU - Liu, Zhenbao
AU - Dang, Qingqing
AU - Zhao, Wen
AU - Wang, Jingyan
AU - Cheng, Yao
AU - Yuan, Zhirong
N1 - Publisher Copyright:
© 1963-2012 IEEE.
PY - 2023
Y1 - 2023
N2 - To address the strong nonlinearity, strong coupling, and unknown disturbances encountered in the flight control of unmanned aerial vehicles (UAVs) in complex dynamic environments, as well as the limited generalization of reinforcement-learning-based algorithms, this study presents an innovative incremental reinforcement-learning-based algorithm for UAV tracking control in a dynamic environment. The main goal is to enable a UAV to adjust its control policy in a dynamic environment. The UAV tracking control task is formulated as a Markov decision process (MDP) and further investigated using an incremental reinforcement-learning-based method. First, a policy relief (PR) method enables the UAV to perform appropriate exploration in a new environment; in this way, the controller can mitigate the conflict between the new environment and its current knowledge and thus adapt better to dynamic conditions. In addition, a significance weighting (SW) method is developed to improve the utilization of episodes with higher importance and richer information: learning episodes that contain more useful information are assigned higher importance weights. Numerical simulations, hardware-in-the-loop (HITL) experiments, and real-world flight experiments are conducted to evaluate the proposed method. The results demonstrate the high accuracy, effectiveness, and robustness of the proposed control algorithm in a dynamic flight environment.
AB - To address the strong nonlinearity, strong coupling, and unknown disturbances encountered in the flight control of unmanned aerial vehicles (UAVs) in complex dynamic environments, as well as the limited generalization of reinforcement-learning-based algorithms, this study presents an innovative incremental reinforcement-learning-based algorithm for UAV tracking control in a dynamic environment. The main goal is to enable a UAV to adjust its control policy in a dynamic environment. The UAV tracking control task is formulated as a Markov decision process (MDP) and further investigated using an incremental reinforcement-learning-based method. First, a policy relief (PR) method enables the UAV to perform appropriate exploration in a new environment; in this way, the controller can mitigate the conflict between the new environment and its current knowledge and thus adapt better to dynamic conditions. In addition, a significance weighting (SW) method is developed to improve the utilization of episodes with higher importance and richer information: learning episodes that contain more useful information are assigned higher importance weights. Numerical simulations, hardware-in-the-loop (HITL) experiments, and real-world flight experiments are conducted to evaluate the proposed method. The results demonstrate the high accuracy, effectiveness, and robustness of the proposed control algorithm in a dynamic flight environment.
KW - Dynamic environment
KW - reinforcement learning
KW - tracking control
KW - unmanned aerial vehicles (UAVs)
KW - wind disturbances
UR - http://www.scopus.com/inward/record.url?scp=85153334461&partnerID=8YFLogxK
U2 - 10.1109/TIM.2023.3265741
DO - 10.1109/TIM.2023.3265741
M3 - Article
AN - SCOPUS:85153334461
SN - 0018-9456
VL - 72
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 2510913
ER -