TY - JOUR
T1 - Target Tracking Control of UAV Through Deep Reinforcement Learning
AU - Ma, Bodi
AU - Liu, Zhenbao
AU - Zhao, Wen
AU - Yuan, Jinbiao
AU - Long, Hao
AU - Wang, Xiao
AU - Yuan, Zhirong
N1 - Publisher Copyright: © 2000-2011 IEEE.
PY - 2023/6/1
Y1 - 2023/6/1
AB - This study presents an innovative reinforcement-learning-based control algorithm for a vertical take-off and landing (VTOL) aircraft under wind disturbances. In our approach, the tracking control problem of the VTOL aircraft is formulated as a Markov decision process, and the appropriate system state, reward function, and soft update method are presented. To improve the control accuracy under wind disturbances, three kinds of wind fields were added in the learning environment to expand the exploration space and simulate the effect of wind disturbances on the flight control. Moreover, to ensure the tracking accuracy and the practical implementation, a quantum-inspired experience replay strategy was developed based on quantum computation theory. In this strategy, the preparation operation scheme was designed to encourage the exploration and speed up the convergence. The depreciation operation method was developed to enrich the sample diversity, which increased the robustness of the controller and allowed the control strategy learned in the numerical simulations to be directly transferred into real-world environments. Numerical simulations, hardware-in-the-loop experiments, and real-world flight experiments were conducted to evaluate the performance and merits of the proposed method. The results demonstrated high accuracy and effectiveness and good robustness of the proposed control algorithm in terms of standoff target tracking control and flight stability.
KW - Unmanned aerial vehicles
KW - intelligent control system
KW - reinforcement learning
KW - target tracking control
UR - http://www.scopus.com/inward/record.url?scp=85149845072&partnerID=8YFLogxK
U2 - 10.1109/TITS.2023.3249900
DO - 10.1109/TITS.2023.3249900
M3 - Article
AN - SCOPUS:85149845072
SN - 1524-9050
VL - 24
SP - 5983
EP - 6000
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 6
ER -