TY - JOUR
T1 - Application of Reinforcement Learning in Deep-Stall Recovery
AU - Ming, Ruichen
AU - Liu, Xiao Xiong
AU - Xu, Xinlong
AU - Li, Yu
AU - Zhang, Weiguo
N1 - Publisher Copyright:
© 1965-2011 IEEE.
PY - 2025
Y1 - 2025
N2 - The aircraft deep-stall phenomenon occurs when the angle of attack stabilizes at a high-angle-of-attack equilibrium point. This excessive angle of attack reduces lift and degrades elevator effectiveness, making it difficult to recover the aircraft from this dangerous flight condition. Reinforcement learning offers a design approach for such complex nonlinear control problems. However, in deep-stall recovery tasks the aircraft model is highly nonlinear and the control effectiveness is substantially reduced, which limits the direct application of reinforcement learning methods. To address this problem, we conduct bifurcation and phase-plane analyses on the deep-stall model of the aircraft and use the results as domain knowledge to construct the reward function. We then apply the proximal policy optimization algorithm to learn the deep-stall recovery strategy. Finally, in simulation, we compare the proposed method, which uses reward shaping, with a reinforcement learning method without reward shaping. The results indicate that although the method without reward shaping recovers the angle of attack, the aircraft remains in an uncontrollable state, rendering the recovery unsuccessful; in contrast, the proposed method stably performs the deep-stall recovery task through a loop maneuver.
AB - The aircraft deep-stall phenomenon occurs when the angle of attack stabilizes at a high-angle-of-attack equilibrium point. This excessive angle of attack reduces lift and degrades elevator effectiveness, making it difficult to recover the aircraft from this dangerous flight condition. Reinforcement learning offers a design approach for such complex nonlinear control problems. However, in deep-stall recovery tasks the aircraft model is highly nonlinear and the control effectiveness is substantially reduced, which limits the direct application of reinforcement learning methods. To address this problem, we conduct bifurcation and phase-plane analyses on the deep-stall model of the aircraft and use the results as domain knowledge to construct the reward function. We then apply the proximal policy optimization algorithm to learn the deep-stall recovery strategy. Finally, in simulation, we compare the proposed method, which uses reward shaping, with a reinforcement learning method without reward shaping. The results indicate that although the method without reward shaping recovers the angle of attack, the aircraft remains in an uncontrollable state, rendering the recovery unsuccessful; in contrast, the proposed method stably performs the deep-stall recovery task through a loop maneuver.
KW - bifurcation analysis
KW - deep-stall recovery
KW - phase portrait analysis
KW - reinforcement learning
KW - reward shaping
UR - http://www.scopus.com/inward/record.url?scp=105003588897&partnerID=8YFLogxK
U2 - 10.1109/TAES.2025.3561100
DO - 10.1109/TAES.2025.3561100
M3 - Article
AN - SCOPUS:105003588897
SN - 0018-9251
JO - IEEE Transactions on Aerospace and Electronic Systems
JF - IEEE Transactions on Aerospace and Electronic Systems
ER -