Hierarchical Optimization Design for Autonomous Flight of Vision-Based Quadrotor Using Reinforcement Learning

Quan Yong Fan; Jiaxuan Li; Tianxin Liu; Bin Xu

doi:10.1109/TCYB.2025.3562800

School of Automation

Research output: Contribution to journal › Article › peer-review

Abstract

Although quadrotors are widely used in practical engineering, their autonomous flight ability in complex operating environments still needs improvement. This article investigates the autonomous flight problem for a quadrotor with monocular vision, dividing it into a control layer and a decision layer. Reinforcement learning is used for hierarchical optimization to ensure that the quadrotor completes narrow-space traversal tasks safely and efficiently. First, considering the dynamic characteristics of the quadrotor with motor speed as the control input, a parallel policy iteration algorithm is designed for the nonaffine nonlinear system, and the proposed controller can be learned online to improve the fundamental control performance. On this basis, the autonomous decision problem with visual information as input is modeled as a Markov decision process, and a curriculum learning mechanism is introduced to overcome the difficulties caused by sparse rewards. At the same time, the clipping function is optimized to improve the learning efficiency of the proximal policy optimization (PPO) algorithm for autonomous flight. Finally, the effectiveness of the proposed intelligent control and decision methods is verified through simulation.
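
For orientation, the clipping that the abstract refers to can be illustrated with the standard PPO clipped surrogate objective. The abstract does not say how the authors modify the clipping function, so the sketch below shows only the textbook baseline and is not the paper's method. The function name `ppo_clip_objective`, its parameters, and the default `epsilon=0.2` are assumptions chosen for illustration.

```python
def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Batch mean of the standard PPO clipped surrogate objective.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    epsilon:    clip range (0.2 is an assumed illustrative default)

    Baseline PPO only; the paper's optimized clipping function is not
    described in the abstract and is not reproduced here.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - epsilon, 1 + epsilon].
        clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

The objective is maximized during policy updates. Taking the minimum removes the incentive to push the ratio beyond the clip range in the direction that the advantage favors, which is the mechanism a modified clipping function would adjust.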

Original language	English
Journal	IEEE Transactions on Cybernetics
DOI	https://doi.org/10.1109/TCYB.2025.3562800
Publication status	Accepted/In press - 2025

Cite this

@article{51d24013c1ed43428cc37ff0923a9aa1,

title = "Hierarchical Optimization Design for Autonomous Flight of Vision-Based Quadrotor Using Reinforcement Learning",

abstract = "Although quadrotor has been widely used in practical engineering, its autonomous flight ability needs to be improved in complex operating environment. The autonomous flight problem for quadrotor with monocular vision is investigated, which is divided into control layer and decision layer in this article. Reinforcement learning method is utilized for hierarchical optimization to ensure that quadrotor completes narrow space traversal tasks safely and efficiently. First, considering the dynamic characteristics of the quadrotor with the motor speed as the control input, a parallel policy iteration algorithm is designed for the nonaffine nonlinear system, and the proposed controller can be learned online to improve the fundamental control performance. On this basis, the autonomous decision problem with visual information as input is modeled as a Markov decision process, and a curriculum learning mechanism is introduced to overcome the difficulties caused by sparse reward. At the same time, the clipping function is optimized to improve the learning efficiency of proximal policy optimization (PPO) algorithm for autonomous flight capabilities. Finally, the effectiveness of the proposed intelligent control and decision methods are verified through simulation.",

keywords = "Autonomous flight, off-policy, optimal control, quadrotor, reinforcement learning",

author = "Fan, {Quan Yong} and Jiaxuan Li and Tianxin Liu and Bin Xu",

note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",

year = "2025",

doi = "10.1109/TCYB.2025.3562800",

language = "English",

journal = "IEEE Transactions on Cybernetics",

issn = "2168-2267",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Hierarchical Optimization Design for Autonomous Flight of Vision-Based Quadrotor Using Reinforcement Learning

AU - Fan, Quan Yong

AU - Li, Jiaxuan

AU - Liu, Tianxin

AU - Xu, Bin

PY - 2025

Y1 - 2025

N2 - Although quadrotor has been widely used in practical engineering, its autonomous flight ability needs to be improved in complex operating environment. The autonomous flight problem for quadrotor with monocular vision is investigated, which is divided into control layer and decision layer in this article. Reinforcement learning method is utilized for hierarchical optimization to ensure that quadrotor completes narrow space traversal tasks safely and efficiently. First, considering the dynamic characteristics of the quadrotor with the motor speed as the control input, a parallel policy iteration algorithm is designed for the nonaffine nonlinear system, and the proposed controller can be learned online to improve the fundamental control performance. On this basis, the autonomous decision problem with visual information as input is modeled as a Markov decision process, and a curriculum learning mechanism is introduced to overcome the difficulties caused by sparse reward. At the same time, the clipping function is optimized to improve the learning efficiency of proximal policy optimization (PPO) algorithm for autonomous flight capabilities. Finally, the effectiveness of the proposed intelligent control and decision methods are verified through simulation.

AB - Although quadrotor has been widely used in practical engineering, its autonomous flight ability needs to be improved in complex operating environment. The autonomous flight problem for quadrotor with monocular vision is investigated, which is divided into control layer and decision layer in this article. Reinforcement learning method is utilized for hierarchical optimization to ensure that quadrotor completes narrow space traversal tasks safely and efficiently. First, considering the dynamic characteristics of the quadrotor with the motor speed as the control input, a parallel policy iteration algorithm is designed for the nonaffine nonlinear system, and the proposed controller can be learned online to improve the fundamental control performance. On this basis, the autonomous decision problem with visual information as input is modeled as a Markov decision process, and a curriculum learning mechanism is introduced to overcome the difficulties caused by sparse reward. At the same time, the clipping function is optimized to improve the learning efficiency of proximal policy optimization (PPO) algorithm for autonomous flight capabilities. Finally, the effectiveness of the proposed intelligent control and decision methods are verified through simulation.

KW - Autonomous flight

KW - off-policy

KW - optimal control

KW - quadrotor

KW - reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=105004324146&partnerID=8YFLogxK

U2 - 10.1109/TCYB.2025.3562800

DO - 10.1109/TCYB.2025.3562800

M3 - Article

AN - SCOPUS:105004324146

SN - 2168-2267

JO - IEEE Transactions on Cybernetics

JF - IEEE Transactions on Cybernetics

ER -
