基于 SAC 的无人机自主导航方法研究

Kai Kou; Gang Yang; Wenqi Zhang; Xincheng Liu; Yuan Yao; Xingshe Zhou

doi:10.1051/jnwpu/20244220310

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Existing deep reinforcement learning algorithms for UAV autonomous navigation suffer from partial observability of the environment and insufficient perceptual information. This paper investigates UAV autonomous navigation in unknown environments using the stochastic-policy soft actor-critic (SAC) reinforcement learning model. Specifically, it proposes a policy network with a memory enhancement mechanism that combines historical memory information with current observations to extract the temporal dependencies between states, which improves state estimation under partially observable conditions and keeps the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to ease the difficulty of policy convergence under sparse reward conditions. Finally, the method is trained and validated on several complex scenarios in the AirSim+UE4 simulation platform. The experimental results show that the proposed method achieves a navigation success rate 10% higher than that of the benchmark algorithm and an average flight distance 21% shorter, which effectively improves the stability and convergence of the UAV autonomous navigation algorithm.

Translated title of the contribution	Exploring UAV autonomous navigation algorithm based on soft actor-critic
Original language	Chinese (Traditional)
Pages (from-to)	310-318
Number of pages	9
Journal	Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
Volume	42
Issue number	2
DOI	https://doi.org/10.1051/jnwpu/20244220310
Publication status	Published - Apr 2024

Keywords

autonomous navigation
deep reinforced learning
soft actor-critic
unmanned aerial vehicle

Cite this

@article{b6de6d07aead4378b8de58e75fb3da89,

title = "基于 SAC 的无人机自主导航方法研究",

abstract = "The existing deep reinforced learning algorithms cannot see local environments and have insufficient perceptual information on UAV autonomous navigation tasks. The paper investigates the UAV′s autonomous navigation tasks in its unknown environments based on the nondeterministic policy soft actor-critic (SAC) reinforced learning model. Specifically, the paper proposes a policy network based on a memory enhancement mechanism, which integrates the historical memory information processing with current observations to extract the temporal dependency of the statements so as to enhance the state estimation ability under locally observable conditions and avoid the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to reduce the challenge of the reinforced learning strategy to converge under sparse reward conditions. Finally, several complex scenarios are trained and validated in the Airsim+UE4 simulation platform. The experimental results show that the proposed method has a navigation success rate 10% higher than that of the benchmark algorithm and that the average flight distance is 21% shorter, which effectively enhances the stability and convergence of the UAV autonomous navigation algorithm.",

keywords = "autonomous navigation, deep reinforced learning, soft actor-critic, unmanned aerial vehicle",

author = "Kai Kou and Gang Yang and Wenqi Zhang and Xincheng Liu and Yuan Yao and Xingshe Zhou",

note = "Publisher Copyright: {\textcopyright}2024 Journal of Northwestern Polytechnical University.",

year = "2024",

month = apr,

doi = "10.1051/jnwpu/20244220310",

language = "Chinese (Traditional)",

volume = "42",

pages = "310--318",

journal = "Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University",

issn = "1000-2758",

publisher = "Northwestern Polytechnical University",

number = "2",

}

TY - JOUR

T1 - 基于 SAC 的无人机自主导航方法研究

AU - Kou, Kai

AU - Yang, Gang

AU - Zhang, Wenqi

AU - Liu, Xincheng

AU - Yao, Yuan

AU - Zhou, Xingshe

PY - 2024/4

Y1 - 2024/4

N2 - The existing deep reinforced learning algorithms cannot see local environments and have insufficient perceptual information on UAV autonomous navigation tasks. The paper investigates the UAV′s autonomous navigation tasks in its unknown environments based on the nondeterministic policy soft actor-critic (SAC) reinforced learning model. Specifically, the paper proposes a policy network based on a memory enhancement mechanism, which integrates the historical memory information processing with current observations to extract the temporal dependency of the statements so as to enhance the state estimation ability under locally observable conditions and avoid the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to reduce the challenge of the reinforced learning strategy to converge under sparse reward conditions. Finally, several complex scenarios are trained and validated in the Airsim+UE4 simulation platform. The experimental results show that the proposed method has a navigation success rate 10% higher than that of the benchmark algorithm and that the average flight distance is 21% shorter, which effectively enhances the stability and convergence of the UAV autonomous navigation algorithm.

AB - The existing deep reinforced learning algorithms cannot see local environments and have insufficient perceptual information on UAV autonomous navigation tasks. The paper investigates the UAV′s autonomous navigation tasks in its unknown environments based on the nondeterministic policy soft actor-critic (SAC) reinforced learning model. Specifically, the paper proposes a policy network based on a memory enhancement mechanism, which integrates the historical memory information processing with current observations to extract the temporal dependency of the statements so as to enhance the state estimation ability under locally observable conditions and avoid the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to reduce the challenge of the reinforced learning strategy to converge under sparse reward conditions. Finally, several complex scenarios are trained and validated in the Airsim+UE4 simulation platform. The experimental results show that the proposed method has a navigation success rate 10% higher than that of the benchmark algorithm and that the average flight distance is 21% shorter, which effectively enhances the stability and convergence of the UAV autonomous navigation algorithm.

KW - autonomous navigation

KW - deep reinforced learning

KW - soft actor-critic

KW - unmanned aerial vehicle

UR - http://www.scopus.com/inward/record.url?scp=85193525713&partnerID=8YFLogxK

U2 - 10.1051/jnwpu/20244220310

DO - 10.1051/jnwpu/20244220310

M3 - Article

AN - SCOPUS:85193525713

SN - 1000-2758

VL - 42

SP - 310

EP - 318

JO - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University

JF - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University

IS - 2

ER -
