基于 SAC 的无人机自主导航方法研究

Kai Kou; Gang Yang; Wenqi Zhang; Xincheng Liu; Yuan Yao; Xingshe Zhou

doi:10.1051/jnwpu/20244220310

Translated title of the contribution: Exploring UAV autonomous navigation algorithm based on soft actor-critic

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

Existing deep reinforcement learning algorithms for UAV autonomous navigation suffer from partial observability of the environment and insufficient perceptual information. This paper investigates UAV autonomous navigation in unknown environments based on soft actor-critic (SAC), a stochastic-policy reinforcement learning model. Specifically, the paper proposes a policy network with a memory enhancement mechanism, which combines historical memory information with current observations to extract the temporal dependencies between states, thereby improving state estimation under partially observable conditions and preventing the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to ease the difficulty of policy convergence under sparse reward conditions. Finally, the method is trained and validated in several complex scenarios on the AirSim+UE4 simulation platform. The experimental results show that the proposed method achieves a navigation success rate 10% higher than that of the benchmark algorithm and an average flight distance 21% shorter, effectively improving the stability and convergence of the UAV autonomous navigation algorithm.
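The abstract does not spell out the terms of the non-sparse reward function, so the following is only a minimal sketch of what a dense (shaped) navigation reward might look like. The function name `dense_reward`, its parameters, and all weights are hypothetical choices made for illustration; they are not taken from the paper.

```python
import math


def dense_reward(prev_pos, pos, goal, collided,
                 goal_radius=1.0, progress_weight=1.0, step_penalty=0.01,
                 goal_bonus=10.0, collision_penalty=10.0):
    """Hypothetical shaped reward; every term and weight here is an assumption."""
    # Terminal case: a collision ends the episode with a fixed penalty.
    if collided:
        return -collision_penalty

    dist_now = math.dist(pos, goal)
    # Terminal case: the UAV is within the goal radius.
    if dist_now <= goal_radius:
        return goal_bonus

    # Dense term: reward the reduction in distance to the goal at every step,
    # minus a small per-step cost that favours shorter flight paths.
    dist_prev = math.dist(prev_pos, goal)
    return progress_weight * (dist_prev - dist_now) - step_penalty
```

Unlike a sparse reward that is non-zero only on success or collision, this kind of function gives the agent a learning signal at every step, which is the general idea behind reward shaping.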

Original language	Chinese (Traditional)
Pages (from-to)	310-318
Number of pages	9
Journal	Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
Volume	42
Issue number	2
DOIs	https://doi.org/10.1051/jnwpu/20244220310
State	Published - Apr 2024

Cite this

@article{b6de6d07aead4378b8de58e75fb3da89,

title = "基于 SAC 的无人机自主导航方法研究",

abstract = "Existing deep reinforcement learning algorithms for UAV autonomous navigation suffer from partial observability of the environment and insufficient perceptual information. This paper investigates UAV autonomous navigation in unknown environments based on soft actor-critic (SAC), a stochastic-policy reinforcement learning model. Specifically, the paper proposes a policy network with a memory enhancement mechanism, which combines historical memory information with current observations to extract the temporal dependencies between states, thereby improving state estimation under partially observable conditions and preventing the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to ease the difficulty of policy convergence under sparse reward conditions. Finally, the method is trained and validated in several complex scenarios on the AirSim+UE4 simulation platform. The experimental results show that the proposed method achieves a navigation success rate 10\% higher than that of the benchmark algorithm and an average flight distance 21\% shorter, effectively improving the stability and convergence of the UAV autonomous navigation algorithm.",

keywords = "autonomous navigation, deep reinforcement learning, soft actor-critic, unmanned aerial vehicle",

author = "Kai Kou and Gang Yang and Wenqi Zhang and Xincheng Liu and Yuan Yao and Xingshe Zhou",

note = "Publisher Copyright: {\textcopyright}2024 Journal of Northwestern Polytechnical University.",

year = "2024",

month = apr,

doi = "10.1051/jnwpu/20244220310",

language = "Chinese (Traditional)",

volume = "42",

pages = "310--318",

journal = "Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University",

issn = "1000-2758",

publisher = "Northwestern Polytechnical University",

number = "2",

}

TY - JOUR

T1 - 基于 SAC 的无人机自主导航方法研究

AU - Kou, Kai

AU - Yang, Gang

AU - Zhang, Wenqi

AU - Liu, Xincheng

AU - Yao, Yuan

AU - Zhou, Xingshe

PY - 2024/4

Y1 - 2024/4

N2 - Existing deep reinforcement learning algorithms for UAV autonomous navigation suffer from partial observability of the environment and insufficient perceptual information. This paper investigates UAV autonomous navigation in unknown environments based on soft actor-critic (SAC), a stochastic-policy reinforcement learning model. Specifically, the paper proposes a policy network with a memory enhancement mechanism, which combines historical memory information with current observations to extract the temporal dependencies between states, thereby improving state estimation under partially observable conditions and preventing the learning algorithm from falling into a locally optimal solution. In addition, a non-sparse reward function is designed to ease the difficulty of policy convergence under sparse reward conditions. Finally, the method is trained and validated in several complex scenarios on the AirSim+UE4 simulation platform. The experimental results show that the proposed method achieves a navigation success rate 10% higher than that of the benchmark algorithm and an average flight distance 21% shorter, effectively improving the stability and convergence of the UAV autonomous navigation algorithm.

KW - autonomous navigation

KW - deep reinforcement learning

KW - soft actor-critic

KW - unmanned aerial vehicle

UR - http://www.scopus.com/inward/record.url?scp=85193525713&partnerID=8YFLogxK

U2 - 10.1051/jnwpu/20244220310

DO - 10.1051/jnwpu/20244220310

M3 - Article

AN - SCOPUS:85193525713

SN - 1000-2758

VL - 42

SP - 310

EP - 318

JO - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University

JF - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University

IS - 2

ER -
