An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning

Dingwei Wu; Kaifang Wan; Jianqiang Tang; Xiaoguang Gao; Yiwei Zhai; Zhaohui Qi

doi:10.1109/ICCRE55123.2022.9770236

School of Electronics and Information

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

Autonomous navigation is a key technology of multi-UAV systems, and deep reinforcement learning can endow UAVs with powerful autonomous decision-making capabilities. To improve the convergence speed and stability of reinforcement learning, this paper proposes a multi-agent deep deterministic policy gradient algorithm based on prioritized experience replay, namely PER-MADDPG. This algorithm makes the samples with higher priority have a higher probability of being chosen for the parameter update, which can speed up the algorithm convergence. Moreover, the actions of UAVs are generated utilizing parameter noise, which can improve the stability and robustness of the algorithm. Experiments show that PER-MADDPG has fast convergence speed and good convergence results, and has excellent autonomous navigation capabilities.
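The two mechanisms named in the abstract can be illustrated with a minimal sketch. It assumes the common proportional form of prioritized experience replay (sampling probability proportional to priority raised to a power `alpha`, with importance weights controlled by `beta`) and plain Gaussian noise added to actor weights. The class name, function name, and every hyperparameter value here are hypothetical; the record does not state which variant or settings PER-MADDPG actually uses.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # assumed exponent: 0 gives uniform sampling
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so they are
        # likely to be replayed at least once before being re-scored.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Higher-priority samples are drawn with higher probability.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance weights offset the bias from non-uniform sampling;
        # they are normalised by the largest weight in the batch.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority is taken to be the absolute TD error (an assumption).
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps


def perturb_parameters(params, sigma, rng=random):
    """Return a noisy copy of a flat list of actor weights (parameter noise)."""
    return [w + rng.gauss(0.0, sigma) for w in params]
```

In a MADDPG-style loop, each agent's critic update would scale its per-sample loss by the returned weights and then call `update_priorities` with the new TD errors, while actions during exploration would come from an actor whose weights were passed through `perturb_parameters`.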

Original language	English
Title of host publication	2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	96-101
Number of pages	6
ISBN (Electronic)	9781665468404
DOIs	https://doi.org/10.1109/ICCRE55123.2022.9770236
State	Published - 2022
Event	7th International Conference on Control and Robotics Engineering, ICCRE 2022 - Beijing, China
Duration	15 Apr 2022 → 17 Apr 2022

Publication series

Name	2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022

Conference

Conference	7th International Conference on Control and Robotics Engineering, ICCRE 2022
Country/Territory	China
City	Beijing
Period	15/04/22 → 17/04/22

Keywords

autonomous navigation
MADDPG
multi-UAV
prioritized experience replay
reinforcement learning

Access to Document

10.1109/ICCRE55123.2022.9770236

Cite this

Wu, D., Wan, K., Tang, J., Gao, X., Zhai, Y., & Qi, Z. (2022). An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning. In 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022 (pp. 96-101). (2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCRE55123.2022.9770236

Wu, Dingwei ; Wan, Kaifang ; Tang, Jianqiang et al. / An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning. 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022. Institute of Electrical and Electronics Engineers Inc., 2022. pp. 96-101 (2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022).

@inproceedings{4ed132725e9f4b9c9932e246734503df,

title = "An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning",

abstract = "Autonomous navigation is a key technology of multi-UAV systems, and deep reinforcement learning can endow UAVs with powerful autonomous decision-making capabilities. To improve the convergence speed and stability of reinforcement learning, this paper proposes a multi-agent deep deterministic policy gradient algorithm based on prioritized experience replay, namely PER-MADDPG. This algorithm makes the samples with higher priority have a higher probability of being chosen for the parameter update, which can speed up the algorithm convergence. Moreover, the actions of UAVs are generated utilizing parameter noise, which can improve the stability and robustness of the algorithm. Experiments show that PER-MADDPG has fast convergence speed and good convergence results, and has excellent autonomous navigation capabilities.",

keywords = "autonomous navigation, MADDPG, multi-UAV, prioritized experience replay, reinforcement learning",

author = "Dingwei Wu and Kaifang Wan and Jianqiang Tang and Xiaoguang Gao and Yiwei Zhai and Zhaohui Qi",

note = "Publisher Copyright: {\textcopyright} 2022 IEEE.; 7th International Conference on Control and Robotics Engineering, ICCRE 2022 ; Conference date: 15-04-2022 Through 17-04-2022",

year = "2022",

doi = "10.1109/ICCRE55123.2022.9770236",

language = "English",

series = "2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "96--101",

booktitle = "2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022",

}

Wu, D, Wan, K, Tang, J, Gao, X, Zhai, Y & Qi, Z 2022, An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning. in 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022. 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022, Institute of Electrical and Electronics Engineers Inc., pp. 96-101, 7th International Conference on Control and Robotics Engineering, ICCRE 2022, Beijing, China, 15/04/22. https://doi.org/10.1109/ICCRE55123.2022.9770236

TY - GEN

T1 - An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning

AU - Wu, Dingwei

AU - Wan, Kaifang

AU - Tang, Jianqiang

AU - Gao, Xiaoguang

AU - Zhai, Yiwei

AU - Qi, Zhaohui

PY - 2022

Y1 - 2022

AB - Autonomous navigation is a key technology of multi-UAV systems, and deep reinforcement learning can endow UAVs with powerful autonomous decision-making capabilities. To improve the convergence speed and stability of reinforcement learning, this paper proposes a multi-agent deep deterministic policy gradient algorithm based on prioritized experience replay, namely PER-MADDPG. This algorithm makes the samples with higher priority have a higher probability of being chosen for the parameter update, which can speed up the algorithm convergence. Moreover, the actions of UAVs are generated utilizing parameter noise, which can improve the stability and robustness of the algorithm. Experiments show that PER-MADDPG has fast convergence speed and good convergence results, and has excellent autonomous navigation capabilities.

KW - autonomous navigation

KW - MADDPG

KW - multi-UAV

KW - prioritized experience replay

KW - reinforcement learning

UR - http://www.scopus.com/inward/record.url?scp=85130622450&partnerID=8YFLogxK

U2 - 10.1109/ICCRE55123.2022.9770236

DO - 10.1109/ICCRE55123.2022.9770236

M3 - Conference contribution

AN - SCOPUS:85130622450

T3 - 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022

SP - 96

EP - 101

BT - 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 7th International Conference on Control and Robotics Engineering, ICCRE 2022

Y2 - 15 April 2022 through 17 April 2022

ER -

Wu D, Wan K, Tang J, Gao X, Zhai Y, Qi Z. An Improved Method towards Multi-UAV Autonomous Navigation Using Deep Reinforcement Learning. In 2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022. Institute of Electrical and Electronics Engineers Inc. 2022. p. 96-101. (2022 7th International Conference on Control and Robotics Engineering, ICCRE 2022). doi: 10.1109/ICCRE55123.2022.9770236
