Prioritized Experience Replay-Based Path Planning Algorithm for Multiple UAVs

Chongde Ren; Jinchao Chen; Chenglie Du

doi:10.1155/2024/1809850

School of Computer Science

Northwestern Polytechnical University Xian

Research output: Contribution to journal › Article › peer-review

Abstract

Unmanned aerial vehicles (UAVs) have been extensively researched and deployed in both military and civilian applications owing to their small size, low cost, and ease of use. Although UAVs working together on complex tasks can significantly increase productivity and reduce costs, coordinating them makes path planning considerably harder. In complex environments, path planning is a multiconstraint combinatorial optimization problem that is hard to solve: it must account for numerous constraints while generating the best path for each UAV to accomplish the group task. In this paper, we study the path planning problem for multiple UAVs and propose a reinforcement learning algorithm, PERDE-MADDPG, based on prioritized experience replay (PER) and delayed updates. First, we adopt a PER mechanism based on temporal difference (TD) error to make more efficient use of stored experience and accelerate convergence. Second, we delay the updates of network parameters to stabilize the training of multiple agents. Finally, we evaluate PERDE-MADDPG against the MATD3, MADDPG, and SAC methods in simulation scenarios to confirm its efficacy.
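
The two mechanisms named in the abstract, TD-error-based prioritized replay and delayed updates, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names, the exponents `alpha` and `beta`, the constant `eps`, and the `policy_delay` value are all assumptions, chosen to follow the commonly used proportional PER formulation and a TD3-style update delay.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay: P(i) = p_i**alpha / sum_k p_k**alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha  # assumed value; 0 would give uniform sampling
        self.eps = eps      # assumed value; keeps zero-error transitions sampleable
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so that each
        # one has a good chance of being replayed before its TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idx = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        # Importance-sampling weights offset the bias of non-uniform sampling;
        # here they are normalised by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, err in zip(idx, td_errors):
            self.priorities[i] = abs(err) + self.eps


def should_update_actor(step, policy_delay=2):
    """Delayed update: refresh actor and target networks every `policy_delay` critic steps."""
    return step % policy_delay == 0
```

In a training loop of this shape, the critic would be trained on every sampled batch, the resulting TD errors would be passed to `update_priorities`, and the actor and target networks would be refreshed only on steps where `should_update_actor` returns true.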

Original language	English
Article number	1809850
Journal	International Journal of Aerospace Engineering
Volume	2024
DOIs	https://doi.org/10.1155/2024/1809850
State	Published - 2024

Cite this

@article{9c7fb24168d340899559ebc1b08ac76e,
title = "Prioritized Experience Replay-Based Path Planning Algorithm for Multiple UAVs",
abstract = "Unmanned aerial vehicles (UAVs) have been extensively researched and deployed in both military and civilian applications due to their tiny size, low cost, and great ease. Although UAVs working together on complicated jobs can significantly increase productivity and reduce costs, they can cause major issues with path planning. In complex environments, the path planning problem, which is a multiconstraint combinatorial optimization problem and hard to settle, requires considering numerous constraints and limitations and generates the best paths for each UAV to accomplish group tasks. In this paper, we study the path planning problem for multiple UAVs and propose a reinforcement learning algorithm: PERDE-MADDPG based on prioritized experience replay (PER) and delayed update skills. First, we adopt a PER mechanism based on temporal difference (TD) error to enhance the efficiency of experience utilization and accelerate the convergence speed of the algorithm. Second, we use delayed updates in the process of updating network parameters to ensure stability in training multiple agents. Finally, we propose the PERDE-MADDPG algorithm based on PER and delayed update skills, which is evaluated against the MATD3, MADDPG, and SAC methods in simulation scenarios to confirm its efficacy.",

author = "Chongde Ren and Jinchao Chen and Chenglie Du",
note = "Publisher Copyright: {\textcopyright} 2024 Chongde Ren et al.",
year = "2024",
doi = "10.1155/2024/1809850",
language = "English",
volume = "2024",
journal = "International Journal of Aerospace Engineering",
issn = "1687-5966",
publisher = "John Wiley and Sons Ltd",
}

TY - JOUR
T1 - Prioritized Experience Replay-Based Path Planning Algorithm for Multiple UAVs
AU - Ren, Chongde
AU - Chen, Jinchao
AU - Du, Chenglie
PY - 2024
Y1 - 2024
AB - Unmanned aerial vehicles (UAVs) have been extensively researched and deployed in both military and civilian applications due to their tiny size, low cost, and great ease. Although UAVs working together on complicated jobs can significantly increase productivity and reduce costs, they can cause major issues with path planning. In complex environments, the path planning problem, which is a multiconstraint combinatorial optimization problem and hard to settle, requires considering numerous constraints and limitations and generates the best paths for each UAV to accomplish group tasks. In this paper, we study the path planning problem for multiple UAVs and propose a reinforcement learning algorithm: PERDE-MADDPG based on prioritized experience replay (PER) and delayed update skills. First, we adopt a PER mechanism based on temporal difference (TD) error to enhance the efficiency of experience utilization and accelerate the convergence speed of the algorithm. Second, we use delayed updates in the process of updating network parameters to ensure stability in training multiple agents. Finally, we propose the PERDE-MADDPG algorithm based on PER and delayed update skills, which is evaluated against the MATD3, MADDPG, and SAC methods in simulation scenarios to confirm its efficacy.

UR - http://www.scopus.com/inward/record.url?scp=85201732902&partnerID=8YFLogxK
U2 - 10.1155/2024/1809850
DO - 10.1155/2024/1809850
M3 - Article
AN - SCOPUS:85201732902
SN - 1687-5966
VL - 2024
JO - International Journal of Aerospace Engineering
JF - International Journal of Aerospace Engineering
M1 - 1809850
ER -
