TY - JOUR
T1 - Prioritized Experience Replay-Based Path Planning Algorithm for Multiple UAVs
AU - Ren, Chongde
AU - Chen, Jinchao
AU - Du, Chenglie
N1 - Publisher Copyright:
© 2024 Chongde Ren et al.
PY - 2024
Y1 - 2024
N2 - Unmanned aerial vehicles (UAVs) have been extensively researched and deployed in both military and civilian applications due to their tiny size, low cost, and great ease. Although UAVs working together on complicated jobs can significantly increase productivity and reduce costs, they can cause major issues with path planning. In complex environments, the path planning problem, which is a multiconstraint combinatorial optimization problem and hard to settle, requires considering numerous constraints and limitations and generates the best paths for each UAV to accomplish group tasks. In this paper, we study the path planning problem for multiple UAVs and propose a reinforcement learning algorithm: PERDE-MADDPG based on prioritized experience replay (PER) and delayed update skills. First, we adopt a PER mechanism based on temporal difference (TD) error to enhance the efficiency of experience utilization and accelerate the convergence speed of the algorithm. Second, we use delayed updates in the process of updating network parameters to ensure stability in training multiple agents. Finally, we propose the PERDE-MADDPG algorithm based on PER and delayed update skills, which is evaluated against the MATD3, MADDPG, and SAC methods in simulation scenarios to confirm its efficacy.
UR - http://www.scopus.com/inward/record.url?scp=85201732902&partnerID=8YFLogxK
U2 - 10.1155/2024/1809850
DO - 10.1155/2024/1809850
M3 - Article
AN - SCOPUS:85201732902
SN - 1687-5966
VL - 2024
JO - International Journal of Aerospace Engineering
JF - International Journal of Aerospace Engineering
M1 - 1809850
ER -