TY - JOUR
T1 - Extrinsic-and-Intrinsic Reward-Based Multi-Agent Reinforcement Learning for Multi-UAV Cooperative Target Encirclement
AU - Chen, Jinchao
AU - Wang, Yang
AU - Zhang, Ying
AU - Lu, Yantao
AU - Shu, Qiuhao
AU - Hu, Yujiao
N1 - Publisher Copyright: © 2000-2011 IEEE.
PY - 2025
Y1 - 2025
AB - Owing to their high flexibility and strong maneuverability, unmanned aerial vehicles (UAVs) have attracted considerable attention and are widely employed in many fields. In target encirclement applications in particular, UAVs offer clear advantages in adaptability and reliability, and can efficiently fly to and evenly surround targets in complex and dynamic environments. In this paper, we focus on the cooperative target encirclement problem for heterogeneous UAVs and propose a multi-agent reinforcement learning approach to solve it. First, using models of the heterogeneous UAVs and obstacles, we analyze the collision avoidance, motion continuity, and energy consumption constraints of the UAVs, and formulate cooperative target encirclement as a multi-constraint combinatorial optimization problem. Then, inspired by the observation that curiosity is a powerful motivator for humans to explore, discover, and acquire new knowledge, we propose an extrinsic-and-intrinsic reward-based multi-agent reinforcement learning approach that cooperatively controls the UAVs' behavior to accomplish target encirclement missions. Simulation experiments in randomly generated environments evaluate the performance of our approach, and the results show that it has a significant advantage in terms of average reward, encirclement success rate, encirclement time, and encirclement energy consumption.
KW - cooperative target encirclement
KW - extrinsic-and-intrinsic reward mechanism
KW - heterogeneous unmanned aerial vehicle
KW - multi-agent reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85216334185&partnerID=8YFLogxK
U2 - 10.1109/TITS.2024.3524562
DO - 10.1109/TITS.2024.3524562
M3 - Article
AN - SCOPUS:85216334185
SN - 1524-9050
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
ER -