TY - JOUR
T1 - Research on a Transfer Reinforcement Learning Algorithm for Multi-Agent Formation Control
AU - Hu, Penglin
AU - Pan, Quan
AU - Guo, Yaning
AU - Zhao, Chunhui
N1 - Publisher Copyright:
©2023 Journal of Northwestern Polytechnical University.
PY - 2023/4
Y1 - 2023/4
N2 - Considering obstacle avoidance and collision avoidance for multi-agent cooperative formation in a multi-obstacle environment, a formation control algorithm based on transfer learning and reinforcement learning is proposed. Firstly, in the source task learning stage, the large storage space required by a Q-table solution is avoided by using a value function approximation method, which effectively reduces the storage requirement and improves the solving speed of the algorithm. Secondly, in the target task learning stage, a Gaussian clustering algorithm is used to classify the source tasks. According to the distance between each clustering center and the target task, the optimal source task class is selected for target task learning, which effectively avoids the negative transfer phenomenon and improves the generalization ability and convergence speed of the reinforcement learning algorithm. Finally, simulation results show that this method can effectively form and maintain the formation configuration of a multi-agent system in a complex environment with obstacles, while realizing obstacle avoidance and collision avoidance.
AB - Considering obstacle avoidance and collision avoidance for multi-agent cooperative formation in a multi-obstacle environment, a formation control algorithm based on transfer learning and reinforcement learning is proposed. Firstly, in the source task learning stage, the large storage space required by a Q-table solution is avoided by using a value function approximation method, which effectively reduces the storage requirement and improves the solving speed of the algorithm. Secondly, in the target task learning stage, a Gaussian clustering algorithm is used to classify the source tasks. According to the distance between each clustering center and the target task, the optimal source task class is selected for target task learning, which effectively avoids the negative transfer phenomenon and improves the generalization ability and convergence speed of the reinforcement learning algorithm. Finally, simulation results show that this method can effectively form and maintain the formation configuration of a multi-agent system in a complex environment with obstacles, while realizing obstacle avoidance and collision avoidance.
KW - formation control
KW - Gaussian clustering
KW - multi-agent system
KW - transfer reinforcement learning
KW - value function approximation
UR - http://www.scopus.com/inward/record.url?scp=85162921022&partnerID=8YFLogxK
U2 - 10.1051/jnwpu/20234120389
DO - 10.1051/jnwpu/20234120389
M3 - Article
AN - SCOPUS:85162921022
SN - 1000-2758
VL - 41
SP - 389
EP - 399
JO - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
JF - Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
IS - 2
ER -