A Deep Reinforcement Learning Based Leader-Follower Control Policy for Swarm Systems

Di Cui, Huiping Li, Rizhong Wang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper addresses the learning-based control problem for large-scale robotic swarm systems, in which a single leader herds a swarm of followers into a target distribution. We use a mean-field model to describe the spatio-temporal evolution of the probability density of the follower swarm: the physical space is divided into several bins, and the leader's control policy depends only on the density distribution over these bins. The designed control policy therefore avoids the computational burden that would otherwise grow with the number of follower agents N. A deep reinforcement learning (DRL) algorithm is designed to learn the leader control policy and to accommodate variations in the follower density. It is verified that the proposed control policy is much more efficient than existing methods in terms of control performance and training time.
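The binning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D workspace, the regular grid, and the function name `bin_density` are assumptions made here for illustration only. It shows that the observation passed to a leader policy has a length set by the number of bins, not by the number of followers N.

```python
import numpy as np


def bin_density(positions, bounds, bins_per_axis):
    """Empirical follower density over a regular grid of bins.

    positions: (N, 2) array of follower coordinates (assumed 2-D workspace).
    bounds: ((xmin, xmax), (ymin, ymax)) limits of the workspace.
    bins_per_axis: number of bins along each axis (assumed regular grid).
    Returns a flat vector of length bins_per_axis**2 that sums to 1.
    """
    counts, _, _ = np.histogram2d(
        positions[:, 0], positions[:, 1],
        bins=bins_per_axis, range=bounds,
    )
    total = counts.sum()
    if total == 0:
        raise ValueError("no follower lies inside the workspace bounds")
    # Normalise counts to fractions so the result is a discrete distribution.
    return (counts / total).ravel()


rng = np.random.default_rng(0)
workspace = ((0.0, 1.0), (0.0, 1.0))

# Two swarms of very different size yield observations of identical shape.
small_swarm = bin_density(rng.uniform(0, 1, size=(100, 2)), workspace, 4)
large_swarm = bin_density(rng.uniform(0, 1, size=(100_000, 2)), workspace, 4)
print(small_swarm.shape, large_swarm.shape)
```

Under these assumptions, a policy network that takes this vector as input needs no change when the swarm grows; only the histogram computation scales with N.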

Original language: English
Title of host publication: Intelligent Networked Things - 5th China Conference, CINT 2022, Revised Selected Papers
Editors: Lin Zhang, Wensheng Yu, Haijun Jiang, Yuanjun Laili
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 269-280
Number of pages: 12
ISBN (Print): 9789811989148
DOIs
State: Published - 2022
Event: 5th China Conference on Intelligent Networked Things, CINT 2022 - Virtual, Online
Duration: 7 Aug 2022 to 8 Aug 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1714 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 5th China Conference on Intelligent Networked Things, CINT 2022
City: Virtual, Online
Period: 7/08/22 to 8/08/22

Keywords

  • Deep reinforcement learning (DRL)
  • Leader-follower control
  • Mean-field model
  • Swarm systems
