Multi-vehicle Flocking Control with Deep Deterministic Policy Gradient Method

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

37 Scopus citations

Abstract

Flocking control has been studied extensively alongside the wide application of multi-vehicle systems. In this paper, flocking control of a multi-vehicle system (MVS) with collision avoidance and communication preservation is considered within a deep reinforcement learning framework. Specifically, the deep deterministic policy gradient (DDPG) algorithm with centralized training and distributed execution is implemented to obtain the flocking control policy. First, to handle observations whose dimension changes with the number of visible neighbors, a three-layer tensor-based representation of the observation is used so that the state dimension remains constant even as the number of observed vehicles varies. A reward function is then designed to guide waypoint tracking, collision avoidance, and communication preservation, and is augmented with the local reward functions of neighboring agents. Finally, a centralized training process trains a shared policy on a common training set gathered from all agents. The proposed method is tested in simulated scenarios with different setups.
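The abstract names two concrete mechanisms: a fixed-size, three-layer tensor observation that is invariant to the number of visible neighbors, and a local reward augmented by the local rewards of neighbors. The sketch below illustrates both ideas in Python under stated assumptions; the grid size, collision and communication radii, penalty weights, the neighbor-weighting factor beta, and all function names are illustrative, as the abstract does not specify the paper's actual encoding or reward coefficients.

```python
import numpy as np

# Illustrative constants; the paper's actual values are not given in the abstract.
GRID = 16          # cells per side of the egocentric observation grid
FOV = 10.0         # distance covered by the grid in each direction
D_COLLIDE = 1.0    # assumed collision radius
D_COMM = 8.0       # assumed communication range

def observe(own_pos, neighbor_pos, waypoint):
    """Encode an agent's surroundings as a (3, GRID, GRID) tensor.

    Channel 0: occupancy of neighboring vehicles.
    Channel 1: the current waypoint.
    Channel 2: the agent itself (grid centre).
    The tensor shape is constant no matter how many neighbors
    fall inside the field of view.
    """
    obs = np.zeros((3, GRID, GRID), dtype=np.float32)

    def to_cell(p):
        rel = (np.asarray(p) - np.asarray(own_pos)) / FOV   # in [-1, 1] if in view
        idx = ((rel + 1.0) / 2.0 * (GRID - 1)).astype(int)
        return idx if np.all((idx >= 0) & (idx < GRID)) else None

    for n in neighbor_pos:
        c = to_cell(n)
        if c is not None:
            obs[0, c[1], c[0]] = 1.0
    c = to_cell(waypoint)
    if c is not None:
        obs[1, c[1], c[0]] = 1.0
    obs[2, GRID // 2, GRID // 2] = 1.0
    return obs

def local_reward(own_pos, neighbor_pos, waypoint):
    """Combine the three objectives the abstract lists."""
    # Waypoint tracking: penalize distance to the current waypoint.
    r = -np.linalg.norm(np.asarray(waypoint) - np.asarray(own_pos))
    for n in neighbor_pos:
        d = np.linalg.norm(np.asarray(n) - np.asarray(own_pos))
        if d < D_COLLIDE:
            r -= 10.0        # collision penalty (illustrative weight)
        elif d > D_COMM:
            r -= 5.0         # communication-loss penalty (illustrative weight)
    return r

def augmented_reward(i, positions, waypoints, neighbors_of, beta=0.5):
    """Own local reward plus a weighted mean of neighbors' local rewards."""
    own = local_reward(positions[i],
                       [positions[j] for j in neighbors_of[i]], waypoints[i])
    if not neighbors_of[i]:
        return own
    nb = np.mean([
        local_reward(positions[j],
                     [positions[k] for k in neighbors_of[j]], waypoints[j])
        for j in neighbors_of[i]
    ])
    return own + beta * nb
```

In the centralized-training, distributed-execution setup the abstract describes, experience tuples from all agents would be pooled into one common replay set to train a single shared DDPG actor-critic, which each vehicle then executes locally from its own tensor observation.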

Original language: English
Title of host publication: 2018 IEEE 14th International Conference on Control and Automation, ICCA 2018
Publisher: IEEE Computer Society
Pages: 306-311
Number of pages: 6
ISBN (Print): 9781538660898
DOIs
State: Published - 21 Aug 2018
Event: 14th IEEE International Conference on Control and Automation, ICCA 2018 - Anchorage, United States
Duration: 12 Jun 2018 - 15 Jun 2018

Publication series

Name: IEEE International Conference on Control and Automation, ICCA
Volume: 2018-June
ISSN (Print): 1948-3449
ISSN (Electronic): 1948-3457

Conference

Conference: 14th IEEE International Conference on Control and Automation, ICCA 2018
Country/Territory: United States
City: Anchorage
Period: 12/06/18 - 15/06/18
