A multi-agent reinforcement learning method with curriculum transfer for large-scale dynamic traffic signal control

Xuesi Li; Jingchen Li; Haobin Shi

doi:10.1007/s10489-023-04652-y

School of Computer Science

Northwestern Polytechnical University, Xi'an

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Using reinforcement learning to control traffic signal systems has been discussed in recent years, but most works focus on simple scenarios such as a single crossroads, and methods aimed at large-scale traffic scenarios suffer from long training times and suboptimal results. In this work, we develop a new multi-agent reinforcement learning model for large-scale traffic signal control tasks, together with a curriculum transfer learning method that optimizes the joint policy step by step. The policies for different intersections are trained in a partially observable Markov decision process under a centralized training and decentralized execution mechanism, and we design attention-based transformer modules for both the policy and evaluation networks. We first train policies in a simple traffic scenario; these policies are then transferred to the next curriculum by policy reloading, while the experiences of the source task are reused selectively. As the number of agents increases, our method achieves satisfactory performance quickly by reusing the knowledge from previous curriculums. We conduct several experiments on the CityFlow testbed. In the case of more than 10 crossroads, our model improves the mean reward from 3.0 to 5.0.
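
The transfer step described in the abstract (policy reloading plus selective reuse of source-task experiences) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary-based policy and transition formats, the rule that extra agents start from a copy of a randomly chosen source policy, and the reward-ranking selection rule are all assumptions made for illustration, since the abstract does not specify them.

```python
import copy
import random


def reload_policies(source_policies, num_target_agents, rng):
    """Initialise target-stage policies from source-stage ones.

    Assumption (not from the paper): agents are homogeneous, so agents that
    only exist in the larger scenario start from a copy of a randomly chosen
    source policy.
    """
    target = []
    for i in range(num_target_agents):
        if i < len(source_policies):
            src = source_policies[i]
        else:
            src = rng.choice(source_policies)
        # Deep copy so later training does not modify the source policy.
        target.append(copy.deepcopy(src))
    return target


def select_experiences(source_buffer, keep_fraction):
    """Keep the highest-reward fraction of source transitions.

    Assumption (not from the paper): reward ranking stands in for whatever
    selection criterion the authors actually use.
    """
    ranked = sorted(source_buffer, key=lambda t: t["reward"], reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


# Hypothetical two-agent source stage transferred to a four-agent stage.
rng = random.Random(0)
source_policies = [{"w": [0.1, 0.2]}, {"w": [0.3, 0.4]}]
source_buffer = [
    {"obs": i, "action": i % 2, "reward": float(i)} for i in range(10)
]

target_policies = reload_policies(source_policies, num_target_agents=4, rng=rng)
target_buffer = select_experiences(source_buffer, keep_fraction=0.3)
```

In this sketch the target stage starts with four policies and a replay buffer seeded with the three highest-reward source transitions; a real curriculum would then continue training in the larger scenario before moving to the next stage.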

Original language	English
Pages (from-to)	21433-21447
Number of pages	15
Journal	Applied Intelligence
Volume	53
Issue number	18
DOIs	https://doi.org/10.1007/s10489-023-04652-y
State	Published - Sep 2023

Keywords

Curriculum learning
Reinforcement learning
Traffic signal control

Access to Document

10.1007/s10489-023-04652-y

Cite this

@article{8731a01447254220906dcbde219f80cd,

title = "A multi-agent reinforcement learning method with curriculum transfer for large-scale dynamic traffic signal control",

abstract = "Using reinforcement learning to control traffic signal systems has been discussed in recent years, but most works focus on simple scenarios such as a single crossroads, and methods aimed at large-scale traffic scenarios suffer from long training times and suboptimal results. In this work, we develop a new multi-agent reinforcement learning model for large-scale traffic signal control tasks, together with a curriculum transfer learning method that optimizes the joint policy step by step. The policies for different intersections are trained in a partially observable Markov decision process under a centralized training and decentralized execution mechanism, and we design attention-based transformer modules for both the policy and evaluation networks. We first train policies in a simple traffic scenario; these policies are then transferred to the next curriculum by policy reloading, while the experiences of the source task are reused selectively. As the number of agents increases, our method achieves satisfactory performance quickly by reusing the knowledge from previous curriculums. We conduct several experiments on the CityFlow testbed. In the case of more than 10 crossroads, our model improves the mean reward from 3.0 to 5.0.",

keywords = "Curriculum learning, Reinforcement learning, Traffic signal control",

author = "Xuesi Li and Jingchen Li and Haobin Shi",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.",

year = "2023",

month = sep,

doi = "10.1007/s10489-023-04652-y",

language = "English",

volume = "53",

pages = "21433--21447",

journal = "Applied Intelligence",

issn = "0924-669X",

publisher = "Springer Netherlands",

number = "18",

}

TY - JOUR

T1 - A multi-agent reinforcement learning method with curriculum transfer for large-scale dynamic traffic signal control

AU - Li, Xuesi

AU - Li, Jingchen

AU - Shi, Haobin

PY - 2023/9

Y1 - 2023/9

N2 - Using reinforcement learning to control traffic signal systems has been discussed in recent years, but most works focus on simple scenarios such as a single crossroads, and methods aimed at large-scale traffic scenarios suffer from long training times and suboptimal results. In this work, we develop a new multi-agent reinforcement learning model for large-scale traffic signal control tasks, together with a curriculum transfer learning method that optimizes the joint policy step by step. The policies for different intersections are trained in a partially observable Markov decision process under a centralized training and decentralized execution mechanism, and we design attention-based transformer modules for both the policy and evaluation networks. We first train policies in a simple traffic scenario; these policies are then transferred to the next curriculum by policy reloading, while the experiences of the source task are reused selectively. As the number of agents increases, our method achieves satisfactory performance quickly by reusing the knowledge from previous curriculums. We conduct several experiments on the CityFlow testbed. In the case of more than 10 crossroads, our model improves the mean reward from 3.0 to 5.0.

AB - Using reinforcement learning to control traffic signal systems has been discussed in recent years, but most works focus on simple scenarios such as a single crossroads, and methods aimed at large-scale traffic scenarios suffer from long training times and suboptimal results. In this work, we develop a new multi-agent reinforcement learning model for large-scale traffic signal control tasks, together with a curriculum transfer learning method that optimizes the joint policy step by step. The policies for different intersections are trained in a partially observable Markov decision process under a centralized training and decentralized execution mechanism, and we design attention-based transformer modules for both the policy and evaluation networks. We first train policies in a simple traffic scenario; these policies are then transferred to the next curriculum by policy reloading, while the experiences of the source task are reused selectively. As the number of agents increases, our method achieves satisfactory performance quickly by reusing the knowledge from previous curriculums. We conduct several experiments on the CityFlow testbed. In the case of more than 10 crossroads, our model improves the mean reward from 3.0 to 5.0.

KW - Curriculum learning

KW - Reinforcement learning

KW - Traffic signal control

UR - http://www.scopus.com/inward/record.url?scp=85160823915&partnerID=8YFLogxK

U2 - 10.1007/s10489-023-04652-y

DO - 10.1007/s10489-023-04652-y

M3 - Article

AN - SCOPUS:85160823915

SN - 0924-669X

VL - 53

SP - 21433

EP - 21447

JO - Applied Intelligence

JF - Applied Intelligence

IS - 18

ER -
