TY - JOUR
T1 - Extending Q-learning to continuous and mixed strategy games based on spatial reciprocity
AU - Wang, Lu
AU - Zhang, Long
AU - Liu, Yang
AU - Wang, Zhen
N1 - Publisher Copyright:
© 2023 The Author(s).
PY - 2023/6/28
Y1 - 2023/6/28
N2 - The discrete strategy game, in which agents can only choose cooperation or defection, has received considerable attention. However, this assumption seems implausible in the real world, where choices may be continuous or mixed. Furthermore, when applying Q-learning to continuous or mixed strategy games, one challenge is that the learning space grows drastically as the number of states and actions increases. In this article, we therefore redesign the Q-learning method by incorporating spatial reciprocity, in which agents interact only with their four neighbours to obtain rewards and learn actions by taking their neighbours' strategies into account. As a result, the learning state and action space is transformed into a 5×5 table that stores the states and actions of the focal agent and its four neighbours, avoiding the curse of dimensionality caused by a continuous or mixed strategy game. The numerical simulation results reveal striking differences among the three classes of games. In particular, the discrete strategy game is more sensitive to the settings of the relevant parameters, whereas the other two strategy games are relatively stable. Moreover, in terms of promoting cooperation, a mixed strategy game always outperforms a continuous one.
AB - The discrete strategy game, in which agents can only choose cooperation or defection, has received considerable attention. However, this assumption seems implausible in the real world, where choices may be continuous or mixed. Furthermore, when applying Q-learning to continuous or mixed strategy games, one challenge is that the learning space grows drastically as the number of states and actions increases. In this article, we therefore redesign the Q-learning method by incorporating spatial reciprocity, in which agents interact only with their four neighbours to obtain rewards and learn actions by taking their neighbours' strategies into account. As a result, the learning state and action space is transformed into a 5×5 table that stores the states and actions of the focal agent and its four neighbours, avoiding the curse of dimensionality caused by a continuous or mixed strategy game. The numerical simulation results reveal striking differences among the three classes of games. In particular, the discrete strategy game is more sensitive to the settings of the relevant parameters, whereas the other two strategy games are relatively stable. Moreover, in terms of promoting cooperation, a mixed strategy game always outperforms a continuous one.
KW - continuous strategy
KW - mixed strategy
KW - Q-learning
KW - spatial reciprocity
UR - http://www.scopus.com/inward/record.url?scp=85162183630&partnerID=8YFLogxK
U2 - 10.1098/rspa.2022.0667
DO - 10.1098/rspa.2022.0667
M3 - Article
AN - SCOPUS:85162183630
SN - 1364-5021
VL - 479
JO - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
JF - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
IS - 2274
M1 - 20220667
ER -