TY - JOUR
T1 - Extending Q-learning to continuous and mixed strategy games based on spatial reciprocity
AU - Wang, Lu
AU - Zhang, Long
AU - Liu, Yang
AU - Wang, Zhen
N1 - Publisher Copyright:
© 2023 The Author(s).
PY - 2023/6/28
Y1 - 2023/6/28
N2 - The discrete strategy game, in which agents can only choose cooperation or defection, has received considerable attention. However, this assumption seems implausible in the real world, where choices may be continuous or mixed. Furthermore, when applying Q-learning to continuous or mixed strategy games, one challenge is that the learning space grows drastically as the number of states and actions increases. In this article, we therefore redesign the Q-learning method by incorporating spatial reciprocity, in which agents interact only with their four neighbours to obtain rewards and learn actions by taking their neighbours' strategies into account. As a result, the learning state and action space is transformed into a 5×5 table that stores the states and actions of the focal agent and its four neighbours, avoiding the curse of dimensionality caused by a continuous or mixed strategy game. The numerical simulation results reveal striking differences among the three classes of games. In particular, the discrete strategy game is more sensitive to the settings of the relevant parameters, whereas the other two strategy games are relatively stable. Moreover, in terms of promoting cooperation, a mixed strategy game always outperforms a continuous one.
AB - The discrete strategy game, in which agents can only choose cooperation or defection, has received considerable attention. However, this assumption seems implausible in the real world, where choices may be continuous or mixed. Furthermore, when applying Q-learning to continuous or mixed strategy games, one challenge is that the learning space grows drastically as the number of states and actions increases. In this article, we therefore redesign the Q-learning method by incorporating spatial reciprocity, in which agents interact only with their four neighbours to obtain rewards and learn actions by taking their neighbours' strategies into account. As a result, the learning state and action space is transformed into a 5×5 table that stores the states and actions of the focal agent and its four neighbours, avoiding the curse of dimensionality caused by a continuous or mixed strategy game. The numerical simulation results reveal striking differences among the three classes of games. In particular, the discrete strategy game is more sensitive to the settings of the relevant parameters, whereas the other two strategy games are relatively stable. Moreover, in terms of promoting cooperation, a mixed strategy game always outperforms a continuous one.
KW - continuous strategy
KW - mixed strategy
KW - Q-learning
KW - spatial reciprocity
UR - http://www.scopus.com/inward/record.url?scp=85162183630&partnerID=8YFLogxK
U2 - 10.1098/rspa.2022.0667
DO - 10.1098/rspa.2022.0667
M3 - Article
AN - SCOPUS:85162183630
SN - 1364-5021
VL - 479
JO - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
JF - Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences
IS - 2274
M1 - 20220667
ER -