TY - JOUR
T1 - Optimal Evolution Strategy for Continuous Strategy Games on Complex Networks via Reinforcement Learning
AU - Fan, Litong
AU - Yu, Dengxiu
AU - Cheong, Kang Hao
AU - Wang, Zhen
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - This article presents an optimal evolution strategy for continuous strategy games on complex networks via reinforcement learning (RL). Evolutionary game theory has traditionally assumed that agents use the same selection intensity when interacting, ignoring differences in their learning abilities and willingness to learn; moreover, individuals are reluctant to make large changes to their strategies. We therefore design an adaptive strategy updating framework with varying selection intensities for continuous strategy games on complex networks, based on imitation dynamics, which allows agents to reach the optimal state and a higher level of cooperation with minimal strategy changes. The optimal updating strategy is obtained from a coupled Hamilton-Jacobi-Bellman (HJB) equation by minimizing a performance function that maximizes individual payoffs while minimizing strategy changes. Furthermore, a value iteration (VI) RL algorithm is proposed to approximate the HJB solutions and learn the optimal strategy updating rules. The algorithm employs actor and critic neural networks to approximate the strategy changes and performance functions, with weights updated by gradient descent. The stability and convergence of the proposed methods are proven via a designed Lyapunov function. Simulations validate the convergence and effectiveness of the proposed methods on different games and complex networks.
KW - Continuous strategy games
KW - evolutionary dynamic
KW - Hamilton-Jacobi-Bellman (HJB)
KW - reinforcement learning (RL)
KW - strategy updating rules
UR - http://www.scopus.com/inward/record.url?scp=85204966895&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2024.3453385
DO - 10.1109/TNNLS.2024.3453385
M3 - Article
AN - SCOPUS:85204966895
SN - 2162-237X
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
ER -