Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games

Litong Fan, Dengxiu Yu, Zhen Wang

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

This article presents a framework for exploring optimal evolutionary strategies in continuous-action social dilemma games with a hierarchical structure comprising one leader and multiple followers. Previous studies in game theory have frequently overlooked the hierarchical structure among individuals, assuming instead that all decisions are made simultaneously. Here, we propose a leader-follower hierarchy for continuous-action games to enhance cooperation. The leader's optimal evolutionary strategy is to guide the followers' actions so as to maximize overall benefits while exerting minimal control; each follower aims to maximize its own payoff while making minimal changes to its strategy. We establish coupled Hamilton-Jacobi-Bellman (HJB) equations whose solution yields the optimal evolutionary strategy. To address the asymmetric roles that arise from the leader-follower structure, we introduce an integral reinforcement learning (RL) algorithm, two-level heuristic dynamic programming (HDP)-based value iteration (VI). The algorithm is implemented with neural networks (NNs) that approximate the value functions, and its convergence is demonstrated. Three social dilemma models are presented to validate the efficacy of the proposed algorithm.
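The integral RL value iteration named in the abstract can be illustrated on a much simpler problem. The sketch below is not the article's two-level leader-follower algorithm: it assumes a single agent, a scalar linear system dx/dt = a*x + b*u, a quadratic cost to be minimized (the article maximizes benefits), and a value function V(x) = p*x^2 in place of an NN approximator. All names and parameter values are hypothetical, and the integral over each interval is evaluated in closed form as a stand-in for measured trajectory data.

```python
import math


def irl_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0, T=0.1, iters=300):
    """Integral-RL value iteration on dx/dt = a*x + b*u with running cost
    q*x^2 + r*u^2 (illustrative assumption, not the article's model).

    The value estimate is V_k(x) = p_k * x^2 and the update is
        V_{k+1}(x(t)) = integral_t^{t+T} cost dtau + V_k(x(t+T)),
    evaluated along the trajectory generated by the policy greedy w.r.t. V_k.
    """
    p = 0.0  # start from V_0 = 0
    for _ in range(iters):
        gain = b * p / r           # greedy policy: u = -gain * x
        a_c = a - b * gain         # closed-loop dynamics: dx/dt = a_c * x
        # integral_0^T exp(2*a_c*s) ds, using the limit T when a_c is ~0
        if abs(a_c) < 1e-12:
            integral = T
        else:
            integral = math.expm1(2.0 * a_c * T) / (2.0 * a_c)
        decay = math.exp(2.0 * a_c * T)  # equals x(t+T)^2 / x(t)^2
        # accumulated cost over [t, t+T] plus the old value at x(t+T)
        p = (q + r * gain ** 2) * integral + p * decay
    return p


def riccati_solution(a, b, q, r):
    """Positive root of (b^2/r)*p^2 - 2*a*p - q = 0, the fixed point that
    the iteration above should approach."""
    return (a + math.sqrt(a * a + b * b * q / r)) * r / (b * b)


if __name__ == "__main__":
    print("value iteration:", irl_value_iteration())
    print("Riccati root:   ", riccati_solution(1.0, 1.0, 1.0, 1.0))
```

In this toy setting the iterate can be checked against the algebraic Riccati solution. In the article's setting the leader and followers each carry their own value function, and the coupled HJB equations replace the single scalar equation used here.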

Original language: English
Pages (from-to): 6807-6818
Number of pages: 12
Journal: IEEE Transactions on Computational Social Systems
Volume: 11
Issue number: 5
State: Published - 2024

Keywords

  • Hamilton-Jacobi-Bellman (HJB)
  • hierarchical
  • integral reinforcement learning
  • social dilemma
  • value iteration (VI)
