Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games

Litong Fan; Dengxiu Yu; Zhen Wang

doi:10.1109/TCSS.2024.3409833

Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games

Litong Fan, Dengxiu Yu, Zhen Wang

光电与智能研究院

Northwestern Polytechnical University Xian

科研成果: 期刊稿件 › 文章 › 同行评审

2 引用（Scopus）

摘要

This article presents a framework for exploring optimal evolutionary strategies in continuous-action social dilemma games with a hierarchical structure comprising a leader and multifollowers. Previous studies in game theory have frequently overlooked the hierarchical structure among individuals, assuming that decisions are made simultaneously. Here, we propose a hierarchical structure for continuous action games that involves a leader and followers to enhance cooperation. The optimal evolutionary strategy for the leader is to guide the followers' actions to maximize overall benefits by exerting minimal control, while the followers aim to maximize their payoff by making minimal changes to their strategies. We establish the coupled Hamilton-Jacobi-Bellman (HJB) equations to find the optimal evolutionary strategy. To address the complexity of asymmetric roles arising from the leader-follower structure, we introduce an integral reinforcement learning (RL) algorithm known as two-level heuristic dynamic programming (HDP)-based value iteration (VI). The implementation of the algorithm utilizes neural networks (NNs) to approximate the value functions. Moreover, the convergence of the proposed algorithm is demonstrated. Additionally, three social dilemma models are presented to validate the efficacy of the proposed algorithm.

源语言	英语
页（从-至）	6807-6818
页数	12
期刊	IEEE Transactions on Computational Social Systems
卷	11
期	5
DOI	https://doi.org/10.1109/TCSS.2024.3409833
出版状态	已出版 - 2024

访问文件

10.1109/TCSS.2024.3409833

其它文件与链接

链接到 Scopus 的出版物

引用此

@article{be46bc5c1ae8450589d385dc3aacf7f3,

title = "Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games",

abstract = "This article presents a framework for exploring optimal evolutionary strategies in continuous-action social dilemma games with a hierarchical structure comprising a leader and multifollowers. Previous studies in game theory have frequently overlooked the hierarchical structure among individuals, assuming that decisions are made simultaneously. Here, we propose a hierarchical structure for continuous action games that involves a leader and followers to enhance cooperation. The optimal evolutionary strategy for the leader is to guide the followers' actions to maximize overall benefits by exerting minimal control, while the followers aim to maximize their payoff by making minimal changes to their strategies. We establish the coupled Hamilton-Jacobi-Bellman (HJB) equations to find the optimal evolutionary strategy. To address the complexity of asymmetric roles arising from the leader-follower structure, we introduce an integral reinforcement learning (RL) algorithm known as two-level heuristic dynamic programming (HDP)-based value iteration (VI). The implementation of the algorithm utilizes neural networks (NNs) to approximate the value functions. Moreover, the convergence of the proposed algorithm is demonstrated. Additionally, three social dilemma models are presented to validate the efficacy of the proposed algorithm.",

keywords = "Hamilton-Jacobi-Bellman (HJB), hierarchical, integral reinforcement learning, social dilemma, value iteration (VI)",

author = "Litong Fan and Dengxiu Yu and Zhen Wang",

note = "Publisher Copyright: {\textcopyright} 2014 IEEE.",

year = "2024",

doi = "10.1109/TCSS.2024.3409833",

language = "英语",

volume = "11",

pages = "6807--6818",

journal = "IEEE Transactions on Computational Social Systems",

issn = "2329-924X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "5",

}

TY - JOUR

T1 - Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games

AU - Fan, Litong

AU - Yu, Dengxiu

AU - Wang, Zhen

PY - 2024

Y1 - 2024

N2 - This article presents a framework for exploring optimal evolutionary strategies in continuous-action social dilemma games with a hierarchical structure comprising a leader and multifollowers. Previous studies in game theory have frequently overlooked the hierarchical structure among individuals, assuming that decisions are made simultaneously. Here, we propose a hierarchical structure for continuous action games that involves a leader and followers to enhance cooperation. The optimal evolutionary strategy for the leader is to guide the followers' actions to maximize overall benefits by exerting minimal control, while the followers aim to maximize their payoff by making minimal changes to their strategies. We establish the coupled Hamilton-Jacobi-Bellman (HJB) equations to find the optimal evolutionary strategy. To address the complexity of asymmetric roles arising from the leader-follower structure, we introduce an integral reinforcement learning (RL) algorithm known as two-level heuristic dynamic programming (HDP)-based value iteration (VI). The implementation of the algorithm utilizes neural networks (NNs) to approximate the value functions. Moreover, the convergence of the proposed algorithm is demonstrated. Additionally, three social dilemma models are presented to validate the efficacy of the proposed algorithm.

AB - This article presents a framework for exploring optimal evolutionary strategies in continuous-action social dilemma games with a hierarchical structure comprising a leader and multifollowers. Previous studies in game theory have frequently overlooked the hierarchical structure among individuals, assuming that decisions are made simultaneously. Here, we propose a hierarchical structure for continuous action games that involves a leader and followers to enhance cooperation. The optimal evolutionary strategy for the leader is to guide the followers' actions to maximize overall benefits by exerting minimal control, while the followers aim to maximize their payoff by making minimal changes to their strategies. We establish the coupled Hamilton-Jacobi-Bellman (HJB) equations to find the optimal evolutionary strategy. To address the complexity of asymmetric roles arising from the leader-follower structure, we introduce an integral reinforcement learning (RL) algorithm known as two-level heuristic dynamic programming (HDP)-based value iteration (VI). The implementation of the algorithm utilizes neural networks (NNs) to approximate the value functions. Moreover, the convergence of the proposed algorithm is demonstrated. Additionally, three social dilemma models are presented to validate the efficacy of the proposed algorithm.

KW - Hamilton-Jacobi-Bellman (HJB)

KW - hierarchical

KW - integral reinforcement learning

KW - social dilemma

KW - value iteration (VI)

UR - http://www.scopus.com/inward/record.url?scp=85206212687&partnerID=8YFLogxK

U2 - 10.1109/TCSS.2024.3409833

DO - 10.1109/TCSS.2024.3409833

M3 - 文章

AN - SCOPUS:85206212687

SN - 2329-924X

VL - 11

SP - 6807

EP - 6818

JO - IEEE Transactions on Computational Social Systems

JF - IEEE Transactions on Computational Social Systems

IS - 5

ER -

Integral-Reinforcement-Learning-Based Hierarchical Optimal Evolutionary Strategy for Continuous Action Social Dilemma Games

摘要

访问文件

其它文件与链接

指纹

引用此