A Method for Solving Reconfiguration Blueprints Based on Multi-Agent Reinforcement Learning

Jing Cheng; Wen Tan; Guangzhe Lv; Guodong Li; Wentao Zhang; Zihao Liu

doi:10.2298/CSIS231129035C

School of Cybersecurity

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Integrated modular avionics systems primarily achieve system fault tolerance by reconfiguring the system configuration blueprints. When reconfiguration is designed manually, the quality of the reconfiguration blueprints is influenced by various unstable factors, leading to a certain degree of uncertainty. The effectiveness of a reconfiguration blueprint depends on several factors, including load balancing, the impact of reconfiguration, and the time required for the process. Solving for high-quality reconfiguration blueprints can therefore be regarded as a multi-objective optimization problem, and traditional algorithms have limitations in solving such problems. Multi-Agent Reinforcement Learning (MARL) is an important branch of machine learning. It enables more complex tasks to be accomplished in dynamic real-world scenarios through interaction and decision-making. Combining MARL algorithms with reconfiguration techniques to generate blueprints can optimize blueprint quality in multiple ways. In this paper, an Improved Value-Decomposition Networks (VDN) algorithm based on the average sequential cumulative reward is proposed. By refining the characteristics of the integrated modular avionics system, mathematical models are developed for both the system and the reconfiguration blueprint. The Improved VDN algorithm demonstrates superior convergence characteristics and optimization effects compared with traditional reinforcement learning algorithms such as Q-learning, Deep Q-learning Network (DQN), and VDN. This superiority has been confirmed through experiments involving single and continuous faults.

Original language	English
Pages (from-to)	1335-1357
Number of pages	23
Journal	Computer Science and Information Systems
Volume	21
Issue	4
DOI	https://doi.org/10.2298/CSIS231129035C
Publication status	Published - Sep 2024


Cite this

@article{f04ed5d5a833431bae3cee904a949fc8,

title = "A Method for Solving Reconfiguration Blueprints Based on Multi-Agent Reinforcement Learning",

abstract = "Integrated modular avionics systems primarily achieve system fault tolerance by reconfiguring the system configuration blueprints. In the design of manual reconfiguration, the quality of reconfiguration blueprints is influenced by various unstable factors, leading to a certain degree of uncertainty. The effectiveness of reconfiguration blueprints depends on various factors, including load balancing, the impact of reconfiguration, and the time required for the process. Solving high-quality reconfiguration configuration blueprints can be regarded as a type of multi-objective optimization problem. Traditional algorithms have limitations in solving multi-objective optimization problems. Multi-Agent Reinforcement Learning (MARL) is an important branch in the field of machine learning. It enables the accomplishment of more complex tasks in dynamic real-world scenarios through interaction and decision-making. Combining Multi-Agent Reinforcement Learning algorithms with reconfiguration techniques and utilizing MARL methods to generate blueprints can optimize the quality of blueprints in multiple ways. In this paper, an Improved Value-Decomposition Networks (VDN) based on the average sequential cumulative reward is proposed. By refining the characteristics of the integrated modular avionics system, mathematical models are developed for both the system and the reconfiguration blueprint. The Improved VDN algorithm demonstrates superior convergence characteristics and optimization effects compared with traditional reinforcement learning algorithms such as Q-learning, Deep Q-learning Network (DQN), and VDN. This superiority has been confirmed through experiments involving single and continuous faults.",

keywords = "Integrated modular avionics system, Multi-Agent Reinforcement Learning, multi-objective optimization problem, reconfiguration blueprint",

author = "Jing Cheng and Wen Tan and Guangzhe Lv and Guodong Li and Wentao Zhang and Zihao Liu",

year = "2024",

month = sep,

doi = "10.2298/CSIS231129035C",

language = "English",

volume = "21",

pages = "1335--1357",

journal = "Computer Science and Information Systems",

issn = "1820-0214",

publisher = "ComSIS Consortium",

number = "4",

}

TY - JOUR

T1 - A Method for Solving Reconfiguration Blueprints Based on Multi-Agent Reinforcement Learning

AU - Cheng, Jing

AU - Tan, Wen

AU - Lv, Guangzhe

AU - Li, Guodong

AU - Zhang, Wentao

AU - Liu, Zihao

PY - 2024/9

Y1 - 2024/9

AB - Integrated modular avionics systems primarily achieve system fault tolerance by reconfiguring the system configuration blueprints. In the design of manual reconfiguration, the quality of reconfiguration blueprints is influenced by various unstable factors, leading to a certain degree of uncertainty. The effectiveness of reconfiguration blueprints depends on various factors, including load balancing, the impact of reconfiguration, and the time required for the process. Solving high-quality reconfiguration configuration blueprints can be regarded as a type of multi-objective optimization problem. Traditional algorithms have limitations in solving multi-objective optimization problems. Multi-Agent Reinforcement Learning (MARL) is an important branch in the field of machine learning. It enables the accomplishment of more complex tasks in dynamic real-world scenarios through interaction and decision-making. Combining Multi-Agent Reinforcement Learning algorithms with reconfiguration techniques and utilizing MARL methods to generate blueprints can optimize the quality of blueprints in multiple ways. In this paper, an Improved Value-Decomposition Networks (VDN) based on the average sequential cumulative reward is proposed. By refining the characteristics of the integrated modular avionics system, mathematical models are developed for both the system and the reconfiguration blueprint. The Improved VDN algorithm demonstrates superior convergence characteristics and optimization effects compared with traditional reinforcement learning algorithms such as Q-learning, Deep Q-learning Network (DQN), and VDN. This superiority has been confirmed through experiments involving single and continuous faults.

KW - Integrated modular avionics system

KW - Multi-Agent Reinforcement Learning

KW - multi-objective optimization problem

KW - reconfiguration blueprint

UR - http://www.scopus.com/inward/record.url?scp=85207391925&partnerID=8YFLogxK

U2 - 10.2298/CSIS231129035C

DO - 10.2298/CSIS231129035C

M3 - Article

AN - SCOPUS:85207391925

SN - 1820-0214

VL - 21

SP - 1335

EP - 1357

JO - Computer Science and Information Systems

JF - Computer Science and Information Systems

IS - 4

ER -
