A Method for Solving Reconfiguration Blueprints Based on Multi-Agent Reinforcement Learning

Jing Cheng; Wen Tan; Guangzhe Lv; Guodong Li; Wentao Zhang; Zihao Liu

doi:10.2298/CSIS231129035C

School of Cybersecurity

Research output: Contribution to journal › Article › peer-review

Abstract

Integrated modular avionics systems achieve fault tolerance primarily by reconfiguring their system configuration blueprints. When reconfiguration is designed manually, blueprint quality is influenced by various unstable factors, which introduces a degree of uncertainty. The effectiveness of a reconfiguration blueprint depends on several factors, including load balancing, the impact of the reconfiguration, and the time the process requires. Finding high-quality reconfiguration blueprints can therefore be regarded as a multi-objective optimization problem, a class of problems for which traditional algorithms have limitations. Multi-Agent Reinforcement Learning (MARL), an important branch of machine learning, enables more complex tasks to be accomplished in dynamic real-world scenarios through interaction and decision-making. Combining MARL algorithms with reconfiguration techniques to generate blueprints can improve blueprint quality in multiple respects. This paper proposes an Improved Value-Decomposition Networks (VDN) algorithm based on the average sequential cumulative reward. By refining the characteristics of the integrated modular avionics system, mathematical models are developed for both the system and the reconfiguration blueprint. The Improved VDN algorithm shows better convergence and optimization results than traditional reinforcement learning algorithms such as Q-learning, Deep Q-Network (DQN), and VDN, as confirmed by experiments involving single and continuous faults.
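
For readers unfamiliar with VDN, the following is a minimal sketch of the standard additive value decomposition that the baseline VDN algorithm is known for: the joint action value is the sum of the per-agent action values. It does not reproduce the paper's Improved VDN or its average sequential cumulative reward, which the abstract does not specify. The function names and the Q-value numbers are hypothetical and chosen only for illustration.

```python
def joint_q_value(per_agent_q, actions):
    """Standard VDN mixing: Q_tot is the sum of each agent's Q-value
    for the action that agent selected."""
    return sum(q[a] for q, a in zip(per_agent_q, actions))


def greedy_joint_action(per_agent_q):
    """Because Q_tot is a plain sum, maximizing it over joint actions
    reduces to each agent maximizing its own Q-values independently."""
    return [max(range(len(q)), key=q.__getitem__) for q in per_agent_q]


# Hypothetical Q-values for three agents with two actions each
# (illustrative numbers, not taken from the paper).
per_agent_q = [[0.2, 0.7], [1.0, 0.1], [0.4, 0.5]]

actions = greedy_joint_action(per_agent_q)
q_tot = joint_q_value(per_agent_q, actions)
print(actions, round(q_tot, 3))
```

The additive form is what allows decentralized action selection: no agent needs to see the others' Q-values to pick its part of the greedy joint action.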

Original language	English
Pages (from-to)	1335-1357
Number of pages	23
Journal	Computer Science and Information Systems
Volume	21
Issue number	4
DOIs	https://doi.org/10.2298/CSIS231129035C
State	Published - Sep 2024

Keywords

Integrated modular avionics system
Multi-Agent Reinforcement Learning
multi-objective optimization problem
reconfiguration blueprint

Cite this

@article{f04ed5d5a833431bae3cee904a949fc8,

title = "A Method for Solving Reconfiguration Blueprints Based on Multi-Agent Reinforcement Learning",

keywords = "Integrated modular avionics system, Multi-Agent Reinforcement Learning, multi-objective optimization problem, reconfiguration blueprint",

author = "Jing Cheng and Wen Tan and Guangzhe Lv and Guodong Li and Wentao Zhang and Zihao Liu",

year = "2024",

month = sep,

doi = "10.2298/CSIS231129035C",

language = "English",

volume = "21",

pages = "1335--1357",

journal = "Computer Science and Information Systems",

issn = "1820-0214",

publisher = "ComSIS Consortium",

number = "4",

}

TY - JOUR

T1 - A Method for Solving Reconfiguration Blueprints Based on Multi-Agent Reinforcement Learning

AU - Cheng, Jing

AU - Tan, Wen

AU - Lv, Guangzhe

AU - Li, Guodong

AU - Zhang, Wentao

AU - Liu, Zihao

PY - 2024/9

Y1 - 2024/9

KW - Integrated modular avionics system

KW - Multi-Agent Reinforcement Learning

KW - multi-objective optimization problem

KW - reconfiguration blueprint

UR - http://www.scopus.com/inward/record.url?scp=85207391925&partnerID=8YFLogxK

U2 - 10.2298/CSIS231129035C

DO - 10.2298/CSIS231129035C

M3 - Article

AN - SCOPUS:85207391925

SN - 1820-0214

VL - 21

SP - 1335

EP - 1357

JO - Computer Science and Information Systems

JF - Computer Science and Information Systems

IS - 4

ER -
