TY - GEN
T1 - A similar resource auto-discovery based adaptive fault-tolerance method for embedded distributed system
AU - Zhang, Kailong
AU - Liang, Ke
AU - Zhou, Xingshe
AU - Wang, Kaibo
AU - Wu, Xiao
AU - Yang, Zhiyi
PY - 2007
Y1 - 2007
N2 - Because Embedded Distributed Systems (EDS) face resource constraints and high reliability requirements, new fault-tolerance approaches that differ from traditional hardware redundancy need to be studied. In this article, a fault-tolerance method based on similar resources, together with related technologies, is proposed and discussed. First, mathematical models of several key elements, such as computing nodes, similar nodes, and tasks, are constructed. Then, the similarity computation methods and evaluation criteria are presented from two different views: tasks and resources. Building on these models, a number of methods are described, including similar nodes auto-discovery (SNAD) and its optimized variant (oSNAD), redundant tasks auto-deployment, and reconfiguration policies for faulty tasks and nodes. Simulation results show that these approaches and schemes can improve the adaptive fault-tolerance abilities of complex embedded distributed systems.
UR - http://www.scopus.com/inward/record.url?scp=47749148398&partnerID=8YFLogxK
U2 - 10.1109/ICPPW.2007.15
DO - 10.1109/ICPPW.2007.15
M3 - Conference contribution
AN - SCOPUS:47749148398
SN - 0769529348
SN - 9780769529349
T3 - Proceedings of the International Conference on Parallel Processing Workshops
SP - 21
BT - 2007 International Conference on Parallel Processing Workshops, ICPPW
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2007 International Conference on Parallel Processing Workshops, ICPPW 2007
Y2 - 10 September 2007 through 14 September 2007
ER -