TY - GEN
T1 - Bayesian analysis for fault location in homogeneous distributed systems
AU - Chang, Yu Lo Cyrus
AU - Lander, Leslie C.
AU - Lu, Horng Shing
AU - Wells, Martin T.
PY - 1993/12/1
Y1 - 1993/12/1
N2 - We propose a simple and practical probabilistic comparison-based model, employing multiple incomplete test concepts, for handling fault location in distributed systems using a Bayesian analysis procedure. This approach is more practical and complete than previous ones since it does not assume any conditions such as permanently faulty units, complete tests, perfect environments, or non-malicious environments. Fault-free systems are handled without overhead, hence the test procedure may be used to monitor a functioning system. Given a system S with a specific test graph, the corresponding conditional distribution between the comparison test results (syndrome) and the fault patterns of S can be generated. To avoid the complex global Bayesian estimation process, we develop a simple bitwise Bayesian (B-) algorithm for fault location in S, which locates system failures with linear complexity, suitable for hard real-time systems.
UR - http://www.scopus.com/inward/record.url?scp=0027794314&partnerID=8YFLogxK
U2 - 10.1109/RELDIS.1993.393474
DO - 10.1109/RELDIS.1993.393474
M3 - Conference contribution
AN - SCOPUS:0027794314
SN - 0818643129
T3 - Proc 12th Symp Reliab Distrib Syst
SP - 44
EP - 52
BT - Proc 12th Symp Reliab Distrib Syst
PB - IEEE
T2 - Proceedings of the 12th Symposium on Reliable Distributed Systems
Y2 - 6 October 1993 through 8 October 1993
ER -