TY - GEN
T1 - Root cause analysis and proactive problem prediction for self-healing
AU - Shunshan, Piao
AU - Jeongmin, Park
AU - Eunseok, Lee
PY - 2007
Y1 - 2007
N2 - With the rapid evolution of distributed computing systems, problem determination techniques face growing demands to support highly automated system control and management, a capability known as self-healing. Artificial intelligence approaches are widely used in fault management. In this paper, we propose an approach to fault management for self-healing systems that learns from and analyzes real-time information to provide both root cause analysis and proactive problem prediction. Using a Bayesian network, we describe a complex system as a compact model that represents the probabilistic dependency relationships between the various factors in the domain. We also provide an improved process that preprocesses the collected parameters, which enhances learning efficiency and reduces learning time. To estimate efficiency and accuracy, an experimental demonstration based on system performance measurements is implemented and evaluated through several comparisons, which indicate that the approach is promising.
AB - With the rapid evolution of distributed computing systems, problem determination techniques face growing demands to support highly automated system control and management, a capability known as self-healing. Artificial intelligence approaches are widely used in fault management. In this paper, we propose an approach to fault management for self-healing systems that learns from and analyzes real-time information to provide both root cause analysis and proactive problem prediction. Using a Bayesian network, we describe a complex system as a compact model that represents the probabilistic dependency relationships between the various factors in the domain. We also provide an improved process that preprocesses the collected parameters, which enhances learning efficiency and reduces learning time. To estimate efficiency and accuracy, an experimental demonstration based on system performance measurements is implemented and evaluated through several comparisons, which indicate that the approach is promising.
UR - https://www.scopus.com/pages/publications/49049101876
U2 - 10.1109/ICCIT.2007.4420561
DO - 10.1109/ICCIT.2007.4420561
M3 - Conference contribution
AN - SCOPUS:49049101876
SN - 0769530389
SN - 9780769530383
T3 - 2007 International Conference on Convergence Information Technology, ICCIT 2007
SP - 2085
EP - 2090
BT - 2007 International Conference on Convergence Information Technology, ICCIT 2007
T2 - 2nd International Conference on Convergent Information Technology, ICCIT 07
Y2 - 21 November 2007 through 23 November 2007
ER -