Authors
Javier Alvarez Cid-Fuentes, Claudia Szabo, Katrina Falkner
Publication date
2018/4/2
Journal
IEEE Transactions on Dependable and Secure Computing
Volume
17
Issue
5
Pages
928-941
Publisher
IEEE
Description
Performance anomaly detection is crucial for long running, large scale distributed systems. However, existing works focus on the detection of specific types of anomalies, rely on historical failure data, and cannot adapt to changes in system behavior at run time. In this work, we propose an adaptive framework for the detection and identification of complex anomalous behaviors, such as deadlocks and livelocks, in distributed systems without historical failure data. Our framework employs a two-step process involving two online SVM classifiers on periodically collected system metrics to identify at run time normal and anomalous behaviors such as deadlock, livelock, unwanted synchronization, and memory leaks. Our approach achieves over 0.70 F-score in detecting previously unseen anomalies and 0.78 F-score in identifying the type of known anomalies with a short delay after the anomalies appear, and with minimal …
Total citations
[Chart: citations per year, 2018-2024]
Scholar articles
JA Cid-Fuentes, C Szabo, K Falkner - IEEE Transactions on Dependable and Secure …, 2018