Nezha: Interpretable fine-grained root causes analysis for microservices on multi-modal observability data
Root cause analysis (RCA) in large-scale microservice systems is a critical and challenging
task. To understand and localize root causes of unexpected faults, modern observability …
Failure Diagnosis in Microservice Systems: A Comprehensive Survey and Analysis
Modern microservice systems have gained widespread adoption due to their high
scalability, flexibility, and extensibility. However, the characteristics of independent …
Predictive monitoring against pattern regular languages
While current bug detection techniques for concurrent software focus on unearthing low-
level issues such as data races or deadlocks, they often fall short of discovering more …
Logrule: Efficient structured log mining for root cause analysis
Accurate, timely Root Cause Analysis (RCA) is essential to successful IT operations as a
primary step to incident remediation. RCA automation using data mining techniques in large …
Explaining mispredictions of machine learning models using rule induction
While machine learning (ML) models play an increasingly prevalent role in many software
engineering tasks, their prediction accuracy is often problematic. When these models do …
Conan: Diagnosing batch failures for cloud systems
Failure diagnosis is critical to the maintenance of large-scale cloud systems, which has
attracted tremendous attention from academia and industry over the last decade. In this …
What is an app store? The software engineering perspective
“App stores” are online software stores where end users may browse, purchase,
download, and install software applications. By far, the best known app stores are …
Expert perspectives on explainability
MAY/JUNE 2023| IEEE SOFTWARE 85 for changes, and these sorts of signals. We are also
looking to capture discussions that happen around code in internal discussion forums (a bit …
Root cause analysis of anomalies based on graph convolutional neural network
Z Li, Y Tu, Z Ma - International Journal of Software Engineering and …, 2022 - World Scientific
With the gradual increase of network complexity and network scale in the cloud
environment, Root Cause Analysis (RCA) of node failures has become a systematic problem …
Trace-based Multi-Dimensional Root Cause Localization of Performance Issues in Microservice Systems
Modern microservice systems have become increasingly complicated due to the dynamic
and complex interactions and runtime environment. It leads to the system vulnerable to …