D-Bot: Database diagnosis system using large language models

X Zhou, G Li, Z Sun, Z Liu, W Chen, J Wu, J Liu… - Proceedings of the …, 2024 - dl.acm.org
Database administrators (DBAs) play an important role in managing database systems.
However, it is hard and tedious for DBAs to manage vast database instances and give timely …

LLM as DBA

X Zhou, G Li, Z Liu - arXiv preprint arXiv:2308.05481, 2023 - arxiv.org
Database administrators (DBAs) play a crucial role in managing, maintaining and optimizing
a database system to ensure data availability, performance, and reliability. However, it is …

Failure Diagnosis in Microservice Systems: A Comprehensive Survey and Analysis

S Zhang, S Xia, W Fan, B Shi, X Xiong, Z Zhong… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern microservice systems have gained widespread adoption due to their high
scalability, flexibility, and extensibility. However, the characteristics of independent …

A survey on intelligent management of alerts and incidents in IT services

Q Yu, N Zhao, M Li, Z Li, H Wang, W Zhang… - Journal of Network and …, 2024 - Elsevier
Modern service systems are constantly improving with the development of various IT
technologies, leading to a boost in system scales and complex dependencies among …

CMDiagnostor: An Ambiguity-Aware Root Cause Localization Approach Based on Call Metric Data

Q Yu, C Pei, B Hao, M Li, Z Li, S Zhang, X Lu… - Proceedings of the …, 2023 - dl.acm.org
The availability of online services is vital as its strong relevance to revenue and user
experience. To ensure online services' availability, quickly localizing the root causes of …

Microservice Root Cause Analysis With Limited Observability Through Intervention Recognition in the Latent Space

Z Xie, S Zhang, Y Geng, Y Zhang, M Ma, X Nie… - Proceedings of the 30th …, 2024 - dl.acm.org
Many failure root cause analysis (RCA) algorithms for microservices have been proposed
with the widespread adoption of microservices systems. Existing algorithms generally focus …

MetricSifter: Feature Reduction of Multivariate Time Series Data for Efficient Fault Localization in Cloud Applications

Y Tsubouchi, H Tsuruta - IEEE Access, 2024 - ieeexplore.ieee.org
Automated fault localization in large-scale cloud-based applications is challenging because
it involves mining multivariate time series data from large volumes of operational monitoring …

A Scenario-Oriented Benchmark for Assessing AIOps Algorithms in Microservice Management

Y Sun, J Wang, Z Li, X Nie, M Ma, S Zhang, Y Ji… - arXiv preprint arXiv …, 2024 - arxiv.org
AIOps algorithms play a crucial role in the maintenance of microservice systems. Many
previous benchmarks' performance leaderboard provides valuable guidance for selecting …

Performance diagnosis of Oracle database systems based on image encoding and VGG16 model

X Liao, H Zheng, H Wang, M Hong, X Lin, X Zhu… - IEEE …, 2024 - ieeexplore.ieee.org
This paper proposes a novel multivariate performance diagnostic approach for the Oracle
database systems to detect performance degradation and crashes during database …

Illuminating the Gray Zone: Non-intrusive Gray Failure Localization in Server Operating Systems

S Zhang, Y Zhao, X Xiong, Y Sun, X Nie… - … Proceedings of the …, 2024 - dl.acm.org
Timely localization of the root causes of gray failure is essential for maintaining the stability
of the server OS. The previous intrusive gray failure localization methods usually require …