Performance anomaly detection and bottleneck identification

O Ibidunmoye, F Hernández-Rodriguez… - ACM Computing Surveys …, 2015 - dl.acm.org
In order to meet stringent performance requirements, system administrators must effectively
detect undesirable performance behaviours, identify potential root causes, and take …

Sage: practical and scalable ML-driven performance debugging in microservices

Y Gan, M Liang, S Dev, D Lo, C Delimitrou - Proceedings of the 26th …, 2021 - dl.acm.org
Cloud applications are increasingly shifting from large monolithic services to complex
graphs of loosely-coupled microservices. Despite the advantages of modularity and …

A systematic mapping study in AIOps

P Notaro, J Cardoso, M Gerndt - International Conference on Service …, 2020 - Springer
IT systems of today are becoming larger and more complex, rendering their human
supervision more difficult. Artificial Intelligence for IT Operations (AIOps) has been proposed …

A survey of AIOps methods for failure management

P Notaro, J Cardoso, M Gerndt - ACM Transactions on Intelligent …, 2021 - dl.acm.org
Modern society is increasingly moving toward complex and distributed computing systems.
The increase in scale and complexity of these systems challenges O&M teams that perform …

Studying the effectiveness of application performance management (APM) tools for detecting performance regressions for web applications: an experience report

TM Ahmed, CP Bezemer, TH Chen… - Proceedings of the 13th …, 2016 - dl.acm.org
Performance regressions, such as a higher CPU utilization than in the previous version of an
application, are caused by software application updates that negatively affect the …

Automated dynamic firmware analysis at scale: a case study on embedded web interfaces

A Costin, A Zarras, A Francillon - Proceedings of the 11th ACM on Asia …, 2016 - dl.acm.org
Embedded devices are becoming more widespread, interconnected, and web-enabled than
ever. However, recent studies showed that embedded devices are far from being secure …

Understanding and detecting real-world performance bugs

G Jin, L Song, X Shi, J Scherpelz, S Lu - ACM SIGPLAN Notices, 2012 - dl.acm.org
Developers frequently use inefficient code sequences that could be fixed by simple patches.
These inefficient code sequences can cause significant performance degradation and …

Microscope: Pinpoint performance issues with causal graphs in micro-service environments

JJ Lin, P Chen, Z Zheng - … , ICSOC 2018, Hangzhou, China, November 12 …, 2018 - Springer
Driven by the emerging business models (e.g., digital sales) and IT technologies (e.g., DevOps
and Cloud computing), the architecture of software is shifting from monolithic to microservice …

Learning to log: Helping developers make informed logging decisions

J Zhu, P He, Q Fu, H Zhang, MR Lyu… - 2015 IEEE/ACM 37th …, 2015 - ieeexplore.ieee.org
Logging is a common programming practice of practical importance to collect system
runtime information for postmortem analysis. Strategic logging placement is desired to cover …

Structured comparative analysis of systems logs to diagnose performance problems

K Nagaraj, C Killian, J Neville - 9th USENIX Symposium on Networked …, 2012 - usenix.org
Diagnosis and correction of performance issues in modern, large-scale distributed systems
can be a daunting task, since a single developer is unlikely to be familiar with the entire …