Online diagnosis of performance variation in HPC systems using machine learning
As the size and complexity of high performance computing (HPC) systems grow in line with
advancements in hardware and software technology, HPC systems increasingly suffer from …
advancements in hardware and software technology, HPC systems increasingly suffer from …
Stochastic conformal anomaly detection and resolution for air traffic control
Safety is of the utmost importance in the air traffic system. In recent years, data-driven
algorithms have emerged to identify anomalous and potentially unsafe operations based on …
algorithms have emerged to identify anomalous and potentially unsafe operations based on …
InstantOps: A Joint Approach to System Failure Prediction and Root Cause Identification in Microserivces Cloud-Native Applications
As microservice and cloud computing operations increasingly adopt automation, the
importance of models for fostering resilient and efficient adaptive architectures becomes …
importance of models for fostering resilient and efficient adaptive architectures becomes …
Albadross: Active learning based anomaly diagnosis for production hpc systems
Diagnosing causes of performance variations in High-Performance Computing (HPC)
systems is a daunting chal-lenge due to the systems' scale and complexity. Variations in …
systems is a daunting chal-lenge due to the systems' scale and complexity. Variations in …
Method and system for performing effective orchestration of cognitive functions in distributed heterogeneous communication network
R Ravichandran, KM Manoharan… - US Patent …, 2022 - Google Patents
This disclosure relates to method and system for performing effective orchestration of
cognitive functions (CFs) in a distributed heterogeneous communication network. In one …
cognitive functions (CFs) in a distributed heterogeneous communication network. In one …
E2EWatch: an end-to-end anomaly diagnosis framework for production HPC systems
Abstract In today's High-Performance Computing (HPC) systems, application performance
variations are among the most vital challenges as they adversely affect system efficiency …
variations are among the most vital challenges as they adversely affect system efficiency …
Anomaly detection in the context of long-term cloud resource usage planning
P Nawrocki, W Sus - Knowledge and Information Systems, 2022 - Springer
This paper describes a new approach to automatic long-term cloud resource usage
planning with a novel hybrid anomaly detection mechanism. It analyzes existing anomaly …
planning with a novel hybrid anomaly detection mechanism. It analyzes existing anomaly …
Log anomaly to resolution: Ai based proactive incident remediation
R Mahindru, H Kumar, S Bansal - 2021 36th IEEE/ACM …, 2021 - ieeexplore.ieee.org
Based on 2020 SRE report, 80% of SREs work on postmortem analysis of incidents due to
lack of provided information and 16% of toil come from investigating false …
lack of provided information and 16% of toil come from investigating false …
Anomaly detection in scientific datasets using sparse representation
A Moon, M Kim, J Chen, SW Son - Proceedings of the First Workshop on …, 2023 - dl.acm.org
As the size and complexity of high-performance computing (HPC) systems keep growing,
scientists' ability to trust the data produced is paramount due to potential data corruption for …
scientists' ability to trust the data produced is paramount due to potential data corruption for …
Cloud-based autonomic computing framework for securing SCADA systems
This chapter proposes an autonomic computing security framework for protecting cloud-
based supervisory control and data acquisition (SCADA) systems against cyber threats …
based supervisory control and data acquisition (SCADA) systems against cyber threats …