Identifying HPC codes via performance logs and machine learning

O DeMasi, T Samak, DH Bailey - Proceedings of the first workshop on …, 2013 - dl.acm.org
We aim here to leverage supervised learning to enable large-scale analysis of performance
logs, in order to accurately classify code runs and understand the importance of different …

Visualizing distributed memory computations with hive plots

S Engle, S Whalen - Proceedings of the Ninth International Symposium …, 2012 - dl.acm.org
A hive plot is a network layout algorithm that uses a parallel coordinate plot in which axes
are radially arranged and node position is based on structural properties of that node [8]. We …

ASCR Cybersecurity for Scientific Computing Integrity

S Peisert - 2015 - escholarship.org
The Department of Energy (DOE) has the responsibility to address the energy,
environmental, and nuclear security challenges that face our nation. Much of DOE's …

Characterizing loop-level communication patterns in shared memory

A Mazaheri, A Jannesari, A Mirzaei… - 2015 44th International …, 2015 - ieeexplore.ieee.org
Communication patterns extracted from parallel programs can provide a valuable source of
information for parallel pattern detection, application auto-tuning, and runtime workload …

Fingerprinting anomalous computation with RNN for GPU-accelerated HPC machines

P Zou, A Li, K Barker, R Ge - 2019 IEEE International …, 2019 - ieeexplore.ieee.org
This paper presents a workload classification framework that discriminates illicit computation
from authorized workloads on GPU-accelerated HPC systems. As such heterogeneous …

Multiclass classification of distributed memory parallel computations

S Whalen, S Peisert, M Bishop - Pattern Recognition Letters, 2013 - Elsevier
High Performance Computing (HPC) is a field concerned with solving large-scale problems
in science and engineering. However, the computational infrastructure of HPC systems can …

Towards Real-Time Classification of HPC Workloads via Out-of-band Telemetry

S Presser - 2022 IEEE International Conference on Cluster …, 2022 - ieeexplore.ieee.org
Detecting illicit workloads on High Performance Computing (HPC) systems is an important
task. Such workloads might indicate a user account is being misused, for example to run …

Unveiling thread communication bottlenecks using hardware-independent metrics

A Mazaheri, F Wolf, A Jannesari - Proceedings of the 47th International …, 2018 - dl.acm.org
A critical factor for developing robust shared-memory applications is the efficient use of the
cache and the communication between threads. Inappropriate data structures, algorithm …

Detecting anomalous computation with RNNs on GPU-accelerated HPC machines

P Zou, A Li, K Barker, R Ge - … of the 49th International Conference on …, 2020 - dl.acm.org
This paper presents a workload classification framework that accurately discriminates illicit
computation from authorized workloads on GPU-accelerated HPC systems at runtime. As …

Catch me if you can: Using power analysis to identify HPC activity

B Copos, S Peisert - arXiv preprint arXiv:2005.03135, 2020 - arxiv.org
Monitoring users on large computing platforms such as high performance computing (HPC)
and cloud computing systems is non-trivial. Utilities such as process viewers provide limited …