SOMA: Observability, monitoring, and in situ analytics for exascale applications

D Yokelson, O Lappi, S Ramesh… - Concurrency and …, 2024 - Wiley Online Library
With the rise of exascale systems and large, data‐centric workflows, the need to observe
and analyze high performance computing (HPC) applications during their execution is …

Security test generation using threat trees

A Marback, H Do, K He… - 2009 ICSE Workshop …, 2009 - ieeexplore.ieee.org
Software security issues have been a major concern to the cyberspace community, so a
great deal of research on security testing has been performed, and various security testing …

ParLoT: Efficient Whole-Program Call Tracing for HPC Applications

S Taheri, S Devale, G Gopalakrishnan… - … Workshops, ESPT 2017 …, 2019 - Springer
The complexity of HPC software and hardware is quickly increasing. As a consequence, the
need for efficient execution tracing to gain insight into HPC application behavior is steadily …

Window-based, discontinuity preserving stereo

M Agrawal, LS Davis - Proceedings of the 2004 IEEE Computer …, 2004 - ieeexplore.ieee.org
Traditionally, the problem of stereo matching has been addressed either by a local
window-based approach or a dense pixel-based approach using global optimization. In this paper …

Tracing task‐based runtime systems: Feedbacks from the StarPU case

A Denis, E Jeannot, P Swartvagher… - Concurrency and …, 2024 - Wiley Online Library
Given the complexity of current supercomputers and applications, being able to trace
application executions to understand their behavior is not a luxury. As constraints, tracing …

A novel context-based risk assessment approach in vehicular networks

F Ahmad, A Adnane - 2016 30th International Conference on …, 2016 - ieeexplore.ieee.org
Vehicular Networks (VANET) are the largest real life application of ad-hoc networks where
nodes are represented via fast moving vehicles. As VANET is characterised with several …

Distributed wait state tracking for runtime MPI deadlock detection

T Hilbrich, BR de Supinski, WE Nagel, J Protze… - Proceedings of the …, 2013 - dl.acm.org
The widely used Message Passing Interface (MPI) with its multitude of communication
functions is prone to usage errors. Runtime error detection tools aid in the removal of these …

Observability, monitoring, and in situ analytics in exascale applications

D Yokelson, O Lappi, S Ramesh, M Vaisala, K Huck… - Cray User Group, 2023 - cug.org
With the rise of exascale systems and large, datacentric workflows, the need to observe and
analyze high performance computing (HPC) applications during their execution is becoming …

Efficient tracing and performance analysis for large distributed systems

E Anderson, C Hoover, X Li… - 2009 IEEE International …, 2009 - ieeexplore.ieee.org
Distributed systems are notoriously difficult to implement and debug. One important tool for
understanding the behavior of distributed systems is tracing. Unfortunately, effective tracing …

Aggregation of real-time system monitoring data for analyzing large-scale parallel and distributed computing environments

S Bohm, C Engelmann, SL Scott - 2010 IEEE 12th International …, 2010 - ieeexplore.ieee.org
We present a monitoring system for large-scale parallel and distributed computing
environments that allows to trade-off accuracy in a tunable fashion to gain scalability without …