sys-sage: A Unified Representation of Dynamic Topologies & Attributes on HPC Systems
S Vanecek, M Schulz - Proceedings of the 38th ACM International …, 2024 - dl.acm.org
HPC systems are getting ever more powerful, but this comes at the price of increasing
system complexity: node architectures are deeply hierarchical and in many cases …
Automating telemetry- and trace-based analytics on large-scale distributed systems
E Ates - 2020 - search.proquest.com
Large-scale distributed systems—such as supercomputers, cloud computing platforms, and
distributed applications—routinely suffer from slowdowns and crashes due to software and …