Cloud-native computing: A survey from the perspective of services

S Deng, H Zhao, B Huang, C Zhang… - Proceedings of the …, 2024 - ieeexplore.ieee.org
The development of cloud computing delivery models inspires the emergence of cloud-native
computing. Cloud-native computing, as the most influential development principle for …

tprof: Performance profiling via structural aggregation and automated analysis of distributed systems traces

L Huang, T Zhu - Proceedings of the ACM Symposium on Cloud …, 2021 - dl.acm.org
The traditional approach for performance debugging relies upon performance profilers (e.g.,
gprof, VTune) that provide average function runtime information. These aggregate statistics …

An Empirical Study of High Performance Computing (HPC) Performance Bugs

MAK Azad, N Iqbal, F Hassan… - 2023 IEEE/ACM 20th …, 2023 - ieeexplore.ieee.org
Performance efficiency and scalability are the major design goals for high performance
computing (HPC) applications. However, it is challenging to achieve high efficiency and …

Vapro: Performance variance detection and diagnosis for production-run parallel applications

L Zheng, J Zhai, X Tang, H Wang, T Yu, Y Jin… - Proceedings of the 27th …, 2022 - dl.acm.org
Performance variance is a serious problem for parallel applications, which can cause
performance degradation and make applications' behavior hard to understand. Therefore …

Break dancing: low overhead, architecture neutral software branch tracing

G Marin, A Alexandrov, T Moseley - … of the 22nd ACM SIGPLAN/SIGBED …, 2021 - dl.acm.org
Sampling-based Feedback Directed Optimization (FDO) methods like AutoFDO and BOLT
that employ profiles collected in live production environments, are commonly used in …

Optimistic concurrency control for real-world Go programs

Z Zhang, M Chabbi, A Welc, T Sherwood - 2021 USENIX Annual …, 2021 - usenix.org
We present a source-to-source transformation framework, Gocc, that consumes lock-based
pessimistic concurrency programs in the Go language and transforms them into optimistic …

Detecting performance variance for parallel applications without source code

J Zhai, L Zheng, F Zhang, X Tang… - … on Parallel and …, 2022 - ieeexplore.ieee.org
For parallel applications, performance variance is a critical issue that can degrade
performance and make applications' behavior difficult to explain. Therefore, users and …

Performance Measurement, Analysis, and Optimization of GPU-accelerated Applications

K Zhou - 2022 - repository.rice.edu
The computing landscape is undergoing rapid evolution to meet the demand in
data-intensive applications and grand challenging scientific problems. Figure 1.1 illustrates …

ELS: Emulation system for debugging and tuning large-scale parallel programs on small clusters

F Lin, Y Liu, Y Guo, D Qian - The Journal of Supercomputing, 2021 - Springer
Continuous scaling-up of high-performance computing systems has brought challenges to
the debugging and tuning of large-scale parallel programs. Firstly, to locate bugs in a …

Production-Run Noise Detection

J Zhai, Y Jin, W Chen, W Zheng - … Analysis of Parallel Applications for HPC, 2023 - Springer
The performance variance detection approach in Chap. 7 relies on nontrivial source code
analysis that is impractical for production-run parallel applications. In this chapter, we further …