Designing high-performance MPI libraries with on-the-fly compression for modern GPU clusters

Q Zhou, C Chu, NS Kumar, P Kousha… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
While the memory bandwidth of accelerators such as GPU has significantly improved over
the last decade, the commodity networks such as Ethernet and InfiniBand are lagging in …

Accelerating MPI all-to-all communication with online compression on modern GPU clusters

Q Zhou, P Kousha, Q Anthony… - … Conference on High …, 2022 - Springer
As more High-Performance Computing (HPC) and Deep Learning (DL) applications
are adapting to scale using GPUs, the communication of GPU-resident data is becoming …

Data compression for climate data

M Kuhn, JM Kunkel, T Ludwig - Supercomputing frontiers and …, 2016 - centaur.reading.ac.uk
The different rates of increase for computational power and storage capabilities of
supercomputers turn data storage into a technical and economical problem. Because …

MPC: a massively parallel compression algorithm for scientific data

A Yang, H Mukka, F Hesaaraki… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
Due to their high peak performance and energy efficiency, massively parallel accelerators
such as GPUs are quickly spreading in high-performance computing, where large amounts …

BurstZ+: Eliminating the communication bottleneck of scientific computing accelerators via accelerated compression

G Sun, S Kang, SW Jun - ACM Transactions on Reconfigurable …, 2022 - dl.acm.org
We present BurstZ+, an accelerator platform that eliminates the communication bottleneck
between PCIe-attached scientific computing accelerators and their host servers, via …

Accelerating lossy and lossless compression on emerging BlueField DPU architectures

Y Li, A Kashyap, W Chen, Y Guo… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Data compression has become a crucial technique in addressing performance bottlenecks
caused by increasing data volumes in High-Performance Computing (HPC), Big Data, and …

BurstZ: a bandwidth-efficient scientific computing accelerator platform for large-scale data

G Sun, S Kang, SW Jun - Proceedings of the 34th ACM International …, 2020 - dl.acm.org
We present BurstZ, a bandwidth-efficient accelerator platform for scientific computing. While
accelerators such as GPUs and FPGAs provide enormous computing capabilities, their …

Real-time synthesis of compression algorithms for scientific data

M Burtscher, H Mukka, A Yang… - SC'16: Proceedings of …, 2016 - ieeexplore.ieee.org
Many scientific programs produce large amounts of floating-point data that are saved for
later use. To minimize the storage requirement, it is worthwhile to compress such data as …

Adaptive-CoMPI: Enhancing MPI-based applications' performance and scalability by using adaptive compression

R Filgueira, DE Singh, J Carretero… - … Journal of High …, 2011 - journals.sagepub.com
This paper presents an optimization of MPI communication, called Adaptive-CoMPI, based
on runtime compression of MPI messages exchanged by applications. The technique …

Dynamic-CoMPI: Dynamic optimization techniques for MPI parallel applications

R Filgueira, J Carretero, DE Singh, A Calderon… - The Journal of …, 2012 - Springer
This work presents an optimization of MPI communications, called Dynamic-CoMPI, which
uses two techniques in order to reduce the impact of communications and non-contiguous …