ZFP: A compressed array representation for numerical computations

P Lindstrom, J Hittinger, J Diffenderfer… - … Journal of High …, 2024 - journals.sagepub.com
HPC trends favor algorithms and implementations that reduce data motion relative to
FLOPS. We investigate the use of lossy compressed data arrays in place of traditional IEEE …

Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression

H Feng, B Zhang, F Ye, M Si, CH Chu… - … Conference for High …, 2024 - ieeexplore.ieee.org
DLRM is a state-of-the-art recommendation system model that has gained widespread
adoption across various industry applications. The large size of DLRM models, however …

Designing Converged Middleware for HPC, AI, and Big Data: Challenges and Opportunities

DK Panda, H Subramoni, M Abduljabbar… - … Conference of Cloud …, 2024 - Springer
The field of computing has been evolving over the years with the need for High-Performance
Computing (HPC), Deep Learning (DL), and Machine Learning (ML) on heterogeneous …

Zfp

P Lindstrom - … Livermore National Laboratory. [Online]. Available: https …, 2015 - ipo.llnl.gov
The zfp software library provides a comprehensive solution to both lossy and lossless data
compression. zfp reduces the storage space of high-precision floating-point data without …

KVSort: Drastically Improving LLM Inference Performance via KV Cache Compression

B Sun, X Yu, D Tao - sc24.supercomputing.org
Large Language Model (LLM) deployment necessitates high inference throughput
due to the increasing demand for text generation. To accelerate inference, the prefill …