ZFP: A compressed array representation for numerical computations
HPC trends favor algorithms and implementations that reduce data motion relative to
FLOPS. We investigate the use of lossy compressed data arrays in place of traditional IEEE …
Accelerating Communication in Deep Learning Recommendation Model Training with Dual-Level Adaptive Lossy Compression
DLRM is a state-of-the-art recommendation system model that has gained widespread
adoption across various industry applications. The large size of DLRM models, however …
Designing Converged Middleware for HPC, AI, and Big Data: Challenges and Opportunities
The field of computing has been evolving over the years with the need for High-Performance
Computing (HPC), Deep Learning (DL), and Machine Learning (ML) on heterogeneous …
Zfp
P Lindstrom - … Livermore National Laboratory.[Online]. Available: https …, 2015 - ipo.llnl.gov
The zfp software library provides a comprehensive solution to both lossy and lossless data
compression. zfp reduces the storage space of high-precision floating-point data without …
KVSort: Drastically Improving LLM Inference Performance via KV Cache Compression
Large Language Model (LLM) deployment necessitates high inference throughput
due to the increasing demand for text generation. To accelerate inference, the prefill …