Compressing large language models by joint sparsification and quantization

J Guo, J Wu, Z Wang, J Liu, G Yang, Y Ding… - … on Machine Learning, 2024 - openreview.net
In this paper, we introduce a novel model compression technique named Joint Sparsification
and Quantization (JSQ), explicitly tailored for large language models (LLMs). Traditional …

PTSBench: A comprehensive post-training sparsity benchmark towards algorithms and models

Z Wang, J Guo, R Gong, Y Yong, A Liu… - Proceedings of the …, 2024 - dl.acm.org
With the increased attention to model efficiency, post-training sparsity (PTS) has become
more and more prevalent because of its effectiveness and efficiency. However, there remain …

VRDistill: Vote Refinement Distillation for Efficient Indoor 3D Object Detection

Z Yuan, J Guo, D An, J Wu, H Zhu, J Li, X Chen… - Proceedings of the …, 2024 - dl.acm.org
Recently, indoor 3D object detection has shown impressive progress. However, these
improvements have come at the cost of increased memory consumption and longer …

QVD: Post-training Quantization for Video Diffusion Models

S Tian, H Chen, C Lv, Y Liu, J Guo, X Liu, S Li… - Proceedings of the …, 2024 - dl.acm.org
Recently, video diffusion models (VDMs) have garnered significant attention due to their
notable advancements in generating coherent and realistic video content. However …

LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment

G Yang, C He, J Guo, J Wu, Y Ding, A Liu, H Qin… - arXiv preprint arXiv …, 2024 - arxiv.org
Although large language models (LLMs) have demonstrated their strong intelligence ability,
the high demand for computation and storage hinders their practical application. To this end …

On Efficient Variants of Segment Anything Model: A Survey

X Sun, J Liu, HT Shen, X Zhu, P Hu - arXiv preprint arXiv:2410.04960, 2024 - arxiv.org
The Segment Anything Model (SAM) is a foundational model for image segmentation tasks,
known for its strong generalization across diverse applications. However, its impressive …

Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare

Z Li, J Zhang, Q Gu - arXiv preprint arXiv:2410.01813, 2024 - arxiv.org
The disparity in healthcare personnel expertise and medical resources across different
regions of the world is a pressing social issue. Artificial intelligence technology offers new …

BiDM: Pushing the Limit of Quantization for Diffusion Models

X Zheng, X Liu, Y Bian, X Ma, Y Zhang, J Wang… - The Thirty-eighth Annual … - openreview.net
Diffusion models (DMs) have been significantly developed and widely used in various
applications due to their excellent generative qualities. However, the expensive computation …