A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
Parameter-efficient fine-tuning for large models: A comprehensive survey
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …
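A minimal sketch of one technique such PEFT surveys cover: a LoRA-style adapter that freezes the pretrained weights and trains only a low-rank update. The class name, rank, and scaling are illustrative choices, not taken from the survey.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: only rank * (in + out) parameters are trained.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x):
        # Base path is frozen; the low-rank path carries all adaptation.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(nn.Linear(512, 512))
y = layer(torch.randn(4, 512))  # trainable params: 2 * 8 * 512 vs. 512 * 512
```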
AWQ: Activation-aware Weight Quantization for On-Device LLM Compression and Acceleration
Large language models (LLMs) have shown excellent performance on various tasks, but the
astronomical model size raises the hardware barrier for serving (memory size) and slows …
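A hedged NumPy sketch of the activation-aware idea: scale weight input channels by a power of their average activation magnitude before round-to-nearest quantization, so salient channels lose less precision. The exponent is a fixed illustrative value (AWQ searches it), and per-row quantization stands in for the paper's group-wise scheme.

```python
import numpy as np

def quantize_awq_style(W, act_mag, n_bits=4, alpha=0.5):
    """W: (out, in) weights; act_mag: (in,) mean |activation| per channel."""
    s = np.clip(act_mag, 1e-5, None) ** alpha        # per-input-channel scale
    s /= np.sqrt(s.max() * s.min())                  # center scales around 1
    Ws = W * s                                       # protect salient channels
    qmax = 2 ** (n_bits - 1) - 1
    step = np.abs(Ws).max(axis=1, keepdims=True) / qmax   # per-row step size
    Wq = np.clip(np.round(Ws / step), -qmax - 1, qmax)    # INT4 grid
    # Dequantize and fold the inverse scale back; at inference the inverse
    # scale would instead be fused into the preceding activations.
    return (Wq * step) / s

W = np.random.randn(64, 128).astype(np.float32)
X = np.abs(np.random.randn(256, 128)).astype(np.float32)
W_deq = quantize_awq_style(W, X.mean(axis=0))
print(np.abs(W - W_deq).mean())  # quantization error after scaling
```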
NusaCrowd: Open source initiative for Indonesian NLP resources
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …
Multi-concept customization of text-to-image diffusion
While generative models produce high-quality images of concepts learned from a large-
scale database, a user often wishes to synthesize instantiations of their own concepts (for …
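A hedged sketch of the parameter selection this line of work describes: freeze the diffusion U-Net and fine-tune only the cross-attention key/value projections (alongside new concept token embeddings, not shown). The attribute names "attn2", "to_k", and "to_v" follow common diffusers-style conventions and are assumptions about the model layout.

```python
import torch.nn as nn

def select_trainable_params(unet: nn.Module):
    """Freeze everything, then unfreeze cross-attention K/V projections."""
    for p in unet.parameters():
        p.requires_grad_(False)
    trainable = []
    for name, module in unet.named_modules():
        # "attn2" marks cross-attention (text conditioning) in many U-Nets.
        if "attn2" in name and name.endswith(("to_k", "to_v")):
            for p in module.parameters():
                p.requires_grad_(True)
                trainable.append(p)
    return trainable  # typically a small fraction of U-Net parameters
```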
A survey on model compression for large language models
Large Language Models (LLMs) have transformed natural language processing
tasks successfully. Yet, their large size and high computational needs pose challenges for …
Deja Vu: Contextual sparsity for efficient LLMs at inference time
Large language models (LLMs) with hundreds of billions of parameters have sparked a new
wave of exciting AI applications. However, they are computationally expensive at inference …
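A hedged sketch of contextual sparsity applied to an MLP block: a cheap learned scorer predicts which hidden neurons matter for the incoming token, and only those columns of the up-projection (and matching rows of the down-projection) are computed. The predictor shape and k are illustrative; the paper trains such predictors per layer.

```python
import torch

def sparse_mlp(x, W_up, W_down, W_pred, k):
    """x: (d,), W_up: (h, d), W_down: (d, h), W_pred: (h, d) cheap scorer."""
    scores = W_pred @ x                       # predict which neurons matter
    idx = torch.topk(scores, k).indices       # contextual top-k neuron set
    h = torch.relu(W_up[idx] @ x)             # compute only k of h neurons
    return W_down[:, idx] @ h                 # gather matching down-proj rows

d, hdim, k = 256, 1024, 128
x = torch.randn(d)
out = sparse_mlp(x, torch.randn(hdim, d), torch.randn(d, hdim),
                 torch.randn(hdim, d), k)     # roughly k/h of the MLP FLOPs
```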
FlashAttention: Fast and memory-efficient exact attention with IO-awareness
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …
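A hedged NumPy sketch of the IO-aware idea: compute exact attention tile by tile with an online softmax, so the full (n, n) score matrix is never materialized. The block size is illustrative; the real kernel tiles to fit GPU SRAM.

```python
import numpy as np

def blocked_attention(Q, K, V, block=64):
    n, d = Q.shape
    O = np.zeros_like(V)
    m = np.full(n, -np.inf)          # running row-wise max of scores
    l = np.zeros(n)                  # running softmax denominator
    for j in range(0, n, block):
        S = Q @ K[j:j + block].T / np.sqrt(d)      # one tile of scores
        m_new = np.maximum(m, S.max(axis=1))
        # Rescale the previously accumulated output and denominator, then
        # add this tile's contribution (standard online-softmax update).
        corr = np.exp(m - m_new)
        P = np.exp(S - m_new[:, None])
        l = l * corr + P.sum(axis=1)
        O = O * corr[:, None] + P @ V[j:j + block]
        m = m_new
    return O / l[:, None]

n, d = 256, 32
Q, K, V = (np.random.randn(n, d) for _ in range(3))
S = Q @ K.T / np.sqrt(d)                           # reference: full attention
P = np.exp(S - S.max(axis=1, keepdims=True))
ref = (P / P.sum(axis=1, keepdims=True)) @ V
assert np.allclose(blocked_attention(Q, K, V), ref, atol=1e-6)  # exact, not approximate
```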
A simple and effective pruning approach for large language models
As their size increases, Large Language Models (LLMs) are natural candidates for network
pruning methods: approaches that drop a subset of network weights while striving to …
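A hedged sketch of the pruning metric this line of work uses: score each weight by its magnitude times the norm of its input activations on calibration data, then zero the lowest-scoring fraction within each output row. No retraining step is shown, and the sparsity level is illustrative.

```python
import numpy as np

def prune_by_weight_and_activation(W, X, sparsity=0.5):
    """W: (out, in) weights; X: (tokens, in) calibration activations."""
    score = np.abs(W) * np.linalg.norm(X, axis=0)   # |W_ij| * ||X_j||_2
    k = int(W.shape[1] * sparsity)                  # weights to drop per row
    # Per-output-row threshold: keep only the highest-scoring weights.
    cutoff = np.partition(score, k - 1, axis=1)[:, k - 1:k]
    return np.where(score > cutoff, W, 0.0)

W = np.random.randn(128, 256)
X = np.random.randn(512, 256)
W_sparse = prune_by_weight_and_activation(W, X)
print((W_sparse == 0).mean())  # ~0.5
```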
Patch Diffusion: Faster and more data-efficient training of diffusion models
Diffusion models are powerful, but they require a lot of time and data to train. We propose
Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training …
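A hedged sketch of the patch-wise training input described above: crop a random patch per image and append normalized pixel-coordinate channels so the denoiser knows where the patch sits in the full image. The patch size and this particular coordinate encoding are illustrative choices.

```python
import torch

def sample_training_patch(img, patch=64):
    """img: (C, H, W) -> (C + 2, patch, patch) patch with coordinate channels."""
    _, H, W = img.shape
    top = torch.randint(0, H - patch + 1, (1,)).item()
    left = torch.randint(0, W - patch + 1, (1,)).item()
    crop = img[:, top:top + patch, left:left + patch]
    # Normalized (y, x) coordinates of every patch pixel in the full image.
    ys = torch.linspace(top, top + patch - 1, patch) / (H - 1)
    xs = torch.linspace(left, left + patch - 1, patch) / (W - 1)
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    return torch.cat([crop, yy[None], xx[None]], dim=0)

x = sample_training_patch(torch.randn(3, 256, 256))
print(x.shape)  # torch.Size([5, 64, 64]) -> fed to the diffusion denoiser
```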