Using Language Models on Low-end Hardware

F Ziegner, J Borst, A Niekler, M Potthast - arXiv preprint arXiv:2305.02350, 2023 - arxiv.org
This paper evaluates the viability of using fixed language models for training text classification networks on low-end hardware. We combine language models with a CNN architecture and put together a comprehensive benchmark with 8 datasets covering single-label and multi-label classification of topic, sentiment, and genre. Our observations are distilled into a list of trade-offs, concluding that there are scenarios where not fine-tuning a language model yields competitive effectiveness with faster training, requiring only a quarter of the memory compared to fine-tuning.
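To illustrate the setup described in the abstract (a fixed, i.e. frozen, language model feeding a trainable CNN classification head), here is a minimal sketch in PyTorch. It assumes a Hugging Face `transformers` encoder; the model name, kernel sizes, and filter counts are placeholders, not the paper's configuration.

```python
# Sketch: frozen language model as feature extractor + trainable CNN head.
# Hypothetical illustration, not the authors' exact architecture.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class CNNOverFrozenLM(nn.Module):
    def __init__(self, lm_name="bert-base-uncased", num_labels=4,
                 kernel_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.lm = AutoModel.from_pretrained(lm_name)
        # Freeze the language model: only the CNN head receives gradients,
        # which cuts memory use and speeds up training on low-end hardware.
        for p in self.lm.parameters():
            p.requires_grad = False
        hidden = self.lm.config.hidden_size
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes
        )
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_labels)

    def forward(self, input_ids, attention_mask):
        # No gradients or stored activations for the frozen encoder.
        with torch.no_grad():
            hidden_states = self.lm(
                input_ids=input_ids, attention_mask=attention_mask
            ).last_hidden_state
        x = hidden_states.transpose(1, 2)  # (batch, hidden, seq_len)
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.classifier(torch.cat(pooled, dim=1))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = CNNOverFrozenLM()
batch = tokenizer(["a short example document for topic classification"],
                  return_tensors="pt", padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])
```

Because the encoder runs under `torch.no_grad()`, only the convolutional and linear layers keep activations for backpropagation, which is the source of the memory savings the abstract refers to.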