Glam: Efficient scaling of language models with mixture-of-experts

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (eg, BERT, ChatGPT, and GPT-4) is …

被引用次数：430 相关文章所有 2 个版本

[PDF] arxiv.org

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

被引用次数：285 相关文章所有 3 个版本

[PDF] arxiv.org

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

被引用次数：2093 相关文章所有 4 个版本

[HTML] nature.com

[HTML][HTML] Large language models encode clinical knowledge

K Singhal, S Azizi, T Tu, SS Mahdavi, J Wei, HW Chung… - Nature, 2023 - nature.com

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for
clinical applications is high. Attempts to assess the clinical knowledge of models typically …

被引用次数：1104 相关文章所有 10 个版本

[PDF] arxiv.org

Palm 2 technical report

R Anil, AM Dai, O Firat, M Johnson, D Lepikhin… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and
reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is …

被引用次数：1124 相关文章所有 2 个版本

[PDF] mlr.press

Smoothquant: Accurate and efficient post-training quantization for large language models

G Xiao, J Lin, M Seznec, H Wu… - International …, 2023 - proceedings.mlr.press

Large language models (LLMs) show excellent performance but are compute-and memory-
intensive. Quantization can reduce memory and accelerate inference. However, existing …

被引用次数：455 相关文章所有 7 个版本

[PDF] acm.org

Harnessing the power of llms in practice: A survey on chatgpt and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org

This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

被引用次数：398 相关文章所有 6 个版本

[PDF] arxiv.org

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

被引用次数：819 相关文章所有 2 个版本

[PDF] neurips.cc

Language models can solve computer tasks

G Kim, P Baldi, S McAleer - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Agents capable of carrying out general tasks on a computer can improve efficiency and
productivity by automating repetitive tasks and assisting in complex problem-solving. Ideally …

被引用次数：202 相关文章所有 6 个版本

[PDF] arxiv.org

Pali: A jointly-scaled multilingual language-image model

X Chen, X Wang, S Changpinyo… - arXiv preprint arXiv …, 2022 - arxiv.org

Effective scaling and a flexible task interface enable large language models to excel at many
tasks. We present PaLI (Pathways Language and Image model), a model that extends this …

被引用次数：506 相关文章所有 6 个版本