Longformer: The long-document transformer

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com

Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

Cited by 402 Related articles All 10 versions

A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier

Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

Cited by 198 Related articles All 5 versions

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 3160 Related articles All 4 versions

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

L Huang, W Yu, W Ma, W Zhong, Z Feng… - ACM Transactions on …, 2023 - dl.acm.org

The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …

Cited by 686 Related articles All 2 versions

Lost in the middle: How language models use long contexts

NF Liu, K Lin, J Hewitt, A Paranjape… - Transactions of the …, 2024 - direct.mit.edu

While recent language models have the ability to take long contexts as input, relatively little
is known about how well they use longer context. We analyze the performance of language …

Cited by 992 Related articles All 11 versions

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

Cited by 1582 Related articles All 2 versions

Unifying large language models and knowledge graphs: A roadmap

S Pan, L Luo, Y Wang, C Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Large language models (LLMs), such as ChatGPT and GPT4, are making new waves in the
field of natural language processing and artificial intelligence, due to their emergent ability …

Cited by 706 Related articles All 5 versions

SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models

P Manakul, A Liusie, MJF Gales - arXiv preprint arXiv:2303.08896, 2023 - arxiv.org

Generative Large Language Models (LLMs) such as GPT-3 are capable of generating highly
fluent responses to a wide variety of user prompts. However, LLMs are known to hallucinate …

Cited by 550 Related articles All 7 versions

RWKV: Reinventing RNNs for the transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org

Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …

Cited by 386 Related articles All 9 versions

Mistral 7B

AQ Jiang, A Sablayrolles, A Mensch, C Bamford… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for
superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all …

Cited by 1033 Related articles All 3 versions