- Academic resource search

Dissociating language and thought in large language models

K Mahowald, AA Ivanova, IA Blank, N Kanwisher… - Trends in Cognitive …, 2024 - cell.com

Large language models (LLMs) have come closest among all models to date to mastering
human language, yet opinions about their linguistic and cognitive capabilities remain split …

Cited by 327 · Related articles · All 10 versions

[PDF] arxiv.org

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Cited by 337 · Related articles · All 3 versions

[PDF] arxiv.org

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 2407 · Related articles · All 4 versions

[PDF] arxiv.org

Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arXiv preprint arXiv …, 2023 - arxiv.org

While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

Cited by 597 · Related articles · All 2 versions

[PDF] neurips.cc

Inference-time intervention: Eliciting truthful answers from a language model

K Li, O Patel, F Viégas, H Pfister… - Advances in Neural …, 2024 - proceedings.neurips.cc

We introduce Inference-Time Intervention (ITI), a technique designed to enhance
the "truthfulness" of large language models (LLMs). ITI operates by shifting model activations …

Cited by 186 · Related articles · All 6 versions

[PDF] arxiv.org

RWKV: Reinventing RNNs for the transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org

Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …

Cited by 285 · Related articles · All 9 versions

[PDF] thecvf.com

Erasing concepts from diffusion models

R Gandikota, J Materzynska… - Proceedings of the …, 2023 - openaccess.thecvf.com

Motivated by concerns that large-scale diffusion models can produce undesirable output
such as sexually explicit content or copyrighted artistic styles, we study erasure of specific …

Cited by 172 · Related articles · All 5 versions

[PDF] acm.org

Talking about large language models

M Shanahan - Communications of the ACM, 2024 - dl.acm.org

As LLMs become more powerful,
it becomes increasingly tempting to describe LLM-based dialog agents in human-like terms …

Cited by 277 · Related articles · All 5 versions

[PDF] neurips.cc

Towards automated circuit discovery for mechanistic interpretability

A Conmy, A Mavor-Parker, A Lynch… - Advances in …, 2023 - proceedings.neurips.cc

Through considerable effort and intuition, several recent works have reverse-engineered
nontrivial behaviors of transformer models. This paper systematizes the mechanistic …

Cited by 133 · Related articles · All 6 versions

[PDF] arxiv.org

Mass-editing memory in a transformer

K Meng, AS Sharma, A Andonian, Y Belinkov… - arXiv preprint arXiv …, 2022 - arxiv.org

Recent work has shown exciting promise in updating large language models with new
memories, so as to replace obsolete information or add specialized knowledge. However …

Cited by 320 · Related articles · All 5 versions