Language model behavior: A comprehensive survey

TA Chang, BK Bergen - Computational Linguistics, 2024 - direct.mit.edu
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …

The science of detecting LLM-generated text

R Tang, YN Chuang, X Hu - Communications of the ACM, 2024 - dl.acm.org
Communications of the ACM, Volume 67, Number 4 (2024), Pages 50-59. The Science of Detecting LLM-Generated Text …

Autoregressive search engines: Generating substrings as document identifiers

M Bevilacqua, G Ottaviano, P Lewis… - Advances in …, 2022 - proceedings.neurips.cc
Knowledge-intensive language tasks require NLP systems to both provide the
correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive …

Survey on factuality in large language models: Knowledge, retrieval and domain-specificity

C Wang, X Liu, Y Yue, X Tang, T Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As
LLMs find applications across diverse domains, the reliability and accuracy of their outputs …

Autoregressive entity retrieval

N De Cao, G Izacard, S Riedel, F Petroni - arXiv preprint arXiv:2010.00904, 2020 - arxiv.org
Entities are at the center of how we represent and aggregate knowledge. For instance,
encyclopedias such as Wikipedia are structured by entities (e.g., one per Wikipedia article) …

Recipes for building an open-domain chatbot

S Roller, E Dinan, N Goyal, D Ju, M Williamson… - arXiv preprint arXiv …, 2020 - arxiv.org
Building open-domain chatbots is a challenging area for machine learning research. While
prior work has shown that scaling neural models in the number of parameters and the size of …

MAUVE: Measuring the gap between neural text and human text using divergence frontiers

K Pillutla, S Swayamdipta, R Zellers… - Advances in …, 2021 - proceedings.neurips.cc
As major progress is made in open-ended text generation, measuring how close
machine-generated text is to human language remains a critical open problem. We introduce MAUVE …

ZeroGen: Efficient zero-shot learning via dataset generation

J Ye, J Gao, Q Li, H Xu, J Feng, Z Wu, T Yu… - arXiv preprint arXiv …, 2022 - arxiv.org
There has been growing interest in dataset generation recently, owing to the superior
generative capacity of large pre-trained language models (PLMs). In this paper, we study a flexible and …

Reframing human-AI collaboration for generating free-text explanations

S Wiegreffe, J Hessel, S Swayamdipta, M Riedl… - arXiv preprint arXiv …, 2021 - arxiv.org
Large language models are increasingly capable of generating fluent-appearing text with
relatively little task-specific supervision. But can these models accurately explain …

Retrieval-augmented generation for knowledge-intensive NLP tasks

P Lewis, E Perez, A Piktus, F Petroni… - Advances in …, 2020 - proceedings.neurips.cc
Large pre-trained language models have been shown to store factual knowledge in their
parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks …