Enabling large language models to generate text with citations

T Gao, H Yen, J Yu, D Chen - arXiv preprint arXiv:2305.14627, 2023 - arxiv.org
Large language models (LLMs) have emerged as a widely-used tool for information
seeking, but their generated outputs are prone to hallucination. In this work, our aim is to …

Factuality enhanced language models for open-ended text generation

N Lee, W Ping, P Xu, M Patwary… - Advances in …, 2022 - proceedings.neurips.cc
Pretrained language models (LMs) are susceptible to generating text with nonfactual
information. In this work, we measure and improve the factual accuracy of large-scale LMs …

Autoregressive search engines: Generating substrings as document identifiers

M Bevilacqua, G Ottaviano, P Lewis… - Advances in …, 2022 - proceedings.neurips.cc
Knowledge-intensive language tasks require NLP systems to both provide the
correct answer and retrieve supporting evidence for it in a given corpus. Autoregressive …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic in information seeking, where a system is
required to return relevant information resources for users' queries in natural language. From …

Internet-augmented language models through few-shot prompting for open-domain question answering

A Lazaridou, E Gribovskaya, W Stokowiec… - arXiv preprint arXiv …, 2022 - arxiv.org
In this work, we aim to capitalize on the unique few-shot capabilities of large-scale language
models (LSLMs) to overcome some of their challenges with respect to grounding to factual …

RARR: Researching and revising what language models say, using language models

L Gao, Z Dai, P Pasupat, A Chen, AT Chaganty… - arXiv preprint arXiv …, 2022 - arxiv.org
Language models (LMs) now excel at many tasks such as few-shot learning, question
answering, reasoning, and dialog. However, they sometimes generate unsupported or …

International Workshop on Multimodal Learning 2023. Theme: Multimodal Learning with Foundation Models

Y Ling, F Wu, S Dong, Y Feng, G Karypis… - Proceedings of the 29th …, 2023 - dl.acm.org
The recent advancements in machine learning and artificial intelligence (particularly
foundation models such as BERT, GPT-3, T5, ResNet, etc.) have demonstrated remarkable …

Re2G: Retrieve, rerank, generate

M Glass, G Rossiello, MFM Chowdhury… - arXiv preprint arXiv …, 2022 - arxiv.org
As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces
become larger and larger. However, for tasks that require a large amount of knowledge, non …

TemporalWiki: A lifelong benchmark for training and evaluating ever-evolving language models

J Jang, S Ye, C Lee, S Yang, J Shin, J Han… - arXiv preprint arXiv …, 2022 - arxiv.org
Language Models (LMs) become outdated as the world changes; they often fail to perform
tasks requiring recent factual information which was absent or different during training, a …

ExpertQA: Expert-curated questions and attributed answers

C Malaviya, S Lee, S Chen, E Sieber, M Yatskar… - arXiv preprint arXiv …, 2023 - arxiv.org
As language models are adapted by a more sophisticated and diverse set of users, the
importance of guaranteeing that they provide factually correct information supported by …