Correcting length bias in neural machine translation

J Li, T Tang, WX Zhao, JY Nie, JR Wen - ACM Computing Surveys, 2024 - dl.acm.org

Text Generation aims to produce plausible and readable text in human language from input
data. The resurgence of deep learning has greatly advanced this field, in particular, with the …

Cited by 262 (7 versions)

Neural machine translation: A review

F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org

The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …

Cited by 347 (7 versions)

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 1899 (4 versions)

Exploring length generalization in large language models

C Anil, Y Wu, A Andreassen… - Advances in …, 2022 - proceedings.neurips.cc

The ability to extrapolate from short problem instances to longer ones is an important form of
out-of-distribution generalization in reasoning tasks, and is crucial when learning from …

Cited by 137 (6 versions)

Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation

L Kuhn, Y Gal, S Farquhar - arXiv preprint arXiv:2302.09664, 2023 - arxiv.org

We introduce a method to measure uncertainty in large language models. For tasks like
question answering, it is essential to know when we can trust the natural language outputs …

Cited by 181 (5 versions)

Robots that ask for help: Uncertainty alignment for large language model planners

AZ Ren, A Dixit, A Bodrova, S Singh, S Tu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) exhibit a wide range of promising capabilities--from
step-by-step planning to commonsense reasoning--that may provide utility for robots, but remain …

Cited by 111 (7 versions)

How can we know what language models know?

Z Jiang, FF Xu, J Araki, G Neubig - Transactions of the Association for …, 2020 - direct.mit.edu

Recent work has presented intriguing results examining the knowledge contained in
language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a …

Cited by 1186 (17 versions)

How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering

Z Jiang, J Araki, H Ding, G Neubig - Transactions of the Association …, 2021 - direct.mit.edu

Recent works have shown that language models (LM) capture different types of knowledge
regarding facts or common sense. However, because no model is perfect, they still fail to …

Cited by 267 (12 versions)

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 80 (6 versions)

Masked language model scoring

J Salazar, D Liang, TQ Nguyen, K Kirchhoff - arXiv preprint arXiv …, 2019 - arxiv.org

Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead,
we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are …

Cited by 469 (7 versions)