Large language models effectively leverage document-level context for literary translation, but critical errors persist

M Karpinska, M Iyyer - arXiv preprint arXiv:2304.03245, 2023 - arxiv.org
Large language models (LLMs) are competitive with the state of the art on a wide range of
sentence-level translation datasets. However, their ability to translate paragraphs and …

A survey on zero pronoun translation

L Wang, S Liu, M Xu, L Song, S Shi, Z Tu - arXiv preprint arXiv:2305.10196, 2023 - arxiv.org
Zero pronouns (ZPs) are frequently omitted in pro-drop languages (e.g., Chinese, Hungarian,
and Hindi), but should be recalled in non-pro-drop languages (e.g., English). This …

Discoscore: Evaluating text generation with bert and discourse coherence

W Zhao, M Strube, S Eger - arXiv preprint arXiv:2201.11176, 2022 - arxiv.org
Recently, there has been a growing interest in designing text generation systems from a
discourse coherence perspective, e.g., modeling the interdependence between sentences …

Measuring and increasing context usage in context-aware machine translation

P Fernandes, K Yin, G Neubig, AFT Martins - arXiv preprint arXiv …, 2021 - arxiv.org
Recent work in neural machine translation has demonstrated both the necessity and
feasibility of using inter-sentential context, i.e., context from sentences other than those currently …

Investigating the translation performance of a large multilingual language model: the case of bloom

R Bawden, F Yvon - arXiv preprint arXiv:2303.01911, 2023 - arxiv.org
The NLP community recently saw the release of a new large open-access multilingual
language model, BLOOM (BigScience et al., 2022), covering 46 languages. We focus on …

Do context-aware translation models pay the right attention?

K Yin, P Fernandes, D Pruthi, A Chaudhary… - arXiv preprint arXiv …, 2021 - arxiv.org
Context-aware machine translation models are designed to leverage contextual information,
but often fail to do so. As a result, they inaccurately disambiguate pronouns and polysemous …

Findings of the WMT 2020 shared task on chat translation

MA Farajian, AV Lopes, AFT Martins… - Proceedings of the …, 2020 - aclanthology.org
We report the results of the first edition of the WMT shared task on chat translation. The task
consisted of translating bilingual conversational text, in particular customer support chats for …

Embarrassingly easy document-level MT metrics: How to convert any pretrained metric into a document-level metric

G Vernikos, B Thompson, P Mathur… - arXiv preprint arXiv …, 2022 - arxiv.org
We hypothesize that existing sentence-level machine translation (MT) metrics become less
effective when the human reference contains ambiguities. To verify this hypothesis, we …

Quantifying the plausibility of context reliance in neural machine translation

G Sarti, G Chrupała, M Nissim, A Bisazza - arXiv preprint arXiv …, 2023 - arxiv.org
Establishing whether language models can use contextual information in a human-plausible
way is important to ensure their safe adoption in real-world settings. However, the questions …

Document-level language models for machine translation

F Petrick, C Herold, P Petrushkov, S Khadivi… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite the known limitations, most machine translation systems today still operate at the
sentence level. One reason for this is that most parallel training data is only sentence-level …