Phrase-based & neural unsupervised machine translation

G Lample, M Ott, A Conneau, L Denoyer… - arXiv preprint arXiv …, 2018 - arxiv.org
Machine translation systems achieve near human-level performance on some languages,
yet their effectiveness strongly relies on the availability of large amounts of parallel …

An Audit on the Perspectives and Challenges of Hallucinations in NLP

PN Venkit, T Chakravorti, V Gupta… - Proceedings of the …, 2024 - aclanthology.org
We audit how hallucination in large language models (LLMs) is characterized in peer-
reviewed literature, using a critical examination of 103 publications across NLP research …

Low-resource neural machine translation: Methods and trends

S Shi, X Wu, R Su, H Huang - ACM Transactions on Asian and Low …, 2022 - dl.acm.org
Neural Machine Translation (NMT) brings promising improvements in translation quality, but
until recently, these models rely on large-scale parallel corpora. As such corpora only exist …

"Confidently Nonsensical?": A Critical Survey on the Perspectives and Challenges of 'Hallucinations' in NLP

PN Venkit, T Chakravorti, V Gupta, H Biggs… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate how hallucination in large language models (LLM) is characterized in peer-
reviewed literature using a critical examination of 103 publications across NLP research …

What's in a domain? Analyzing genre and topic differences in statistical machine translation

M Van der Wees, A Bisazza, W Weerkamp… - Proceedings of the 53rd …, 2015 - research.rug.nl
Domain adaptation is an active field of research in statistical machine translation
(SMT), but so far most work has ignored the distinction between the topic and genre of …

Cross-model back-translated distillation for unsupervised machine translation

XP Nguyen, S Joty, TT Nguyen… - … on Machine Learning, 2021 - proceedings.mlr.press
Recent unsupervised machine translation (UMT) systems usually employ three main
principles: initialization, language modeling and iterative back-translation, though they may …

Improving statistical machine translation with a multilingual paraphrase database

RM Seraj, M Siahbani, A Sarkar - Proceedings of the 2015 …, 2015 - aclanthology.org
The multilingual Paraphrase Database (PPDB) is a freely available automatically
created resource of paraphrases in multiple languages. In statistical machine translation …

Japanese news simplification: task design, data set construction, and analysis of simplified text

I Goto, H Tanaka, T Kumano - Proceedings of Machine …, 2015 - aclanthology.org
In this paper we explore a Japanese news simplification task. We designed a Japanese
news simplification task, constructed the data set for the task, and analyzed the manual …

Five shades of noise: Analyzing machine translation errors in user-generated text

M Van der Wees, A Bisazza, C Monz - Workshop on Noisy User …, 2015 - research.rug.nl
It is widely accepted that translating user-generated (UG) text is a difficult task for modern
statistical machine translation (SMT) systems. The translation quality metrics typically used …

Measuring and mitigating hallucinations in large language models: a multifaceted approach

X Amatriain - 2024 - amatria.in
The advent of Large Language Models (LLMs) has ushered in a new era of
possibilities in artificial intelligence, yet it has also introduced the challenge of hallucinations …