Phrase-based & neural unsupervised machine translation
Machine translation systems achieve near human-level performance on some languages,
yet their effectiveness strongly relies on the availability of large amounts of parallel …
Cheap translation for cross-lingual named entity recognition
Recent work in NLP has attempted to deal with low-resource languages but still assumed a
resource level that is not present for most languages, e.g., the availability of Wikipedia in the …
Low-resource neural machine translation: Methods and trends
S Shi, X Wu, R Su, H Huang - ACM Transactions on Asian and Low …, 2022 - dl.acm.org
Neural Machine Translation (NMT) brings promising improvements in translation quality, but
until recently, these models rely on large-scale parallel corpora. As such corpora only exist …
Machine translation of low-resource spoken dialects: Strategies for normalizing Swiss German
The goal of this work is to design a machine translation (MT) system for a low-resource
family of dialects, collectively known as Swiss German, which are widely spoken in …
Flow-adapter architecture for unsupervised machine translation
In this work, we propose a flow-adapter architecture for unsupervised NMT. It leverages
normalizing flows to explicitly model the distributions of sentence-level latent …
Bilingual lexical extraction based on word alignment for improving corpus search
J Andonovski, B Šandrih, O Kitanović - The Electronic Library, 2019 - emerald.com
Purpose: This paper aims to describe the structure of an aligned Serbian-German literary
corpus (SrpNemKor) contained in a digital library Bibliša. The goal of the research was to …
Two approaches to compilation of bilingual multi-word terminology lists from lexical resources
In this paper, we present two approaches and the implemented system for bilingual
terminology extraction that rely on an aligned bilingual domain corpus, a terminology …
Learning translations via matrix completion
Bilingual Lexicon Induction is the task of learning word translations without bilingual parallel
corpora. We model this task as a matrix completion problem, and present an effective and …
The Impact of Syntactic and Semantic Proximity on Machine Translation with Back-Translation
Unsupervised on-the-fly back-translation, in conjunction with multilingual pretraining, is the
dominant method for unsupervised neural machine translation. Theoretically, however, the …
Round-trip training approach for bilingually low-resource statistical machine translation systems
Statistical Machine Translation (SMT) is making good progress in recent years.
Since SMT systems are based on data-driven approach, they learn from millions or even …