Grapheme-to-phoneme models for (almost) any language

A Deri, K Knight - Proceedings of the 54th Annual Meeting of the …, 2016 - aclanthology.org
Grapheme-to-phoneme (g2p) models are rarely available in low-resource
languages, as the creation of training and evaluation data is expensive and time-consuming …

Aksharantar: Open Indic-language transliteration datasets and models for the next billion users

Y Madhani, S Parthan, P Bedekar, G Nc… - Findings of the …, 2023 - aclanthology.org
Transliteration is very important in the Indian language context due to the usage of multiple
scripts and the widespread use of romanized inputs. However, few training and evaluation …

Design challenges in named entity transliteration

Y Merhav, S Ash - arXiv preprint arXiv:1808.02563, 2018 - arxiv.org
We analyze some of the fundamental design challenges that impact the development of a
multilingual state-of-the-art named entity transliteration system, including curating bilingual …

Cross-language entity linking

P McNamee, J Mayfield, D Lawrie… - Proceedings of 5th …, 2011 - aclanthology.org
There has been substantial recent interest in aligning mentions of named entities in
unstructured texts to knowledge base descriptors, a task commonly called entity linking. This …

Crisis MT: Developing a cookbook for MT in crisis situations

W Lewis, R Munro, S Vogel - … of the Sixth Workshop on Statistical …, 2011 - aclanthology.org
In this paper, we propose that MT is an important technology in crisis events, something that
can and should be an integral part of a rapid-response infrastructure. By integrating MT …

A comprehensive analysis of bilingual lexicon induction

A Irvine, C Callison-Burch - Computational Linguistics, 2017 - direct.mit.edu
Bilingual lexicon induction is the task of inducing word translations from monolingual
corpora in two languages. In this article we present the most comprehensive analysis of …

Supervised bilingual lexicon induction with multiple monolingual signals

A Irvine, C Callison-Burch - … of the 2013 Conference of the North …, 2013 - aclanthology.org
Prior research into learning translations from source and target language monolingual texts
has treated the task as an unsupervised learning problem. Although many techniques take …

End-to-end statistical machine translation with zero or small parallel texts

A Irvine, C Callison-Burch - Natural Language Engineering, 2016 - cambridge.org
We use bilingual lexicon induction techniques, which learn translations from monolingual
texts in two languages, to build an end-to-end statistical machine translation (SMT) system …

An Arabizi-English social media statistical machine translation system

J May, Y Benjira, A Echihabi - … of the 11th Conference of the …, 2014 - aclanthology.org
We present a machine translation engine that can translate romanized Arabic, often known
as Arabizi, into English. With such a system we can, for the first time, translate the massive …

Bootstrapping transliteration with constrained discovery for low-resource languages

S Upadhyay, J Kodner, D Roth - arXiv preprint arXiv:1809.07807, 2018 - arxiv.org
Generating the English transliteration of a name written in a foreign script is an important
and challenging step in multilingual knowledge acquisition and information extraction …