Neural machine translation for low-resource languages: A survey
S Ranathunga, ESA Lee, M Prifti Skenduli… - ACM Computing …, 2023 - dl.acm.org
Neural Machine Translation (NMT) has seen tremendous growth in the last ten years since
the early 2000s and has already entered a mature phase. While considered the most widely …
A survey of data augmentation approaches for NLP
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
Data augmentation techniques in natural language processing
LFAO Pellicer, TM Ferreira, AHR Costa - Applied Soft Computing, 2023 - Elsevier
Data Augmentation (DA) methods–a family of techniques designed for synthetic generation
of training data–have shown remarkable results in various Deep Learning and Machine …
Progressive transformers for end-to-end sign language production
The goal of automatic Sign Language Production (SLP) is to translate spoken language to a
continuous stream of sign language video at a level comparable to a human translator. If this …
A multilingual parallel corpora collection effort for Indian languages
We present sentence-aligned parallel corpora across 10 Indian languages (Hindi, Telugu,
Tamil, Malayalam, Gujarati, Urdu, Bengali, Oriya, Marathi, Punjabi) and English, many of …
Diving deep into context-aware neural machine translation
Context-aware neural machine translation (NMT) is a promising direction to improve the
translation quality by making use of additional context, e.g., document-level translation, or …
Product answer generation from heterogeneous sources: A new benchmark and best practices
It is of great value to answer product questions based on heterogeneous information
sources available on web product pages, e.g., semi-structured attributes, text descriptions …
Best practices and lessons learned on synthetic data for language models
The success of AI models relies on the availability of large, diverse, and high-quality
datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and …
A survey of orthographic information in machine translation
Machine translation is one of the applications of natural language processing which
has been explored in different languages. Recently researchers started paying attention …
Rethinking label smoothing on multi-hop question answering
Multi-Hop Question Answering (MHQA) is a significant area in question answering,
requiring multiple reasoning components, including document retrieval, supporting sentence …