A survey of data augmentation approaches for NLP
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
R-drop: Regularized dropout for neural networks
Dropout is a powerful and widely used technique to regularize the training of deep neural
networks. Though effective and performing well, the randomness introduced by dropout …
A scenario-generic neural machine translation data augmentation method
Amid the rapid advancement of neural machine translation, the challenge of data sparsity
has been a major obstacle. To address this issue, this study proposes a general data …
A simple but tough-to-beat data augmentation approach for natural language understanding and generation
Adversarial training has been shown effective at endowing the learned representations with
stronger generalization ability. However, it typically requires expensive computation to …
Improving neural machine translation by bidirectional training
We present a simple and effective pretraining strategy--bidirectional training (BiT) for neural
machine translation. Specifically, we bidirectionally update the model parameters at the …
Rejuvenating low-frequency words: Making the most of parallel data in non-autoregressive translation
Knowledge distillation (KD) is commonly used to construct synthetic data for training non-
autoregressive translation (NAT) models. However, there exists a discrepancy on low …
A survey on low-resource neural machine translation
Neural approaches have achieved state-of-the-art accuracy on machine translation but
suffer from the high cost of collecting large-scale parallel data. Thus, a lot of research has …
Learning to generalize to more: Continuous semantic augmentation for neural machine translation
The principal task in supervised neural machine translation (NMT) is to learn to generate
target sentences conditioned on the source inputs from a set of parallel sentence pairs, and …
To augment or not to augment? A comparative study on text augmentation techniques for low-resource NLP
GG Şahin - Computational Linguistics, 2022 - direct.mit.edu
Data-hungry deep neural networks have established themselves as the de facto standard for
many NLP tasks, including the traditional sequence tagging ones. Despite their state-of-the …
Challenges of neural machine translation for short texts
Short texts (STs) are present in a variety of scenarios, including queries, dialog, and entity names.
Most of the existing studies in neural machine translation (NMT) are focused on tackling …