Data augmentation approaches in natural language processing: A survey
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision then introduced to …
A survey on data augmentation for text classification
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …
FlipDA: Effective and robust data augmentation for few-shot learning
Most previous methods for text data augmentation are limited to simple tasks and weak
baselines. We explore data augmentation on hard tasks (i.e., few-shot natural language …
Label-specific feature augmentation for long-tailed multi-label text classification
Multi-label text classification (MLTC) involves tagging a document with its most relevant
subset of labels from a label set. In real applications, labels usually follow a long-tailed …
Data augmentation using LLMs: Data perspectives, learning paradigms and challenges
In the rapidly evolving field of machine learning (ML), data augmentation (DA) has emerged
as a pivotal technique for enhancing model performance by diversifying training examples …
TreeMix: Compositional constituency-based data augmentation for natural language understanding
Data augmentation is an effective approach to tackle over-fitting. Many previous works have
proposed different data augmentation strategies for NLP, such as noise injection, word …
Substructure substitution: Structured data augmentation for NLP
We study a family of data augmentation methods, substructure substitution (SUB2), for
natural language processing (NLP) tasks. SUB2 generates new examples by substituting …
GENIUS: Sketch-based language model pre-training via extreme and selective masking for text generation and augmentation
We introduce GENIUS: a conditional text generation model using sketches as input, which
can fill in the missing contexts for a given sketch (key information consisting of textual spans …
Learn to resolve conversational dependency: A consistency training framework for conversational question answering
One of the main challenges in conversational question answering (CQA) is to resolve the
conversational dependency, such as anaphora and ellipsis. However, existing approaches …
Asking questions like educational experts: Automatically generating question-answer pairs on real-world examination data
Generating high-quality question-answer pairs is a hard but meaningful task. Although
previous works have achieved great results on answer-aware question generation, it is …