A survey of controllable text generation using transformer-based pre-trained language models
Controllable Text Generation (CTG) is an emerging area in the field of natural language
generation (NLG). It is regarded as crucial for the development of advanced text generation …
Data augmentation approaches in natural language processing: A survey
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision then introduced to …
A survey of data augmentation approaches for NLP
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
A survey on data augmentation for text classification
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …
FUDGE: Controlled text generation with future discriminators
K Yang, D Klein - arXiv preprint arXiv:2104.05218, 2021 - arxiv.org
We propose Future Discriminators for Generation (FUDGE), a flexible and modular method
for controlled text generation. Given a pre-existing model G for generating text from a …
AEDA: an easier data augmentation technique for text classification
This paper proposes AEDA (An Easier Data Augmentation) technique to help improve the
performance on text classification tasks. AEDA includes only random insertion of …
Increasing diversity while maintaining accuracy: Text data generation with large language models and human interventions
Large language models (LLMs) can be used to generate text data for training and evaluating
other models. However, creating high-quality datasets with LLMs can be challenging. In this …
Data augmentation techniques in natural language processing
LFAO Pellicer, TM Ferreira, AHR Costa - Applied Soft Computing, 2023 - Elsevier
Data Augmentation (DA) methods–a family of techniques designed for synthetic generation
of training data–have shown remarkable results in various Deep Learning and Machine …
Mitigating political bias in language models through reinforced calibration
Current large-scale language models can be politically biased as a result of the data they
are trained on, potentially causing serious problems when they are deployed in real-world …
Improving short text classification with augmented data using GPT-3
GPT-3 is a large-scale natural language model developed by OpenAI that can perform many
different tasks, including topic classification. Although researchers claim that it requires only …