Data augmentation approaches in natural language processing: A survey

B Li, Y Hou, W Che - AI Open, 2022 - Elsevier
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision and was then introduced to …

A survey on data augmentation for text classification

M Bayer, MA Kaufhold, C Reuter - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation, the artificial creation of training data for machine learning by
transformations, is a widely studied research field across machine learning disciplines …

A review of semi-supervised learning for text classification

JM Duarte, L Berton - Artificial Intelligence Review, 2023 - Springer
A huge amount of data is generated daily leading to big data challenges. One of them is
related to text mining, especially text classification. To perform this task we usually need a …

A survey of mix-based data augmentation: Taxonomy, methods, applications, and explainability

C Cao, F Zhou, Y Dai, J Wang, K Zhang - ACM Computing Surveys, 2022 - dl.acm.org
Data augmentation (DA) is indispensable in modern machine learning and deep neural
networks. The basic idea of DA is to construct new training data to improve the model's …

Generative pre-trained transformer (GPT) in research: A systematic review on data augmentation

F Sufi - Information, 2024 - mdpi.com
GPT (Generative Pre-trained Transformer) represents advanced language models that have
significantly reshaped the academic writing landscape. These sophisticated language …

Toward text data augmentation for sentiment analysis

HQ Abonizio, EC Paraiso… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
A significant part of natural language processing (NLP) techniques for sentiment analysis is
based on supervised methods, which are affected by the quality of data. Therefore …

Is ChatGPT the ultimate Data Augmentation Algorithm?

F Piedboeuf, P Langlais - Findings of the Association for …, 2023 - aclanthology.org
In the aftermath of GPT-3.5, commonly known as ChatGPT, researchers have attempted to
assess its capacity for lowering annotation cost, either by doing zero-shot learning …

Question classification using limited labelled data

C Mallikarjuna, S Sivanesan - Information Processing & Management, 2022 - Elsevier
Question classification (QC) involves classifying a given question based on the expected
answer type and is an important task in the Question Answering (QA) system. Existing …

Multi-level fine-tuning, data augmentation, and few-shot learning for specialized cyber threat intelligence

M Bayer, T Frey, C Reuter - Computers & Security, 2023 - Elsevier
Gathering cyber threat intelligence from open sources is becoming increasingly important for
maintaining and achieving a high level of security as systems become larger and more …

SentiGEN: Synthetic Data Generator for Sentiment Analysis

P Sundarreson… - Journal of Computing …, 2024 - publikasi2.dinus.ac.id
Obtaining high-quality, diverse, accurate datasets for sentiment analysis has always been a
significant challenge. Traditional approaches include annotators, which may introduce bias …