Machine learning methods for small data challenges in molecular science
B Dou, Z Zhu, E Merkurjev, L Ke, L Chen… - Chemical …, 2023 - ACS Publications
Small data are often used in scientific and engineering research due to the presence of
various constraints, such as time, cost, ethics, privacy, security, and technical limitations in …
Data augmentation approaches in natural language processing: A survey
As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where
deep learning techniques may fail. It is widely applied in computer vision then introduced to …
A survey of data augmentation approaches for NLP
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
A survey of active learning for natural language processing
In this work, we provide a survey of active learning (AL) for its applications in natural
language processing (NLP). In addition to a fine-grained categorization of query strategies …
Interactive natural language processing
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within
the field of NLP, aimed at addressing limitations in existing frameworks while aligning with …
Data expansion using back translation and paraphrasing for hate speech detection
With proliferation of user generated contents in social media platforms, establishing
mechanisms to automatically identify toxic and abusive content becomes a prime concern …
Improving short text classification with augmented data using GPT-3
GPT-3 is a large-scale natural language model developed by OpenAI that can perform many
different tasks, including topic classification. Although researchers claim that it requires only …
Generative pre-trained transformer (GPT) in research: A systematic review on data augmentation
F Sufi - Information, 2024 - mdpi.com
GPT (Generative Pre-trained Transformer) represents advanced language models that have
significantly reshaped the academic writing landscape. These sophisticated language …
Mask-then-fill: A flexible and effective data augmentation framework for event extraction
We present Mask-then-Fill, a flexible and effective data augmentation framework for event
extraction. Our approach allows for more flexible manipulation of text and thus can generate …
Text autoaugment: Learning compositional augmentation policy for text classification
Data augmentation aims to enrich training samples for alleviating the overfitting issue in low-
resource or class-imbalanced situations. Traditional methods first devise task-specific …