Finding order in chaos: A novel data augmentation method for time series in contrastive learning

BU Demirel, C Holz - Advances in Neural Information …, 2024 - proceedings.neurips.cc
The success of contrastive learning is well known to depend on data augmentation.
Although the degree of data augmentation has been well controlled by utilizing pre-defined …
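
The snippet cuts off at the controlling mechanism, but a "pre-defined degree" of augmentation for time series usually means a fixed magnitude parameter, as in jittering. A minimal sketch (the function name and default sigma are illustrative, not the paper's method):

```python
import numpy as np

def jitter(x, sigma=0.1, rng=None):
    """Time-series jittering: add Gaussian noise to a signal.
    The 'degree' of the augmentation is fixed in advance via sigma."""
    rng = rng or np.random.default_rng()
    return x + rng.normal(0.0, sigma, size=x.shape)
```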

A survey of mix-based data augmentation: Taxonomy, methods, applications, and explainability

C Cao, F Zhou, Y Dai, J Wang, K Zhang - arXiv preprint arXiv:2212.10888, 2022 - arxiv.org
Data augmentation (DA) is indispensable in modern machine learning and deep neural
networks. The basic idea of DA is to construct new training data to improve the model's …
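
The "basic idea" that the snippet alludes to is, for mix-based DA, the canonical mixup operation: new training examples are convex combinations of existing ones. A minimal NumPy sketch (names and the Beta parameter are illustrative):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix two examples and their one-hot labels with a Beta-sampled weight."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```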

SSVMR: Saliency-based self-training for video-music retrieval

X Cheng, Z Zhu, H Li, Y Li, Y Zou - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
With the rise of short videos, the demand for selecting appropriate background music (BGM)
for a video has increased significantly, and the video-music retrieval (VMR) task gradually draws …

On the calibration of pre-trained language models using mixup guided by area under the margin and saliency

SY Park, C Caragea - arXiv preprint arXiv:2203.07559, 2022 - arxiv.org
A well-calibrated neural model produces confidence estimates (probability outputs) that
closely approximate the expected accuracy. While prior studies have shown that mixup training …
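
Calibration as defined in the snippet is commonly measured with the expected calibration error (ECE). A sketch of that standard metric follows (the bin count is illustrative, and this is the generic metric, not the paper's AUM- and saliency-guided mixup itself):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: size-weighted average gap between accuracy and mean confidence
    over equal-width confidence bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```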

On the domain adaptation and generalization of pretrained language models: A survey

X Guo, H Yu - arXiv preprint arXiv:2211.03154, 2022 - arxiv.org
Recent advances in NLP have been driven by a range of large-scale pretrained language models
(PLMs). These PLMs have brought significant performance gains for a range of NLP tasks …

TreeMix: Compositional constituency-based data augmentation for natural language understanding

L Zhang, Z Yang, D Yang - arXiv preprint arXiv:2205.06153, 2022 - arxiv.org
Data augmentation is an effective approach to tackling overfitting. Many previous works have
proposed different data augmentation strategies for NLP, such as noise injection, word …
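
A rough sketch of the compositional idea behind the title: swap a span between two sentences and mix the labels in proportion to the inserted span's share. Span selection from constituency parse trees is omitted, so this is only an approximation of TreeMix, not its exact algorithm:

```python
def span_swap_mix(tokens_a, label_a, tokens_b, label_b, span_a, span_b):
    """Replace span [i, j) of sentence A with span [k, l) of sentence B,
    mixing soft labels by the fraction of tokens contributed by B."""
    (i, j), (k, l) = span_a, span_b
    mixed = tokens_a[:i] + tokens_b[k:l] + tokens_a[j:]
    lam = (l - k) / len(mixed)            # B's share of the mixed sentence
    label = [(1 - lam) * a + lam * b for a, b in zip(label_a, label_b)]
    return mixed, label
```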

Improving the sample efficiency of prompt tuning with domain adaptation

X Guo, B Li, H Yu - arXiv preprint arXiv:2210.02952, 2022 - arxiv.org
Prompt tuning, or the conditioning of a frozen pretrained language model (PLM) with soft
prompts learned from data, has demonstrated impressive performance on a wide range of …
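
The mechanism the snippet describes, prepending learned "soft" prompt vectors to a frozen PLM, can be sketched as below, assuming a Hugging Face-style model that accepts inputs_embeds. The class name and hyperparameters are illustrative, and this is vanilla prompt tuning rather than the paper's domain-adaptation variant:

```python
import torch
import torch.nn as nn

class SoftPromptModel(nn.Module):
    """Condition a frozen PLM on trainable prompt embeddings."""
    def __init__(self, plm, n_prompt_tokens=20):
        super().__init__()
        self.plm = plm
        for p in self.plm.parameters():   # freeze the backbone
            p.requires_grad = False
        dim = plm.get_input_embeddings().embedding_dim
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, dim) * 0.02)

    def forward(self, input_ids):
        tok = self.plm.get_input_embeddings()(input_ids)        # (B, T, D)
        prompt = self.prompt.expand(input_ids.size(0), -1, -1)  # (B, P, D)
        # Note: the attention mask would also need P extra positions;
        # omitted here for brevity.
        return self.plm(inputs_embeds=torch.cat([prompt, tok], dim=1))
```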

DoubleMix: Simple interpolation-based data augmentation for text classification

H Chen, W Han, D Yang, S Poria - arXiv preprint arXiv:2209.05297, 2022 - arxiv.org
This paper proposes a simple yet effective interpolation-based data augmentation approach,
termed DoubleMix, to improve the robustness of models in text classification. DoubleMix first …
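
A generic hidden-space interpolation sketch of the kind DoubleMix builds on: mix an example's hidden state with that of a perturbed version of itself, leaving the label unchanged. The mixing distribution and names are illustrative, not the paper's exact procedure:

```python
import torch

def hidden_mix(h_orig, h_perturbed, concentration=2.0):
    """Interpolate hidden states of an example and a perturbed copy of it;
    the label stays fixed since both states come from the same example."""
    lam = torch.distributions.Beta(concentration, concentration).sample()
    return lam * h_orig + (1 - lam) * h_perturbed
```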

Geodesic multi-modal mixup for robust fine-tuning

C Oh, J So, H Byun, YT Lim, M Shin… - Advances in Neural …, 2024 - proceedings.neurips.cc
Pre-trained multi-modal models, such as CLIP, provide transferable embeddings and show
promising results in diverse applications. However, the analysis of learned multi-modal …
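
The "geodesic" in the title points at interpolating on the unit hypersphere where embeddings such as CLIP's live. The standard spherical linear interpolation (slerp) below conveys that idea as a generic illustration, not necessarily the paper's exact mixup:

```python
import torch
import torch.nn.functional as F

def slerp(u, v, lam, eps=1e-7):
    """Spherical linear interpolation between L2-normalized embeddings:
    the mixture stays on the unit sphere, unlike straight-line mixup."""
    u, v = F.normalize(u, dim=-1), F.normalize(v, dim=-1)
    theta = torch.acos((u * v).sum(-1, keepdim=True).clamp(-1 + eps, 1 - eps))
    return (torch.sin((1 - lam) * theta) * u
            + torch.sin(lam * theta) * v) / torch.sin(theta)
```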

Data augmentation for conversational AI

H Soudani, E Kanoulas, F Hasibi - Proceedings of the 32nd ACM …, 2023 - dl.acm.org
Advancements in conversational systems have revolutionized information access,
surpassing the limitations of single queries. However, developing dialogue systems requires …