Analysis of community question‐answering issues via machine learning and deep learning: State‐of‐the‐art review

PK Roy, S Saumya, JP Singh… - CAAI Transactions on …, 2023 - Wiley Online Library
Over the last couple of decades, community question‐answering sites (CQAs) have been a
topic of much academic interest. Scholars have often leveraged traditional machine learning …

MTEB: Massive text embedding benchmark

N Muennighoff, N Tazi, L Magne, N Reimers - arXiv preprint arXiv …, 2022 - arxiv.org
Text embeddings are commonly evaluated on a small set of datasets from a single task not
covering their possible applications to other tasks. It is unclear whether state-of-the-art …

A holistic approach to undesired content detection in the real world

T Markov, C Zhang, S Agarwal, FE Nekoul… - Proceedings of the …, 2023 - ojs.aaai.org
We present a holistic approach to building a robust and useful natural language
classification system for real-world content moderation. The success of such a system relies …

Neural unsupervised domain adaptation in NLP---a survey

A Ramponi, B Plank - arXiv preprint arXiv:2006.00632, 2020 - arxiv.org
Deep neural networks excel at learning from labeled data and achieve state-of-the-art
results on a wide array of Natural Language Processing tasks. In contrast, learning from …

Augmented SBERT: Data augmentation method for improving bi-encoders for pairwise sentence scoring tasks

N Thakur, N Reimers, J Daxenberger… - arXiv preprint arXiv …, 2020 - arxiv.org
There are two approaches for pairwise sentence scoring: Cross-encoders, which perform full-
attention over the input pair, and Bi-encoders, which map each input independently to a …

Interact before align: Leveraging cross-modal knowledge for domain adaptive action recognition

L Yang, Y Huang, Y Sugano… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Unsupervised domain adaptive video action recognition aims to recognize actions of a
target domain using a model trained with only out-of-domain (source) annotations. The …

We need to talk about random splits

A Søgaard, S Ebert, J Bastings, K Filippova - arXiv preprint arXiv …, 2020 - arxiv.org
Gorman and Bedrick (2019) argued for using random splits rather than standard splits in
NLP experiments. We argue that random splits, like standard splits, lead to overly optimistic …

Quantum transfer learning for acceptability judgements

G Buonaiuto, R Guarasci, A Minutolo… - Quantum Machine …, 2024 - Springer
Hybrid quantum-classical classifiers promise to positively impact critical aspects of natural
language processing tasks, particularly classification-related ones. Among the possibilities …

UDALM: Unsupervised domain adaptation through language modeling

C Karouzos, G Paraskevopoulos… - arXiv preprint arXiv …, 2021 - arxiv.org
In this work we explore Unsupervised Domain Adaptation (UDA) of pretrained language
models for downstream tasks. We introduce UDALM, a fine-tuning procedure, using a mixed …

Robust zero-shot cross-domain slot filling with example values

DJ Shah, R Gupta, AA Fayazi… - arXiv preprint arXiv …, 2019 - arxiv.org
Task-oriented dialog systems increasingly rely on deep learning-based slot filling models,
usually needing extensive labeled training data for target domains. Often, however, little to …