ID and OOD performance are sometimes inversely correlated on real-world datasets

D Teney, Y Lin, SJ Oh… - Advances in Neural …, 2024 - proceedings.neurips.cc
Several studies have compared the in-distribution (ID) and out-of-distribution (OOD)
performance of models in computer vision and NLP. They report a frequent positive …

Towards LogiGLUE: A brief survey and a benchmark for analyzing logical reasoning capabilities of language models

M Luo, S Kumbhar, M Parmar, N Varshney… - arXiv preprint arXiv …, 2023 - arxiv.org
Logical reasoning is fundamental for humans yet presents a substantial challenge in the
domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and …

AI robustness: A human-centered perspective on technological challenges and opportunities

A Tocchetti, L Corti, A Balayn, M Yurrita… - ACM Computing …, 2022 - dl.acm.org
Despite the impressive performance of Artificial Intelligence (AI) systems, their robustness
remains elusive and constitutes a key issue that impedes large-scale adoption. Besides …

On the adversarial robustness of out-of-distribution generalization models

X Zou, W Liu - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
Out-of-distribution (OOD) generalization has attracted increasing research attention
in recent years, due to its promising experimental results in real-world applications …

A survey on out-of-distribution evaluation of neural NLP models

X Li, M Liu, S Gao, W Buntine - arXiv preprint arXiv:2306.15261, 2023 - arxiv.org
Adversarial robustness, domain generalization and dataset biases are three active lines of
research contributing to out-of-distribution (OOD) evaluation on neural NLP models …

Adversarial Bayesian augmentation for single-source domain generalization

S Cheng, T Gokhale, Y Yang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Generalizing to unseen image domains is a challenging problem primarily due to the lack of
diverse training data, inaccessible target data, and the large domain shift that may exist in …

Choose your QA model wisely: A systematic study of generative and extractive readers for question answering

M Luo, K Hashimoto, S Yavuz, Z Liu, C Baral… - arXiv preprint arXiv …, 2022 - arxiv.org
While both extractive and generative readers have been successfully applied to the
Question Answering (QA) task, little attention has been paid toward the systematic …

BioTABQA: Instruction learning for biomedical table question answering

M Luo, S Saxena, S Mishra, M Parmar… - arXiv preprint arXiv …, 2022 - arxiv.org
Table Question Answering (TQA) is an important but under-explored task. Most of the
existing QA datasets are in unstructured text format and only few of them use tables as the …

Evaluating human-AI collaboration: A review and methodological framework

G Fragiadakis, C Diou, G Kousiouris… - arXiv preprint arXiv …, 2024 - arxiv.org
The use of artificial intelligence (AI) in working environments with individuals, known as
Human-AI Collaboration (HAIC), has become essential in a variety of domains, boosting …

Robust source-free domain adaptation for fundus image segmentation

L Li, Y Zhou, G Yang - Proceedings of the IEEE/CVF Winter …, 2024 - openaccess.thecvf.com
Unsupervised Domain Adaptation (UDA) is a learning technique that transfers
knowledge learned in the source domain from labelled training data to the target domain …