DataComp: In search of the next generation of multimodal datasets

SY Gadre, G Ilharco, A Fang… - Advances in …, 2024 - proceedings.neurips.cc
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable
Diffusion and GPT-4, yet their design does not receive the same research attention as model …

On feature learning in the presence of spurious correlations

P Izmailov, P Kirichenko, N Gruver… - Advances in Neural …, 2022 - proceedings.neurips.cc
Deep classifiers are known to rely on spurious features—patterns which are correlated with
the target on the training data but not inherently relevant to the learning problem, such as the …

Connect, not collapse: Explaining contrastive learning for unsupervised domain adaptation

K Shen, RM Jones, A Kumar, SM Xie… - International …, 2022 - proceedings.mlr.press
We consider unsupervised domain adaptation (UDA), where labeled data from a source
domain (e.g., photos) and unlabeled data from a target domain (e.g., sketches) are used to …

Wild-Time: A benchmark of in-the-wild distribution shift over time

H Yao, C Choi, B Cao, Y Lee… - Advances in Neural …, 2022 - proceedings.neurips.cc
Distribution shifts occur when the test distribution differs from the training distribution, and
can considerably degrade performance of machine learning models deployed in the real …

Artificial intelligence for science in quantum, atomistic, and continuum systems

X Zhang, L Wang, J Helwig, Y Luo, C Fu, Y Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural
sciences. Today, AI has started to advance natural sciences by improving, accelerating, and …

Make the U in UDA matter: Invariant consistency learning for unsupervised domain adaptation

Z Yue, Q Sun, H Zhang - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Domain Adaptation (DA) is always challenged by the spurious correlation between
the domain-invariant features (e.g., class identity) and the domain-specific ones (e.g., …

A broad study of pre-training for domain generalization and adaptation

D Kim, K Wang, S Sclaroff, K Saenko - European Conference on Computer …, 2022 - Springer
Deep models must learn robust and transferable representations in order to perform well on
new domains. While domain transfer methods (e.g., domain adaptation, domain …

DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations

Y Ji, L Zhang, J Wu, B Wu, LK Huang, T Xu… - arXiv preprint arXiv …, 2022 - arxiv.org
AI-aided drug discovery (AIDD) is gaining increasing popularity due to its promise of making
the search for new pharmaceuticals quicker, cheaper and more efficient. In spite of its …

Domain adaptation under open set label shift

S Garg, S Balakrishnan… - Advances in Neural …, 2022 - proceedings.neurips.cc
We introduce the problem of domain adaptation under Open Set Label Shift (OSLS), where
the label distribution can change arbitrarily and a new class may arrive during deployment …

Towards federated foundation models: Scalable dataset pipelines for group-structured learning

Z Charles, N Mitchell, K Pillutla… - Advances in Neural …, 2024 - proceedings.neurips.cc
We introduce Dataset Grouper, a library to create large-scale group-structured (e.g.,
federated) datasets, enabling federated learning simulation at the scale of foundation …