A survey on evaluation of large language models
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …
A comprehensive survey on test-time adaptation under distribution shifts
Abstract Machine learning methods strive to acquire a robust model during the training
process that can effectively generalize to test samples, even in the presence of distribution …
Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging
Abstract Machine-learning models for medical tasks can match or surpass the performance
of clinical experts. However, in settings differing from those of the training dataset, the …
Towards out-of-distribution generalization: A survey
Traditional machine learning paradigms are based on the assumption that both training and
test data follow the same statistical pattern, which is mathematically referred to as …
Teaching models to express their uncertainty in words
We show that a GPT-3 model can learn to express uncertainty about its own answers in
natural language -- without use of model logits. When given a question, the model generates …
Exact feature distribution matching for arbitrary style transfer and domain generalization
Arbitrary style transfer (AST) and domain generalization (DG) are important yet challenging
visual learning tasks, which can be cast as a feature distribution matching problem. With the …
Robust test-time adaptation in dynamic scenarios
Test-time adaptation (TTA) intends to adapt the pretrained model to test distributions with
only unlabeled test data streams. Most of the previous TTA methods have achieved great …
Single-source domain expansion network for cross-scene hyperspectral image classification
Currently, cross-scene hyperspectral image (HSI) classification has drawn increasing
attention. It is necessary to train a model only on the source domain (SD) and directly …
Self-supervised learning for videos: A survey
The remarkable success of deep learning in various domains relies on the availability of
large-scale annotated datasets. However, obtaining annotations is expensive and requires …
Trustworthy AI: From principles to practices
The rapid development of Artificial Intelligence (AI) technology has enabled the deployment
of systems built on it. However, many current AI systems are found to be vulnerable to …