A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is …
A comprehensive survey of image augmentation techniques for deep learning
Although deep learning has achieved satisfactory performance in computer vision, a large
volume of images is required. However, collecting images is often expensive and …
DINOv2: Learning robust visual features without supervision
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …
AdaptFormer: Adapting Vision Transformers for scalable visual recognition
Pretraining Vision Transformers (ViTs) has achieved great success in visual
recognition. A natural follow-up scenario is to adapt a ViT to various image and video recognition …
A survey on deep learning tools dealing with data scarcity: definitions, challenges, solutions, tips, and applications
Data scarcity is a major challenge when training deep learning (DL) models. DL demands a
large amount of data to achieve exceptional performance. Unfortunately, many applications …
A foundation model for generalizable disease detection from retinal images
Medical artificial intelligence (AI) offers great potential for recognizing signs of health
conditions in retinal images and expediting the diagnosis of eye diseases and systemic …
Prompt, generate, then cache: Cascade of foundation models makes strong few-shot learners
Visual recognition in low-data regimes requires deep neural networks to learn generalized
representations from limited training samples. Recently, CLIP-based methods have shown …
Transformer-based unsupervised contrastive learning for histopathological image classification
A large-scale and well-annotated dataset is a key factor for the success of deep learning in
medical image analysis. However, assembling such large-scale annotations is very challenging …
Sequential modeling enables scalable learning for large vision models
We introduce a novel sequential modeling approach which enables learning a Large Vision
Model (LVM) without making use of any linguistic data. To do this we define a common …
Cut and learn for unsupervised object detection and instance segmentation
We propose Cut-and-LEaRn (CutLER), a simple approach for training
unsupervised object detection and segmentation models. We leverage the property of self …