A systematic review on affective computing: Emotion models, databases, and recent advances

Y Wang, W Song, W Tao, A Liotta, D Yang, X Li, S Gao… - Information …, 2022 - Elsevier
Affective computing conjoins the research topics of emotion recognition and sentiment
analysis, and can be realized with unimodal or multimodal data, consisting primarily of …

Emotion recognition and artificial intelligence: A systematic review (2014–2023) and research recommendations

SK Khare, V Blanes-Vidal, ES Nadimi, UR Acharya - Information Fusion, 2023 - Elsevier
Emotion recognition is the ability to precisely infer human emotions from numerous sources
and modalities using questionnaires, physical signals, and physiological signals. Recently …

EVA-CLIP: Improved training techniques for CLIP at scale

Q Sun, Y Fang, L Wu, X Wang, Y Cao - arXiv preprint arXiv:2303.15389, 2023 - arxiv.org
Contrastive language-image pre-training, CLIP for short, has gained increasing attention for
its potential in various scenarios. In this paper, we propose EVA-CLIP, a series of models …

Vision-language models for vision tasks: A survey

J Zhang, J Huang, S Jin, S Lu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Most visual recognition studies rely heavily on crowd-labelled data for deep neural network
(DNN) training, and they usually train a DNN for each single visual recognition task …

EVA-02: A visual representation for neon genesis

Y Fang, Q Sun, X Wang, T Huang, X Wang… - Image and Vision …, 2024 - Elsevier
We launch EVA-02, a next-generation Transformer-based visual representation pre-trained
to reconstruct strong and robust language-aligned vision features via masked image …

SLIP: Self-supervision meets language-image pre-training

N Mu, A Kirillov, D Wagner, S Xie - European conference on computer …, 2022 - Springer
Recent work has shown that self-supervised pre-training leads to improvements over
supervised learning on challenging visual recognition tasks. CLIP, an exciting new …

InternVL: Scaling up vision foundation models and aligning for generic visual-linguistic tasks

Z Chen, J Wu, W Wang, W Su, G Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
The exponential growth of large language models (LLMs) has opened up numerous
possibilities for multi-modal AGI systems. However, the progress in vision and vision …

Learning transferable visual models from natural language supervision

A Radford, JW Kim, C Hallacy… - International …, 2021 - proceedings.mlr.press
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined
object categories. This restricted form of supervision limits their generality and usability since …

Federated learning for secure IoMT-applications in smart healthcare systems: A comprehensive review

S Rani, A Kataria, S Kumar, P Tiwari - Knowledge-based systems, 2023 - Elsevier
Recent developments in the Internet of Things (IoT) and various communication
technologies have reshaped numerous application areas. Nowadays, IoT is assimilated into …

Learn from all: Erasing attention consistency for noisy label facial expression recognition

Y Zhang, C Wang, X Ling, W Deng - European Conference on Computer …, 2022 - Springer
Noisy label Facial Expression Recognition (FER) is more challenging than
traditional noisy label classification tasks due to the inter-class similarity and the annotation …