Cross-domain facial expression recognition: A unified evaluation benchmark and adversarial...

A review of multimodal emotion recognition from datasets, preprocessing, features, and fusion methods

B Pan, K Hirota, Z Jia, Y Dai - Neurocomputing, 2023 - Elsevier

Affective computing is one of the most important research fields in modern human–computer
interaction (HCI). The goal of affective computing is to study and develop the theories …

Cited by 17 (all 2 versions)

TransZero: Attribute-guided transformer for zero-shot learning

S Chen, Z Hong, Y Liu, GS Xie, B Sun, H Li… - Proceedings of the …, 2022 - ojs.aaai.org

Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic
knowledge from seen classes to unseen ones. Semantic knowledge is learned from attribute …

Cited by 112 (all 12 versions)

Heterogeneous semantic transfer for multi-label recognition with partial labels

T Chen, T Pu, L Liu, Y Shi, Z Yang, L Lin - International Journal of …, 2024 - Springer

Multi-label image recognition with partial labels (MLR-PL), in which some labels are known
while others are unknown for each image, may greatly reduce the cost of annotation and …

Cited by 50 (all 9 versions)

Semantic-aware representation blending for multi-label image recognition with partial labels

T Pu, T Chen, H Wu, L Lin - Proceedings of the AAAI conference on …, 2022 - ojs.aaai.org

Training the multi-label image recognition models with partial labels, in which merely some
labels are known while others are unknown for each image, is a considerably challenging …

Cited by 42 (all 5 versions)

MAGIC: Multimodal relational graph adversarial inference for diverse and unpaired text-based image captioning

W Zhang, H Shi, J Guo, S Zhang, Q Cai, J Li… - Proceedings of the …, 2022 - ojs.aaai.org

Text-based image captioning (TextCap) requires simultaneous comprehension of visual
content and reading the text of images to generate a natural language description. Although …

Cited by 33 (all 7 versions)

Understanding self-attention mechanism via dynamical system perspective

Z Huang, M Liang, J Qin, S Zhong… - Proceedings of the …, 2023 - openaccess.thecvf.com

The self-attention mechanism (SAM) is widely used in various fields of artificial intelligence
and has successfully boosted the performance of different models. However, current …

Cited by 5 (all 5 versions)

FG-AGR: Fine-grained associative graph representation for facial expression recognition in the wild

C Li, X Li, X Wang, D Huang, Z Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Facial expression recognition (FER) in the wild is challenging due to various unconstrained
conditions, i.e., occlusions and head pose variations. Previous methods tend to improve the …

Cited by 21 (all 2 versions)

Spatial-temporal knowledge-embedded transformer for video scene graph generation

T Pu, T Chen, H Wu, Y Lu, L Lin - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org

Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer
their relationships for a given video. It requires not only a comprehensive understanding of …

Cited by 8 (all 7 versions)

Multi-stage spatio-temporal aggregation transformer for video person re-identification

Z Tang, R Zhang, Z Peng, J Chen… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

In recent years, the Transformer architecture has shown its superiority in the video-based
person re-identification task. Inspired by video representation learning, these methods …

Cited by 16 (all 4 versions)

RestoreFormer++: Towards real-world blind face restoration from undegraded key-value pairs

Z Wang, J Zhang, T Chen, W Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Blind face restoration aims at recovering high-quality face images from those with unknown
degradations. Current algorithms mainly introduce priors to complement high-quality details …

Cited by 6 (all 8 versions)