Rethinking the learning paradigm for dynamic facial expression recognition
Abstract Dynamic Facial Expression Recognition (DFER) is a rapidly developing field that
focuses on recognizing facial expressions in video format. Previous research has …
Jack of All Tasks Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model
The ability of large language models (LLMs) to process visual inputs has given rise to
general-purpose vision systems unifying various vision-language (VL) tasks by instruction …
Shape-Consistent One-Shot Unsupervised Domain Adaptation for Rail Surface Defect Segmentation
Deep neural networks have greatly improved the performance of rail surface defect
segmentation when the test samples have the same distribution as the training samples …
Tcnet: Co-salient object detection via parallel interaction of transformers and cnns
Y Ge, Q Zhang, TZ Xiang, C Zhang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
The purpose of co-salient object detection (CoSOD) is to detect the salient objects that co-
occur in a group of relevant images. CoSOD has been significantly prospered by recent …
Attack can benefit: An adversarial approach to recognizing facial expressions under noisy annotations
Abstract The real-world Facial Expression Recognition (FER) datasets usually exhibit
complex scenarios with coupled noise annotations and imbalanced classes distribution …
Sp-det: Leveraging saliency prediction for voxel-based 3d object detection in sparse point cloud
P An, Y Duan, Y Huang, J Ma, Y Chen… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Voxel is one of the common structural representations of 3D point cloud. Due to the sparsity of
point cloud generated by light detection and ranging (LiDAR), there is the extreme …
Zero-shot co-salient object detection framework
Co-salient Object Detection (CoSOD) endeavors to replicate the human visual system's
capacity to recognize common and salient objects within a collection of images. Despite …
Scene Matters: Model-based Deep Video Compression
Video compression has always been a popular research area, where many traditional and
deep video compression methods have been proposed. These methods typically rely on …
Multi-View Graph Embedding Learning for Image Co-Segmentation and Co-Localization
Image co-segmentation and co-localization exploit inter-image information to identify and
extract foreground objects with a batch mode. However, they remain challenging when …
Predicting 360° Video Saliency: A ConvLSTM Encoder-Decoder Network with Spatio-temporal Consistency
360° videos have been widely used with the development of virtual reality technology and
triggered a demand to determine the most visually attractive objects in them, aka 360° video …