Spatial-temporal knowledge-embedded transformer for video scene graph generation

T Pu, T Chen, H Wu, Y Lu, L Lin - IEEE Transactions on Image …, 2023 - ieeexplore.ieee.org
Video scene graph generation (VidSGG) aims to identify objects in visual scenes and infer
their relationships for a given video. It requires not only a comprehensive understanding of …

Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition

Y Gao, Y Xie, ZZ Hu, T Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Domain shift poses a significant challenge in Cross-Domain Facial Expression Recognition
(CD-FER) due to the distribution variation between the source and target domains. Current …

Category-Adaptive Label Discovery and Noise Rejection for Multi-label Recognition with Partial Positive Labels

T Pu, Q Lao, H Wu, T Chen, L Tian… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
As a cost-effective alternative to standard multi-label learning, the multi-label image
recognition with partial positive labels (MLR-PPL) task attracts increasing attention, in which …

An attention mechanism network based on the winner-take-all

H Li, S Zhang, D Ma, W Mo - Digital Signal Processing, 2024 - Elsevier
In the field of neurology, neurons compete for brain attention in winner-take-all (WTA)
competitions. Inspired by this, we propose a new WTA-based attention network (called …

DifBFSR: Blind Face Super-Resolution via Conditional Diffusion Contraction

W Yu, Z Li, Q Liu, Y Chen, S Zhang, J Lin - Computing and Informatics, 2024 - cai.sk
Blind Face Super-Resolution (BFSR) has recently gained widespread attention,
which aims to super-resolve Low-Resolution (LR) face images with complex unknown …

Flair: A conditional diffusion framework with applications to face video restoration

Z Zou, J Liu, S Shoushtari, Y Wang, W Gan… - arXiv preprint arXiv …, 2023 - arxiv.org
Face video restoration (FVR) is a challenging but important problem where one seeks to
recover perceptually realistic face videos from a low-quality input. While diffusion …