Towards overcoming false positives in visual relationship detection

G Zhu, L Zhang, Y Jiang, Y Dang, H Hou… - arXiv preprint arXiv …, 2022 - arxiv.org

Deep learning techniques have led to remarkable breakthroughs in the field of generic
object detection and have spawned a lot of scene-understanding tasks in recent years …

Cited by 65 · Related articles · All 2 versions

Scene graph generation: A comprehensive survey

H Li, G Zhu, L Zhang, Y Jiang, Y Dang, H Hou, P Shen… - Neurocomputing, 2024 - Elsevier

Deep learning techniques have led to remarkable breakthroughs in the field of object
detection and have spawned a lot of scene-understanding tasks in recent years. Scene …

Cited by 15 · Related articles · All 4 versions

What to look at and where: Semantic and spatial refined transformer for detecting human-object interactions

ASM Iftekhar, H Chen, K Kundu, X Li… - Proceedings of the …, 2022 - openaccess.thecvf.com

We propose a novel one-stage Transformer-based semantic and spatial refined transformer
(SSRT) to solve the Human-Object Interaction detection task, which requires to localize …

Cited by 48 · Related articles · All 7 versions

Webly supervised knowledge-embedded model for visual reasoning

W Zheng, L Yan, W Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Visual reasoning between visual images and natural language remains a long-standing
challenge in computer vision. Conventional deep supervision methods target at finding …

Cited by 5 · Related articles · All 3 versions

SGPT: The secondary path guides the primary path in transformers for HOI detection

S Chan, W Wang, Z Shao, C Bai - 2023 IEEE International …, 2023 - ieeexplore.ieee.org

HOI detection is essential for human-computer interaction, especially in behavior detection
and robot manipulation. Existing mainstream transformer methods of HOI detection are …

Cited by 4 · Related articles

Knowledge-Embedded Mutual Guidance for Visual Reasoning

W Zheng, L Yan, L Chen, Q Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Visual reasoning between visual images and natural language is a long-standing challenge
in computer vision. Most of the methods aim to look for answers to questions only on the …

Cited by 1 · Related articles · All 3 versions

A symmetric fusion learning model for detecting visual relations and scene parsing

X Liu, X Jing, Z Zheng, W Du, X Ding… - Scientific …, 2022 - Wiley Online Library

Visual relationship detection (VRD) aims to locate objects and recognize their pairwise
relationships for parsing scene graphs. To enable a higher understanding of the visual …

Cited by 1 · Related articles · All 5 versions