Scene graph generation: A comprehensive survey
Deep learning techniques have led to remarkable breakthroughs in the field of generic
object detection and have spawned a lot of scene-understanding tasks in recent years …
Scene graph generation: A comprehensive survey
Deep learning techniques have led to remarkable breakthroughs in the field of object
detection and have spawned a lot of scene-understanding tasks in recent years. Scene …
What to look at and where: Semantic and spatial refined transformer for detecting human-object interactions
We propose a novel one-stage Transformer-based semantic and spatial refined transformer
(SSRT) to solve the Human-Object Interaction detection task, which requires to localize …
Webly supervised knowledge-embedded model for visual reasoning
W Zheng, L Yan, W Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Visual reasoning between visual images and natural language remains a long-standing
challenge in computer vision. Conventional deep supervision methods target at finding …
Sgpt: The secondary path guides the primary path in transformers for hoi detection
HOI detection is essential for human-computer interaction, especially in behavior detection
and robot manipulation. Existing mainstream transformer methods of HOI detection are …
Knowledge-Embedded Mutual Guidance for Visual Reasoning
W Zheng, L Yan, L Chen, Q Li… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Visual reasoning between visual images and natural language is a long-standing challenge
in computer vision. Most of the methods aim to look for answers to questions only on the …
A symmetric fusion learning model for detecting visual relations and scene parsing
X Liu, X Jing, Z Zheng, W Du, X Ding… - Scientific …, 2022 - Wiley Online Library
Visual relationship detection (VRD) aims to locate objects and recognize their pairwise
relationships for parsing scene graphs. To enable a higher understanding of the visual …