Structured sparse R-CNN for direct scene graph generation

Y Cong, MY Yang, B Rosenhahn - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Different objects in the same scene are more or less related to each other, but only a limited
number of these relationships are noteworthy. Inspired by Detection Transformer, which …

Cited by 156 - Related articles - All 10 versions

[PDF] thecvf.com

Prototype-based embedding network for scene graph generation

C Zheng, X Lyu, L Gao, B Dai… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Abstract Current Scene Graph Generation (SGG) methods explore contextual information to
predict relationships among entity pairs. However, due to the diverse visual appearance of …

Cited by 55 - Related articles - All 8 versions

[HTML] sciencedirect.com

[HTML] Scene graph generation: A comprehensive survey

H Li, G Zhu, L Zhang, Y Jiang, Y Dang, H Hou, P Shen… - Neurocomputing, 2024 - Elsevier

Deep learning techniques have led to remarkable breakthroughs in the field of object
detection and have spawned a lot of scene-understanding tasks in recent years. Scene …

Cited by 30 - Related articles - All 4 versions

[PDF] thecvf.com

SGTR: End-to-end scene graph generation with transformer

R Li, S Zhang, X He - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com

Abstract Scene Graph Generation (SGG) remains a challenging visual understanding task
due to its compositional property. Most previous works adopt a bottom-up two-stage or a …

Cited by 111 - Related articles - All 13 versions

[PDF] thecvf.com

Visually-prompted language model for fine-grained scene graph generation in an open world

Q Yu, J Li, Y Wu, S Tang, W Ji… - Proceedings of the …, 2023 - openaccess.thecvf.com

Abstract Scene Graph Generation (SGG) aims to extract <subject, predicate, object>
relationships in images for vision understanding. Although recent works have made steady …

Cited by 29 - Related articles - All 5 versions

[PDF] arxiv.org

A survey of deep learning for low-shot object detection

Q Huang, H Zhang, M Xue, J Song, M Song - ACM Computing Surveys, 2023 - dl.acm.org

Object detection has achieved a huge breakthrough with deep neural networks and massive
annotated data. However, current detection methods cannot be directly transferred to the …

Cited by 25 - Related articles - All 5 versions

[PDF] thecvf.com

Multilateral semantic relations modeling for image text retrieval

Z Wang, Z Gao, K Guo, Y Yang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Image-text retrieval is a fundamental task to bridge vision and language by exploiting
various strategies to fine-grained alignment between regions and words. This is still tough …

Cited by 24 - Related articles - All 5 versions

[PDF] thecvf.com

EGTR: Extracting graph from transformer for scene graph generation

J Im, JY Nam, N Park, H Lee… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Abstract Scene Graph Generation (SGG) is a challenging task of detecting objects and
predicting relationships between objects. After DETR was developed, one-stage SGG …

Cited by 13 - Related articles - All 3 versions

[PDF] arxiv.org

Pair then relation: Pair-Net for panoptic scene graph generation

J Wang, Z Wen, X Li, Z Guo, J Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Panoptic Scene Graph (PSG) is a challenging task in Scene Graph Generation (SGG) that
aims to create a more comprehensive scene graph representation using panoptic …

Cited by 15 - Related articles - All 3 versions

[PDF] neurips.cc

Zero-shot visual relation detection via composite visual cues from large language models

L Li, J Xiao, G Chen, J Shao… - Advances in Neural …, 2024 - proceedings.neurips.cc

Pretrained vision-language models, such as CLIP, have demonstrated strong generalization
capabilities, making them promising tools in the realm of zero-shot visual recognition. Visual …

Cited by 30 - Related articles - All 6 versions