A comprehensive survey of scene graphs: Generation and application
Scene graph is a structured representation of a scene that can clearly express the objects,
attributes, and relationships between objects in the scene. As computer vision technology …
Graph representation learning meets computer vision: A survey
A graph structure is a powerful mathematical abstraction, which can not only represent
information about individuals but also capture the interactions between individuals for …
CPT: Colorful prompt tuning for pre-trained vision-language models
Vision-Language Pre-training (VLP) models have shown promising capabilities in
grounding natural language in image data, facilitating a broad range of cross-modal tasks …
Panoptic scene graph generation
Existing research addresses scene graph generation (SGG)—a critical technology for scene
understanding in images—from a detection perspective, i.e., objects are detected using …
Unbiased scene graph generation from biased training
Today's scene graph generation (SGG) task is still far from practical, mainly due to the
severe training bias, e.g., collapsing diverse "human walk on/sit on/lay on beach" into "human …
GraphAdapter: Tuning vision-language models with dual knowledge graph
Adapter-style efficient transfer learning (ETL) has shown excellent performance in the tuning
of vision-language models (VLMs) under the low-data regime, where only a few additional …
Bipartite graph network with adaptive message passing for unbiased scene graph generation
Scene graph generation is an important visual understanding task with a broad range of
vision applications. Despite recent tremendous progress, it remains challenging due to the …
MuKEA: Multimodal knowledge extraction and accumulation for knowledge-based visual question answering
Knowledge-based visual question answering requires the ability of associating
external knowledge for open-ended cross-modal scene understanding. One limitation of …
The devil is in the labels: Noisy label correction for robust scene graph generation
Unbiased SGG has achieved significant progress over recent years. However, almost all
existing SGG models have overlooked the ground-truth annotation qualities of prevailing …