A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …
Constructing maps for autonomous robotics: An introductory conceptual overview
Mapping the environment is a powerful technique for enabling autonomy through
localization and planning in robotics. This article seeks to provide a global overview of …
Foundations of spatial perception for robotics: Hierarchical representations and real-time systems
3D spatial perception is the problem of building and maintaining an actionable and
persistent representation of the environment in real-time using sensor data and prior …
Pair then relation: Pair-net for panoptic scene graph generation
Panoptic Scene Graph (PSG) is a challenging task in Scene Graph Generation (SGG) that
aims to create a more comprehensive scene graph representation using panoptic …
Hilo: Exploiting high low frequency relations for unbiased panoptic scene graph generation
Panoptic Scene Graph generation (PSG) is a recently proposed task in image
scene understanding that aims to segment the image and extract triplets of subjects, objects …
Emergent visual-semantic hierarchies in image-text representations
M Alper, H Averbuch-Elor - European Conference on Computer Vision, 2025 - Springer
While recent vision-and-language models (VLMs) like CLIP are a powerful tool for analyzing
text and images in a shared semantic space, they do not explicitly model the hierarchical …
Learning situation hyper-graphs for video question answering
Answering questions about complex situations in videos requires capturing not only the
presence of actors, objects, and their relations, but also the evolution of these relationships …
Synthesizing event-centric knowledge graphs of daily activities using virtual space
Artificial intelligence (AI) is expected to be embodied in software agents, robots, and cyber-
physical systems that can understand the various contextual information of daily life in the …
More knowledge, less bias: Unbiasing scene graph generation with explicit ontological adjustment
Scene graph generation (SGG) models seek to detect relationships between objects in a
given image. One challenge in this area is the biased distribution of predicates in the dataset …
Reefknot: A comprehensive benchmark for relation hallucination evaluation, analysis and mitigation in multimodal large language models
Hallucination issues persistently plague current multimodal large language models
(MLLMs). While existing research primarily focuses on object-level or attribute-level …