A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …

Constructing maps for autonomous robotics: An introductory conceptual overview

P Racinskis, J Arents, M Greitans - Electronics, 2023 - mdpi.com
Mapping the environment is a powerful technique for enabling autonomy through
localization and planning in robotics. This article seeks to provide a global overview of …

Foundations of spatial perception for robotics: Hierarchical representations and real-time systems

N Hughes, Y Chang, S Hu, R Talak… - … Journal of Robotics …, 2024 - journals.sagepub.com
3D spatial perception is the problem of building and maintaining an actionable and
persistent representation of the environment in real-time using sensor data and prior …

Pair then relation: Pair-net for panoptic scene graph generation

J Wang, Z Wen, X Li, Z Guo, J Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Panoptic Scene Graph (PSG) is a challenging task in Scene Graph Generation (SGG) that
aims to create a more comprehensive scene graph representation using panoptic …

Hilo: Exploiting high low frequency relations for unbiased panoptic scene graph generation

Z Zhou, M Shi, H Caesar - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Panoptic Scene Graph generation (PSG) is a recently proposed task in image
scene understanding that aims to segment the image and extract triplets of subjects, objects …

Emergent visual-semantic hierarchies in image-text representations

M Alper, H Averbuch-Elor - European Conference on Computer Vision, 2025 - Springer
While recent vision-and-language models (VLMs) like CLIP are a powerful tool for analyzing
text and images in a shared semantic space, they do not explicitly model the hierarchical …

Learning situation hyper-graphs for video question answering

A Urooj, H Kuehne, B Wu, K Chheu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Answering questions about complex situations in videos requires not only capturing of the
presence of actors, objects, and their relations, but also the evolution of these relationships …

Synthesizing event-centric knowledge graphs of daily activities using virtual space

S Egami, T Ugai, M Oono, K Kitamura, K Fukuda - IEEE Access, 2023 - ieeexplore.ieee.org
Artificial intelligence (AI) is expected to be embodied in software agents, robots, and
cyber-physical systems that can understand the various contextual information of daily life in the …

More knowledge, less bias: Unbiasing scene graph generation with explicit ontological adjustment

Z Chen, S Rezayi, S Li - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Scene graph generation (SGG) models seek to detect relationships between objects in a
given image. One challenge in this area is the biased distribution of predicates in the dataset …

Reefknot: A comprehensive benchmark for relation hallucination evaluation, analysis and mitigation in multimodal large language models

K Zheng, J Chen, Y Yan, X Zou, X Hu - arXiv preprint arXiv:2408.09429, 2024 - arxiv.org
Hallucination issues persistently plagued current multimodal large language models
(MLLMs). While existing research primarily focuses on object-level or attribute-level …