Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …
TokenCut: Segmenting objects in images and videos with self-supervised transformer and normalized cut
In this paper, we describe a graph-based algorithm that uses the features obtained by a self-
supervised transformer to detect and segment salient objects in images and videos. With this …
Self-supervised transformers for unsupervised object discovery using normalized cut
Transformers trained with self-supervision using self-distillation loss (DINO) have been
shown to produce attention maps that highlight salient foreground objects. In this paper, we …
Self-supervised co-salient object detection via feature correspondences at multiple scales
S Chakraborty, D Samaras - European Conference on Computer Vision, 2025 - Springer
Our paper introduces a novel two-stage self-supervised approach for detecting co-occurring
salient objects (CoSOD) in image groups without requiring segmentation annotations …
Unsupervised Object Discovery: A Comprehensive Survey and Unified Taxonomy
JF Villa-Vásquez, M Pedersoli - arXiv preprint arXiv:2411.00868, 2024 - arxiv.org
Unsupervised object discovery is commonly interpreted as the task of localizing and/or
categorizing objects in visual data without the need for labeled examples. While current …
Diffusion Models as Data Mining Tools
This paper demonstrates how to use generative models trained for image synthesis as tools
for visual data mining. Our insight is that since contemporary generative models learn an …
Unsupervised and semi-supervised co-salient object detection via segmentation frequency statistics
In this paper, we address the detection of co-occurring salient objects (CoSOD) in an image
group using frequency statistics in an unsupervised manner, which further enable us to …
AMFusionNet: A Novel Framework for Enhanced Infrared and Visible Image Fusion
Q Xu, Y Zheng - IEEE Access, 2024 - ieeexplore.ieee.org
Infrared and visible image fusion (IVIF) aims to synthesize images that capitalize on the
strengths of both modalities. Addressing the common challenge in IVIF of preserving thermal …
Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism
Q Xu - arXiv preprint arXiv:2308.13672, 2023 - arxiv.org
In the study, we present AMFusionNet, an innovative approach to infrared and visible image
fusion (IVIF), harnessing the power of multiple kernel sizes and attention mechanisms. By …