Ego-Exo4D: Understanding skilled human activity from first- and third-person perspectives

K Grauman, A Westbury, L Torresani… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset
and benchmark challenge. Ego-Exo4D centers around simultaneously-captured egocentric …

TokenCut: Segmenting objects in images and videos with self-supervised transformer and normalized cut

Y Wang, X Shen, Y Yuan, Y Du, M Li… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
In this paper, we describe a graph-based algorithm that uses the features obtained by a self-
supervised transformer to detect and segment salient objects in images and videos. With this …

Self-supervised transformers for unsupervised object discovery using normalized cut

Y Wang, X Shen, SX Hu, Y Yuan… - Proceedings of the …, 2022 - openaccess.thecvf.com
Transformers trained with self-supervision using self-distillation loss (DINO) have been
shown to produce attention maps that highlight salient foreground objects. In this paper, we …

Self-supervised co-salient object detection via feature correspondences at multiple scales

S Chakraborty, D Samaras - European Conference on Computer Vision, 2025 - Springer
Our paper introduces a novel two-stage self-supervised approach for detecting co-occurring
salient objects (CoSOD) in image groups without requiring segmentation annotations …

Unsupervised Object Discovery: A Comprehensive Survey and Unified Taxonomy

JF Villa-Vásquez, M Pedersoli - arXiv preprint arXiv:2411.00868, 2024 - arxiv.org
Unsupervised object discovery is commonly interpreted as the task of localizing and/or
categorizing objects in visual data without the need for labeled examples. While current …

Diffusion Models as Data Mining Tools

I Siglidis, A Holynski, AA Efros, M Aubry… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper demonstrates how to use generative models trained for image synthesis as tools
for visual data mining. Our insight is that since contemporary generative models learn an …

Unsupervised and semi-supervised co-salient object detection via segmentation frequency statistics

S Chakraborty, S Naha, M Bastan… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we address the detection of co-occurring salient objects (CoSOD) in an image
group using frequency statistics in an unsupervised manner, which further enable us to …

AMFusionNet: A Novel Framework for Enhanced Infrared and Visible Image Fusion

Q Xu, Y Zheng - IEEE Access, 2024 - ieeexplore.ieee.org
Infrared and visible image fusion (IVIF) aims to synthesize images that capitalize on the
strengths of both modalities. Addressing the common challenge in IVIF of preserving thermal …

LCCo: Lending CLIP to co-segmentation

X Duan, Y Yang, L Pan, X Liu - Pattern Recognition, 2024 - Elsevier
This paper studies co-segmenting common semantic objects in a set of images. Existing
works either rely on carefully engineered networks to mine implicit semantics in visual …

Fusion of Infrared and Visible Images based on Spatial-Channel Attentional Mechanism

Q Xu - arXiv preprint arXiv:2308.13672, 2023 - arxiv.org
In the study, we present AMFusionNet, an innovative approach to infrared and visible image
fusion (IVIF), harnessing the power of multiple kernel sizes and attention mechanisms. By …