The 2017 DAVIS challenge on video object segmentation

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by 78 · Related articles · All 3 versions

Understanding deep learning techniques for image segmentation

S Ghosh, N Das, I Das, U Maulik - ACM computing surveys (CSUR), 2019 - dl.acm.org

The machine learning community has been overwhelmed by a plethora of deep learning-based
approaches. Many challenging computer vision tasks, such as detection, localization …

Cited by 500 · Related articles · All 4 versions

Segment everything everywhere all at once

X Zou, J Yang, H Zhang, F Li, L Li… - Advances in …, 2024 - proceedings.neurips.cc

In this work, we present SEEM, a promptable and interactive model for segmenting
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …

Cited by 493 · Related articles · All 5 versions

Structure and content-guided video synthesis with diffusion models

P Esser, J Chiu, P Atighehchian… - Proceedings of the …, 2023 - openaccess.thecvf.com

Text-guided generative diffusion models unlock powerful image creation and editing tools.
Recent approaches that edit the content of footage while retaining structure require …

Cited by 472 · Related articles · All 5 versions

SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

Cited by 312 · Related articles · All 2 versions

Tune-A-Video: One-shot tuning of image diffusion models for text-to-video generation

JZ Wu, Y Ge, X Wang, SW Lei, Y Gu… - Proceedings of the …, 2023 - openaccess.thecvf.com

To replicate the success of text-to-image (T2I) generation, recent works employ large-scale
video datasets to train a text-to-video (T2V) generator. Despite their promising results, such …

Cited by 697 · Related articles · All 4 versions

Emergent correspondence from image diffusion

L Tang, M Jia, Q Wang, CP Phoo… - Advances in Neural …, 2023 - proceedings.neurips.cc

Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …

Cited by 290 · Related articles · All 12 versions

FateZero: Fusing attentions for zero-shot text-based video editing

C Qi, X Cun, Y Zhang, C Lei, X Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

The diffusion-based generative models have achieved remarkable success in text-based
image generation. However, since it contains enormous randomness in generation …

Cited by 295 · Related articles · All 6 versions

Segment anything in high quality

L Ke, M Ye, M Danelljan, YW Tai… - Advances in Neural …, 2024 - proceedings.neurips.cc

The recent Segment Anything Model (SAM) represents a big leap in scaling up
segmentation models, allowing for powerful zero-shot capabilities and flexible prompting …

Cited by 289 · Related articles · All 6 versions

Video-P2P: Video editing with cross-attention control

S Liu, Y Zhang, W Li, Z Lin, J Jia - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Video-P2P is the first framework for real-world video editing with cross-attention control.
While attention control has proven effective for image editing with pre-trained image …

Cited by 165 · Related articles · All 4 versions