LayoutGPT: Compositional visual planning and generation with large language models
Attaining a high degree of user controllability in visual generation often requires intricate,
fine-grained inputs like layouts. However, such inputs impose a substantial burden on users …
MagicBrush: A manually annotated dataset for instruction-guided image editing
Text-guided image editing is widely needed in daily life, ranging from personal use to
professional applications such as Photoshop. However, existing methods are either zero …
Training-free structured diffusion guidance for compositional text-to-image synthesis
Large-scale diffusion models have achieved state-of-the-art results on text-to-image
synthesis (T2I) tasks. Despite their ability to generate high-quality yet creative images, we …
Counterfactual VQA: A cause-effect look at language bias
Recent VQA models may tend to rely on language bias as a shortcut and thus fail to
sufficiently learn the multi-modal knowledge from both vision and language. In this paper …
Talk-to-edit: Fine-grained facial editing via dialog
Facial editing is an important task in vision and graphics with numerous applications.
However, existing works are incapable of delivering a continuous and fine-grained editing …
Guiding instruction-based image editing via multimodal large language models
Instruction-based image editing improves the controllability and flexibility of image
manipulation via natural commands without elaborate descriptions or regional masks …
Tell me what happened: Unifying text-guided video completion via multimodal masked video generation
Generating a video given the first several static frames is challenging as it anticipates
reasonable future frames with temporal coherence. Besides video prediction, the ability to …
Language-driven artistic style transfer
Despite having promising results, style transfer, which requires preparing style images in
advance, may result in a lack of creativity and accessibility. Following human instruction, on …
Talk-to-edit: Fine-grained 2D and 3D facial editing via dialog
Facial editing manipulates the facial attributes of a given face image. Nowadays, with the
development of generative models, users can easily generate 2D and 3D facial images with …
Iterative multi-granular image editing using diffusion models
Recent advances in text-guided image synthesis have dramatically changed how creative
professionals generate artistic and aesthetically pleasing visual assets. To fully support such …