Imagen Editor and EditBench: Advancing and evaluating text-guided image inpainting

S Wang, C Saharia, C Montgomery… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided image editing can have a transformative impact in supporting creative
applications. A key challenge is to generate edits that are faithful to the input text prompt …

MuseChat: A conversational music recommendation system for videos

Z Dong, X Liu, B Chen, P Polak… - Proceedings of the …, 2024 - openaccess.thecvf.com
Music recommendation for videos attracts growing interest in multi-modal research.
However, existing systems focus primarily on content compatibility, often ignoring the users' …

Tuning-free inversion-enhanced control for consistent image editing

X Duan, S Cui, G Kang, B Zhang, Z Fei, M Fan… - Proceedings of the …, 2024 - ojs.aaai.org
Consistent editing of real images is a challenging task, as it requires performing non-rigid
edits (e.g., changing postures) to the main objects in the input image without changing their …

CoralStyleCLIP: Co-optimized region and layer selection for image editing

A Revanur, D Basu, S Agrawal… - Proceedings of the …, 2023 - openaccess.thecvf.com
Edit fidelity is a significant issue in open-world controllable generative image editing.
Recently, CLIP-based approaches have traded off simplicity to alleviate these problems by …

Negative Pre-aware for Noisy Cross-Modal Matching

X Zhang, H Li, M Ye - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Cross-modal noise-robust learning is a challenging task since noisy correspondence is hard
to recognize and rectify. Due to the cumulative and unavoidable negative impact of …

Towards language-guided interactive 3D generation: LLMs as layout interpreter with generative feedback

Y Lin, H Wu, R Wang, H Lu, X Lin, H Xiong… - arXiv preprint arXiv …, 2023 - arxiv.org
Generating and editing a 3D scene guided by natural language poses a challenge, primarily
due to the complexity of specifying the positional relations and volumetric changes within the …

Instilling Multi-round Thinking to Text-guided Image Generation

L Zeng, Z Zheng, Y Wei, T Chua - arXiv preprint arXiv:2401.08472, 2024 - arxiv.org
In this paper, we study the text-guided image generation task. Our focus lies in the
modification of a reference image, given user text feedback, to imbue it with specific desired …

ChatEdit: Towards multi-turn interactive facial image editing via dialogue

X Cui, Z Li, P Li, Y Hu, H Shi, Z He - arXiv preprint arXiv:2303.11108, 2023 - arxiv.org
This paper explores interactive facial image editing via dialogue and introduces the ChatEdit
benchmark dataset for evaluating image editing and conversation abilities in this context …

Tell your story: task-oriented dialogs for interactive content creation

S Kottur, S Moon, AH Markosyan, H Shah… - arXiv preprint arXiv …, 2022 - arxiv.org
People capture photos and videos to relive and share memories of personal significance.
Recently, media montages (stories) have become a popular mode of sharing these …

PColorizor: Re-coloring Ancient Chinese Paintings with Ideorealm-congruent Poems

T Tang, Y Wu, P Xia, W Wu, X Wang, Y Wu - Proceedings of the 36th …, 2023 - dl.acm.org
Color restoration of ancient Chinese paintings plays a significant role in Chinese culture
protection and inheritance. However, traditional color restoration is challenging and time …