You'll Never Walk Alone: A Sketch and Text Duet for Fine-Grained Image Retrieval

S Koley, AK Bhunia, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this paper we propose a novel abstraction-aware sketch-based image retrieval framework
capable of handling sketch abstraction at varied levels. Prior works had mainly focused on …

Cited by 11 Related articles All 3 versions

A review on video person re-identification based on deep learning

H Ma, C Zhang, Y Zhang, Z Li, Z Wang, C Wei - Neurocomputing, 2024 - Elsevier

Person Re-Identification (ReID) is an essential technology for matching a person
across non-overlapping cameras. It has attracted increasing attention in recent years due to …

Cited by 2 Related articles All 2 versions

Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

S Koley, AK Bhunia, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com

This paper for the first time explores text-to-image diffusion models for Zero-Shot
Sketch-based Image Retrieval (ZS-SBIR). We highlight a pivotal discovery: the capacity of text-to …

Cited by 9 Related articles All 4 versions

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

S Koley, AK Bhunia, D Sekhri, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com

This paper unravels the potential of sketches for diffusion models addressing the deceptive
promise of direct sketch control in generative AI. We importantly democratise the process …

Cited by 13 Related articles All 4 versions

Democaricature: Democratising caricature generation with a rough sketch

DY Chen, AK Bhunia, S Koley, A Sain… - Proceedings of the …, 2024 - openaccess.thecvf.com

In this paper we democratise caricature generation empowering individuals to effortlessly
craft personalised caricatures with just a photo and a conceptual sketch. Our objective is to …

Cited by 6 Related articles All 3 versions

A Survey of Multimodal Composite Editing and Retrieval

S Li, F Huang, L Zhang - arXiv preprint arXiv:2409.05405, 2024 - arxiv.org

In the real world, where information is abundant and diverse across different modalities,
understanding and utilizing various data types to improve retrieval systems is a key focus of …

Cited by 2 Related articles All 2 versions

MMC: Multi-modal colorization of images using textual description

S Ghosh, S Bhattacharya, P Roy, U Pal… - Signal, Image and Video …, 2025 - Springer

Handling various objects with different colours is a significant challenge for image
colourisation techniques. Thus, for complex real-world scenes, the existing image …

Cited by 1 Related articles All 4 versions

Towards Generative Class Prompt Learning for Fine-grained Visual Recognition

S Chattopadhyay, S Biswas, E Vivoli… - arXiv preprint arXiv …, 2024 - arxiv.org

Although foundational vision-language models (VLMs) have proven to be very successful for
various semantic discrimination tasks, they still struggle to perform faithfully for fine-grained …