Parts of Speech–Grounded Subspaces in Vision-Language Models

Z Zhao, I Patras - arXiv preprint arXiv:2308.13382, 2023 - arxiv.org

This paper presents a novel visual-language model called DFER-CLIP, which is based on
the CLIP model and designed for in-the-wild Dynamic Facial Expression Recognition …

Cited by 31 · Related articles · All 5 versions

Improving fairness using vision-language driven image augmentation

M D'Incà, C Tzelepis, I Patras… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Fairness is crucial when training a deep-learning discriminative model, especially in the
facial domain. Models tend to correlate specific characteristics (such as age and skin color) …

Cited by 13 · Related articles · All 8 versions

MM2Latent: Text-to-facial image generation and editing in GANs with multimodal assistance

D Meng, C Tzelepis, I Patras… - arXiv preprint arXiv …, 2024 - arxiv.org

Generating human portraits is a hot topic in the image generation area, e.g., mask-to-face
generation and text-to-face generation. However, these unimodal generation methods lack …

Are CLIP features all you need for Universal Synthetic Image Origin Attribution?

D Cioni, C Tzelepis, L Seidenari, I Patras - arXiv preprint arXiv:2408.09153, 2024 - arxiv.org

The steady improvement of Diffusion Models for visual synthesis has given rise to many new
and interesting use cases of synthetic images but also has raised concerns about their …

A Hitchhiker's Guide to Fine-Grained Face Forgery Detection Using Common Sense Reasoning

NM Foteinopoulou, E Ghorbel, D Aouada - arXiv preprint arXiv …, 2024 - arxiv.org

Explainability in artificial intelligence is crucial for restoring trust, particularly in areas like
face forgery detection, where viewers often struggle to distinguish between real and …

Understanding the Limitations of Diffusion Concept Algebra Through Food

EZ Zeng, Y Chen, A Wong - arXiv preprint arXiv:2406.03582, 2024 - arxiv.org

Image generation techniques, particularly latent diffusion models, have exploded in
popularity in recent years. Many techniques have been developed to manipulate and clarify …

Exploring CLIP for Real World, Text-based Image Retrieval

M Sultan, L Jacobs, A Stylianou… - 2023 IEEE Applied …, 2023 - ieeexplore.ieee.org

We consider the ability of CLIP features to support text-driven image retrieval. Traditional
image-based queries sometimes misalign with user intentions due to their focus on …

Cited by 1 · Related articles · All 2 versions