WOUAF: Weight modulation for user attribution and fingerprinting in text-to-image diffusion models
The rapid advancement of generative models facilitating the creation of hyper-realistic
images from textual descriptions has concurrently escalated critical societal concerns such …
Eclipse: A resource-efficient text-to-image prior for image generations
Text-to-image (T2I) diffusion models, notably the unCLIP models (e.g., DALL-E-2),
achieve state-of-the-art (SOTA) performance on various compositional T2I benchmarks at …
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model
Text-to-image (T2I) generative models have recently emerged as a powerful tool
enabling the creation of photo-realistic images and giving rise to a multitude of applications …
A survey on knowledge-enhanced multimodal learning
M Lymperaiou, G Stamou - Artificial Intelligence Review, 2024 - Springer
Multimodal learning has been a field of increasing interest, aiming to combine various
modalities in a single joint representation. Especially in the area of visiolinguistic (VL) …
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
Despite the recent advances in personalized text-to-image (P-T2I) generative models,
subject-driven T2I remains challenging. The primary bottlenecks include 1) Intensive training …
Conceptmix: A compositional image generation benchmark with controllable difficulty
Compositionality is a critical capability in Text-to-Image (T2I) models, as it reflects their ability
to understand and combine multiple concepts from text descriptions. Existing evaluations of …
CUPID: Contextual Understanding of Prompt‐conditioned Image Distributions
Y Zhao, M Li, M Berger - Computer Graphics Forum, 2024 - Wiley Online Library
We present CUPID: a visualization method for the contextual understanding of prompt‐
conditioned image distributions. CUPID targets the visual analysis of distributions produced …
Towards robust visual understanding: A paradigm shift in computer vision from recognition to reasoning
T Gokhale - AI Magazine, 2024 - Wiley Online Library
Models that learn from data are widely and rapidly being deployed today for
real‐world use, but they suffer from unforeseen failures that limit their reliability. These failures …
Multimodal Content Generation
In this chapter, we will review the advances that are being made in this new field of
multimodal content generation and also discuss several challenges associated with this …
Strengthening Image Generative AI: Integrating Fingerprinting and Revision Methods for Enhanced Safety and Control
C Kim - 2024 - search.proquest.com
In the rapidly evolving field of Generative Artificial Intelligence (Gen-AI) for imaging, models
such as DALL·E 3 and Stable Diffusion have transitioned from theoretical concepts to …