Prompting for Discovery: Flexible Sense-Making for AI Art-Making with Dreamsheets

SG Almeda, JD Zamfirescu-Pereira, KW Kim… - Proceedings of the CHI …, 2024 - dl.acm.org
Design space exploration (DSE) for Text-to-Image (TTI) models entails navigating a vast,
opaque space of possible image outputs, through a commensurately vast input space of …

CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets

L Zhang, Z Wang, Q Zhang, Q Qiu, A Pang… - ACM Transactions on …, 2024 - dl.acm.org
In the realm of digital creativity, our potential to craft intricate 3D worlds from imagination is
often hampered by the limitations of existing digital tools, which demand extensive expertise …

Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models

S Motamed, W Van Gansbeke… - Proceedings of the …, 2024 - openaccess.thecvf.com
With recent advances in image and video diffusion models for content creation, a plethora of
techniques have been proposed for customizing their generated content. In particular …

Generative artificial intelligence and building design: early photorealistic render visualization of façades using local identity-trained models

H Jo, JK Lee, YC Lee, S Choo - Journal of Computational …, 2024 - academic.oup.com
This paper elucidates an approach that utilizes generative artificial intelligence (AI) to
develop alternative architectural design options based on local identity. The advancement of …

AnimateDiff-Lightning: Cross-Model Diffusion Distillation

S Lin, X Yang - arXiv preprint arXiv:2403.12706, 2024 - arxiv.org
We present AnimateDiff-Lightning for lightning-fast video generation. Our model uses
progressive adversarial diffusion distillation to achieve new state-of-the-art in few-step video …

DesignPrompt: Using Multimodal Interaction for Design Exploration with Generative AI

X Peng, J Koch, WE Mackay - Proceedings of the 2024 ACM Designing …, 2024 - dl.acm.org
Visually oriented designers often struggle to create effective generative AI (GenAI) prompts.
A preliminary study identified specific issues in composing and fine-tuning prompts, as well …

Enhancing Baidu Multimodal Advertisement with Chinese Text-to-Image Generation via Bilingual Alignment and Caption Synthesis

K Zhao, X Zhao, Z Jin, Y Yang, W Tao, C Han… - Proceedings of the 47th …, 2024 - dl.acm.org
Recent advances in generative artificial intelligence have revolutionized information
retrieval and content generation, opening up new opportunities for the e-commerce industry …

Bridging the Intent Gap: Knowledge-Enhanced Visual Generation

Y Cheng, Z Xu, D Lin, H Cheng, Y Wong, Y Sun… - arXiv preprint arXiv …, 2024 - arxiv.org
For visual content generation, discrepancies between user intentions and the generated
content have been a longstanding problem. This discrepancy arises from two main factors …

Hi5: 2D Hand Pose Estimation with Zero Human Annotation

M Hasan, C Ozel, N Long, A Martin, S Potter… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose a new large synthetic hand pose estimation dataset, Hi5, and a novel
inexpensive method for collecting high-quality synthetic data that requires no human …

InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models

N Saini, N Bodla, A Shrivastava… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce InVi, an approach for inserting or replacing objects within videos (referred to
as inpainting) using off-the-shelf, text-to-image latent diffusion models. InVi targets controlled …