Multimodal image synthesis and editing: A survey and taxonomy

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
As information exists in various modalities in the real world, effective interaction and fusion
among multimodal information play a key role in the creation and perception of multimodal …

Adventures in data analysis: A systematic review of Deep Learning techniques for pattern recognition in cyber-physical-social systems

Z Amiri, A Heidari, NJ Navimipour, M Unal… - Multimedia Tools and …, 2024 - Springer
Machine Learning (ML) and Deep Learning (DL) have achieved high success in
many textual, auditory, medical imaging, and visual recognition patterns. Concerning the …

TryOnDiffusion: A tale of two UNets

L Zhu, D Yang, T Zhu, F Reda… - Proceedings of the …, 2023 - openaccess.thecvf.com
Given two images depicting a person and a garment worn by another person, our goal is to
generate a visualization of how the garment might look on the input person. A key challenge …

Animatable neural radiance fields for modeling dynamic human bodies

S Peng, J Dong, Q Wang, S Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper addresses the challenge of reconstructing an animatable human model from a
multi-view video. Some recent works have proposed to decompose a non-rigidly deforming …

Text2Human: Text-driven controllable human image generation

Y Jiang, S Yang, H Qiu, W Wu, CC Loy… - ACM Transactions on …, 2022 - dl.acm.org
Generating high-quality and diverse human images is an important yet challenging task in
vision and graphics. However, existing generative models often fall short under the high …

GauHuman: Articulated Gaussian splatting from monocular human videos

S Hu, T Hu, Z Liu - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We present GauHuman, a 3D human model with Gaussian Splatting for both fast training (1~2
minutes) and real-time rendering (up to 189 FPS) compared with existing NeRF-based …

Person image synthesis via denoising diffusion model

AK Bhunia, S Khan, H Cholakkal… - Proceedings of the …, 2023 - openaccess.thecvf.com
The pose-guided person image generation task requires synthesizing photorealistic images
of humans in arbitrary poses. The existing approaches use generative adversarial networks …

Unbalanced feature transport for exemplar-based image translation

F Zhan, Y Yu, K Cui, G Zhang, S Lu… - Proceedings of the …, 2021 - openaccess.thecvf.com
Despite the great success of GANs in image translation with different conditioned inputs
such as semantic segmentation and edge map, generating high-fidelity images with …

Human-Art: A versatile human-centric dataset bridging natural and artificial scenes

X Ju, A Zeng, J Wang, Q Xu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Humans have long been recorded in a variety of forms since antiquity. For example,
sculptures and paintings were the primary media for depicting human beings before the …

HumanSD: A native skeleton-guided diffusion model for human image generation

X Ju, A Zeng, C Zhao, J Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Controllable human image generation (HIG) has attracted significant attention from
academia and industry for its numerous real-life applications. State-of-the-art solutions, such …