A review on generative adversarial networks: Algorithms, theory, and applications

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

An overview of voice conversion and its challenges: From statistical modeling to deep learning

B Sisman, J Yamagishi, S King… - IEEE/ACM Transactions …, 2020 - ieeexplore.ieee.org
Speaker identity is one of the important characteristics of human speech. In voice
conversion, we change the speaker identity from one to another, while keeping the linguistic …

InstructPix2Pix: Learning to follow image editing instructions

T Brooks, A Holynski, AA Efros - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …

SpaText: Spatio-textual representation for controllable image generation

O Avrahami, T Hayes, O Gafni… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent text-to-image diffusion models are able to generate convincing results of
unprecedented quality. However, it is nearly impossible to control the shapes of different …

Palette: Image-to-image diffusion models

C Saharia, W Chan, H Chang, C Lee, J Ho… - ACM SIGGRAPH 2022 …, 2022 - dl.acm.org
This paper develops a unified framework for image-to-image translation based on
conditional diffusion models and evaluates this framework on four challenging image-to …

ILVR: Conditioning method for denoising diffusion probabilistic models

J Choi, S Kim, Y Jeong, Y Gwon, S Yoon - arXiv preprint arXiv:2108.02938, 2021 - arxiv.org
Denoising diffusion probabilistic models (DDPM) have shown remarkable performance in
unconditional image generation. However, due to the stochasticity of the generative process …

GAN inversion: A survey

W Xia, Y Zhang, Y Yang, JH Xue… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN
model so that the image can be faithfully reconstructed from the inverted code by the …

Contrastive learning for unpaired image-to-image translation

T Park, AA Efros, R Zhang, JY Zhu - … , Glasgow, UK, August 23–28, 2020 …, 2020 - Springer
In image-to-image translation, each patch in the output should reflect the content of the
corresponding patch in the input, independent of domain. We propose a straightforward …

One-shot free-view neural talking-head synthesis for video conferencing

TC Wang, A Mallya, MY Liu - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
We propose a neural talking-head video synthesis model and demonstrate its application to
video conferencing. Our model learns to synthesize a talking-head video using a source …

On aliased resizing and surprising subtleties in GAN evaluation

G Parmar, R Zhang, JY Zhu - Proceedings of the IEEE/CVF …, 2022 - openaccess.thecvf.com
Metrics for evaluating generative models aim to measure the discrepancy between real and
generated images. The often-used Fréchet Inception Distance (FID) metric, for example …