Deep generative modelling: A comparative review of vaes, gans, normalizing flows, energy-based and autoregressive models

S Bond-Taylor, A Leach, Y Long… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Deep generative models are a class of techniques that train deep neural networks to model
the distribution of training samples. Research has fragmented into various interconnected …

A review on generative adversarial networks: Algorithms, theory, and applications

J Gui, Z Sun, Y Wen, D Tao, J Ye - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …

Stylegan-t: Unlocking the power of gans for fast large-scale text-to-image synthesis

A Sauer, T Karras, S Laine… - … on machine learning, 2023 - proceedings.mlr.press
Text-to-image synthesis has recently seen significant progress thanks to large pretrained
language models, large-scale training data, and the introduction of scalable model families …

High-fidelity audio compression with improved rvqgan

R Kumar, P Seetharaman, A Luebs… - Advances in Neural …, 2024 - proceedings.neurips.cc
Language models have been successfully used to model natural signals, such as
images, speech, and music. A key component of these models is a high quality neural …

Adversarial diffusion distillation

A Sauer, D Lorenz, A Blattmann… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that
efficiently samples large-scale foundational image diffusion models in just 1-4 steps while …

Styleswin: Transformer-based gan for high-resolution image generation

B Zhang, S Gu, B Zhang, J Bao… - Proceedings of the …, 2022 - openaccess.thecvf.com
Despite the tantalizing success in a broad of vision tasks, transformers have not yet
demonstrated on-par ability as ConvNets in high-resolution image generative modeling. In …

A survey on neural speech synthesis

X Tan, T Qin, F Soong, TY Liu - arXiv preprint arXiv:2106.15561, 2021 - arxiv.org
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …

Sinnerf: Training neural radiance fields on complex scenes from a single image

D Xu, Y Jiang, P Wang, Z Fan, H Shi… - European Conference on …, 2022 - Springer
Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense
covers largely prohibits its wider applications. While several recent works have attempted to …

Cross-modal contrastive learning for text-to-image generation

H Zhang, JY Koh, J Baldridge… - Proceedings of the …, 2021 - openaccess.thecvf.com
The output of text-to-image synthesis systems should be coherent, clear, photo-realistic
scenes with high semantic fidelity to their conditioned text descriptions. Our Cross-Modal …

Pd-gan: Probabilistic diverse gan for image inpainting

H Liu, Z Wan, W Huang, Y Song… - Proceedings of the …, 2021 - openaccess.thecvf.com
We propose PD-GAN, a probabilistic diverse GAN for image inpainting. Given an input image
with arbitrary hole regions, PD-GAN produces multiple inpainting results with diverse and …