Deep generative modelling: A comparative review of vaes, gans, normalizing flows, energy-based and autoregressive models
Deep generative models are a class of techniques that train deep neural networks to model
the distribution of training samples. Research has fragmented into various interconnected …
A review on generative adversarial networks: Algorithms, theory, and applications
Generative adversarial networks (GANs) have recently become a hot research topic;
however, they have been studied since 2014, and a large number of algorithms have been …
Stylegan-t: Unlocking the power of gans for fast large-scale text-to-image synthesis
Text-to-image synthesis has recently seen significant progress thanks to large pretrained
language models, large-scale training data, and the introduction of scalable model families …
High-fidelity audio compression with improved rvqgan
Language models have been successfully used to model natural signals, such as
images, speech, and music. A key component of these models is a high quality neural …
Adversarial diffusion distillation
A Sauer, D Lorenz, A Blattmann… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that
efficiently samples large-scale foundational image diffusion models in just 1-4 steps while …
Styleswin: Transformer-based gan for high-resolution image generation
Despite the tantalizing success in a broad of vision tasks, transformers have not yet
demonstrated on-par ability as ConvNets in high-resolution image generative modeling. In …
A survey on neural speech synthesis
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
Sinnerf: Training neural radiance fields on complex scenes from a single image
Despite the rapid development of Neural Radiance Field (NeRF), the necessity of dense
covers largely prohibits its wider applications. While several recent works have attempted to …
Cross-modal contrastive learning for text-to-image generation
The output of text-to-image synthesis systems should be coherent, clear, photo-realistic
scenes with high semantic fidelity to their conditioned text descriptions. Our Cross-Modal …
Pd-gan: Probabilistic diverse gan for image inpainting
We propose PD-GAN, a probabilistic diverse GAN for image inpainting. Given an input image
with arbitrary hole regions, PD-GAN produces multiple inpainting results with diverse and …