Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org
Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Zero-1-to-3: Zero-shot one image to 3d object

R Liu, R Wu, B Van Hoorick… - Proceedings of the …, 2023 - openaccess.thecvf.com
Abstract We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an
object given just a single RGB image. To perform novel view synthesis in this …

Imagen video: High definition video generation with diffusion models

J Ho, W Chan, C Saharia, J Whang, R Gao… - arXiv preprint arXiv …, 2022 - arxiv.org
We present Imagen Video, a text-conditional video generation system based on a cascade
of video diffusion models. Given a text prompt, Imagen Video generates high definition …

Lion: Latent point diffusion models for 3d shape generation

A Vahdat, F Williams, Z Gojcic… - Advances in …, 2022 - proceedings.neurips.cc
Denoising diffusion models (DDMs) have shown promising results in 3D point cloud
synthesis. To advance 3D DDMs and make them useful for digital artists, we require (i) high …

Consistency models

Y Song, P Dhariwal, M Chen, I Sutskever - arXiv preprint arXiv:2303.01469, 2023 - arxiv.org
Diffusion models have significantly advanced the fields of image, audio, and video
generation, but they depend on an iterative sampling process that causes slow generation …

Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 steps

C Lu, Y Zhou, F Bao, J Chen, C Li… - Advances in Neural …, 2022 - proceedings.neurips.cc
Diffusion probabilistic models (DPMs) are emerging powerful generative models. Despite
their high-quality generation performance, DPMs still suffer from their slow sampling as they …

Audiolm: a language modeling approach to audio generation

Z Borsos, R Marinier, D Vincent… - … ACM transactions on …, 2023 - ieeexplore.ieee.org
We introduce AudioLM, a framework for high-quality audio generation with long-term
consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts …

Diffusion-lm improves controllable text generation

X Li, J Thickstun, I Gulrajani… - Advances in Neural …, 2022 - proceedings.neurips.cc
Controlling the behavior of language models (LMs) without re-training is a major open
problem in natural language generation. While recent works have demonstrated successes …

Compositional visual generation with composable diffusion models

N Liu, S Li, Y Du, A Torralba, JB Tenenbaum - European Conference on …, 2022 - Springer
Large text-guided diffusion models, such as DALLE-2, are able to generate stunning
photorealistic images given natural language descriptions. While such models are highly …