WaveGrad: Estimating gradients for waveform generation

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org

Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Cited by 998 · Related articles · All 6 versions

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 118 · Related articles · All 6 versions

Zero-1-to-3: Zero-shot one image to 3D object

R Liu, R Wu, B Van Hoorick… - Proceedings of the …, 2023 - openaccess.thecvf.com

We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an
object given just a single RGB image. To perform novel view synthesis in this …

Cited by 519 · Related articles · All 6 versions

Imagen Video: High definition video generation with diffusion models

J Ho, W Chan, C Saharia, J Whang, R Gao… - arXiv preprint arXiv …, 2022 - arxiv.org

We present Imagen Video, a text-conditional video generation system based on a cascade
of video diffusion models. Given a text prompt, Imagen Video generates high definition …

Cited by 989 · Related articles · All 4 versions

LION: Latent point diffusion models for 3D shape generation

A Vahdat, F Williams, Z Gojcic… - Advances in …, 2022 - proceedings.neurips.cc

Denoising diffusion models (DDMs) have shown promising results in 3D point cloud
synthesis. To advance 3D DDMs and make them useful for digital artists, we require (i) high …

Cited by 328 · Related articles · All 5 versions

Consistency models

Y Song, P Dhariwal, M Chen, I Sutskever - arXiv preprint arXiv:2303.01469, 2023 - arxiv.org

Diffusion models have significantly advanced the fields of image, audio, and video
generation, but they depend on an iterative sampling process that causes slow generation …

Cited by 471 · Related articles · All 9 versions

DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps

C Lu, Y Zhou, F Bao, J Chen, C Li… - Advances in Neural …, 2022 - proceedings.neurips.cc

Diffusion probabilistic models (DPMs) are emerging powerful generative models. Despite
their high-quality generation performance, DPMs still suffer from their slow sampling as they …

Cited by 805 · Related articles · All 6 versions

AudioLM: A language modeling approach to audio generation

Z Borsos, R Marinier, D Vincent… - … ACM transactions on …, 2023 - ieeexplore.ieee.org

We introduce AudioLM, a framework for high-quality audio generation with long-term
consistency. AudioLM maps the input audio to a sequence of discrete tokens and casts …

Cited by 397 · Related articles · All 5 versions

Diffusion-LM improves controllable text generation

X Li, J Thickstun, I Gulrajani… - Advances in Neural …, 2022 - proceedings.neurips.cc

Controlling the behavior of language models (LMs) without re-training is a major open
problem in natural language generation. While recent works have demonstrated successes …

Cited by 533 · Related articles · All 7 versions

Compositional visual generation with composable diffusion models

N Liu, S Li, Y Du, A Torralba, JB Tenenbaum - European Conference on …, 2022 - Springer

Large text-guided diffusion models, such as DALLE-2, are able to generate stunning
photorealistic images given natural language descriptions. While such models are highly …

Cited by 330 · Related articles · All 7 versions