Diffusion-LM improves controllable text generation

X Li, J Thickstun, I Gulrajani… - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
Controlling the behavior of language models (LMs) without re-training is a major open
problem in natural language generation. While recent works have demonstrated successes …

Structured denoising diffusion models in discrete state-spaces

J Austin, DD Johnson, J Ho, D Tarlow… - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
Denoising diffusion probabilistic models (DDPMs) [Ho et al., 2020] have shown impressive
results on image and waveform generation in continuous state spaces. Here, we introduce …
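As background for the discrete state spaces this entry refers to, here is a minimal illustrative sketch of one forward-corruption step with a uniform transition kernel, one of the structured choices this line of work considers; the function name, scalar schedule, and signature are hypothetical placeholders, not code from the paper:

```python
# Sketch of one discrete forward-diffusion step with a uniform transition
# kernel: with probability beta, a token is resampled uniformly from the
# vocabulary; otherwise it is kept. Names here are hypothetical.
import torch

def uniform_forward_step(tokens: torch.Tensor, beta: float, vocab_size: int) -> torch.Tensor:
    """Corrupt integer token IDs `tokens` by one noising step."""
    resample = torch.rand(tokens.shape) < beta               # corruption mask
    random_tokens = torch.randint_like(tokens, vocab_size)   # uniform draws
    return torch.where(resample, random_tokens, tokens)
```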

WaveGrad: Estimating gradients for waveform generation

N Chen, Y Zhang, H Zen, RJ Weiss, M Norouzi… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper introduces WaveGrad, a conditional model for waveform generation which
estimates gradients of the data density. The model is built on prior work on score matching …
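Since the snippet describes estimating gradients of the data density via score matching, a minimal sketch of a denoising score-matching training step may help; it assumes a single fixed noise level, and `model`, the loss function name, and the call signature are hypothetical, not WaveGrad's actual objective or schedule:

```python
# Sketch of a denoising score-matching step: the network learns to predict
# the Gaussian noise added to clean audio, which is equivalent (up to
# scaling) to estimating the score, i.e. the gradient of the log density.
import torch

def denoising_score_matching_loss(model, x0: torch.Tensor, noise_level: float) -> torch.Tensor:
    """One training loss evaluation on a batch of clean waveforms `x0`."""
    eps = torch.randn_like(x0)               # Gaussian corruption
    x_noisy = x0 + noise_level * eps         # perturbed waveform
    eps_hat = model(x_noisy, noise_level)    # model's noise estimate
    return torch.mean((eps_hat - eps) ** 2)  # regress predicted onto true noise
```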

A survey on non-autoregressive generation for neural machine translation and beyond

Y Xiao, L Wu, J Guo, J Li, M Zhang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …

DiffusionBERT: Improving generative masked language models with diffusion models

Z He, T Sun, K Wang, X Huang, X Qiu - arXiv preprint arXiv:2211.15029, 2022 - arxiv.org
We present DiffusionBERT, a new generative masked language model based on discrete
diffusion models. Diffusion models and many pre-trained language models have a shared …

Redistributing low-frequency words: Making the most of monolingual data in non-autoregressive translation

L Ding, L Wang, S Shi, D Tao, Z Tu - Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022 - aclanthology.org
Knowledge distillation (KD) is the preliminary step for training non-autoregressive
translation (NAT) models, which eases the training of NAT models at the cost of losing …

Glancing transformer for non-autoregressive neural machine translation

L Qian, H Zhou, Y Bao, M Wang, L Qiu… - arXiv preprint arXiv …, 2020 - arxiv.org
Recent work on non-autoregressive neural machine translation (NAT) aims to improve
efficiency via parallel decoding without sacrificing quality. However, existing NAT …

Learning to efficiently sample from diffusion probabilistic models

D Watson, J Ho, M Norouzi, W Chan - arXiv preprint arXiv:2106.03802, 2021 - arxiv.org
Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a powerful family of
generative models that can yield high-fidelity samples and competitive log-likelihoods …

SmBoP: Semi-autoregressive bottom-up semantic parsing

O Rubin, J Berant - arXiv preprint arXiv:2010.12412, 2020 - arxiv.org
The de facto standard decoding method for semantic parsing in recent years has been to
autoregressively decode the abstract syntax tree of the target program using a top-down …

Step-unrolled denoising autoencoders for text generation

N Savinov, J Chung, M Binkowski, E Elsen… - arXiv preprint arXiv …, 2021 - arxiv.org
In this paper we propose a new generative model of text, Step-unrolled Denoising
Autoencoder (SUNDAE), that does not rely on autoregressive models. Similarly to denoising …