Diffusion-LM improves controllable text generation
Controlling the behavior of language models (LMs) without re-training is a major open
problem in natural language generation. While recent works have demonstrated successes …
Structured denoising diffusion models in discrete state-spaces
Denoising diffusion probabilistic models (DDPMs) [Ho et al. 2020] have shown impressive
results on image and waveform generation in continuous state spaces. Here, we introduce …
WaveGrad: Estimating gradients for waveform generation
This paper introduces WaveGrad, a conditional model for waveform generation which
estimates gradients of the data density. The model is built on prior work on score matching …
A survey on non-autoregressive generation for neural machine translation and beyond
Non-autoregressive (NAR) generation, which was first proposed in neural machine translation
(NMT) to speed up inference, has attracted much attention in both machine learning and …
DiffusionBERT: Improving generative masked language models with diffusion models
We present DiffusionBERT, a new generative masked language model based on discrete
diffusion models. Diffusion models and many pre-trained language models have a shared …
Redistributing low-frequency words: Making the most of monolingual data in non-autoregressive translation
Knowledge distillation (KD) is the preliminary step for training non-autoregressive
translation (NAT) models, which eases the training of NAT models at the cost of losing …
Glancing transformer for non-autoregressive neural machine translation
Recent work on non-autoregressive neural machine translation (NAT) aims to improve
efficiency through parallel decoding without sacrificing quality. However, existing NAT …
Learning to efficiently sample from diffusion probabilistic models
Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a powerful family of
generative models that can yield high-fidelity samples and competitive log-likelihoods …
SmBoP: Semi-autoregressive bottom-up semantic parsing
The de facto standard decoding method for semantic parsing in recent years has been to
autoregressively decode the abstract syntax tree of the target program using a top-down …
Step-unrolled denoising autoencoders for text generation
In this paper we propose a new generative model of text, Step-unrolled Denoising
Autoencoder (SUNDAE), that does not rely on autoregressive models. Similarly to denoising …