Neural machine translation: A review

F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org
The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …

Deep learning in electron microscopy

JM Ede - Machine Learning: Science and Technology, 2021 - iopscience.iop.org
Deep learning is transforming most areas of science and technology, including electron
microscopy. This review paper offers a practical perspective aimed at developers with …

Masked language model scoring

J Salazar, D Liang, TQ Nguyen, K Kirchhoff - arXiv preprint arXiv …, 2019 - arxiv.org
Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead,
we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are …
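
The PLL idea described in this snippet can be illustrated in a few lines: mask each token in turn and sum the MLM's log-probability of the held-out token given the rest of the sentence. A minimal sketch, assuming the Hugging Face transformers library and BERT (library and checkpoint are this sketch's choices, not necessarily the paper's setup):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    # Skip the [CLS] (first) and [SEP] (last) special tokens.
    for i in range(1, len(ids) - 1):
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

print(pseudo_log_likelihood("The cat sat on the mat."))
```

Higher (less negative) PLL means the MLM finds the sentence more plausible, which is what lets these scores be used "out of the box" for ranking hypotheses without fine-tuning.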

Revisiting low-resource neural machine translation: A case study

R Sennrich, B Zhang - arXiv preprint arXiv:1905.11901, 2019 - arxiv.org
It has been shown that the performance of neural machine translation (NMT) drops starkly in
low-resource conditions, underperforming phrase-based statistical machine translation …

Root mean square layer normalization

B Zhang, R Sennrich - Advances in Neural Information …, 2019 - proceedings.neurips.cc
Layer normalization (LayerNorm) has been successfully applied to various deep neural
networks to help stabilize training and boost model convergence because of its capability in …
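
RMSNorm simplifies LayerNorm by dropping the mean-centering and bias: activations are rescaled by their root mean square alone, with a learned per-feature gain. A minimal PyTorch sketch of that formula (the module structure is this sketch's, not the authors' code):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root mean square layer normalization: rescale by RMS, no mean-centering."""
    def __init__(self, dim: int, eps: float = 1e-8):
        super().__init__()
        self.eps = eps
        self.gain = nn.Parameter(torch.ones(dim))  # learned per-feature gain g

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # RMS(x) = sqrt(mean(x_i^2) + eps); unlike LayerNorm, the mean is
        # not subtracted, which saves the centering statistics entirely.
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).sqrt()
        return self.gain * x / rms

x = torch.randn(2, 5, 16)
print(RMSNorm(16)(x).shape)  # torch.Size([2, 5, 16])
```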

Transformers without tears: Improving the normalization of self-attention

TQ Nguyen, J Salazar - arXiv preprint arXiv:1910.05895, 2019 - arxiv.org
We evaluate three simple, normalization-centric changes to improve Transformer training.
First, we show that pre-norm residual connections (PreNorm) and smaller initializations …
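
The pre-norm change this abstract refers to is a reordering of normalization and residual addition inside each Transformer sublayer. A sketch of the two orderings, with a feed-forward block standing in for any sublayer (the helper names here are illustrative):

```python
import torch
import torch.nn as nn

def post_norm_step(x, sublayer, norm):
    # Original Transformer ordering: normalize after the residual addition.
    return norm(x + sublayer(x))

def pre_norm_step(x, sublayer, norm):
    # PreNorm ordering: normalize the sublayer input; the residual path
    # stays an identity, which is associated with more stable training.
    return x + sublayer(norm(x))

d = 16
norm = nn.LayerNorm(d)
ffn = nn.Sequential(nn.Linear(d, 4 * d), nn.ReLU(), nn.Linear(4 * d, d))
x = torch.randn(2, 5, d)
print(pre_norm_step(x, ffn, norm).shape)  # torch.Size([2, 5, 16])
```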

Understanding and improving lexical choice in non-autoregressive translation

L Ding, L Wang, X Liu, DF Wong, D Tao… - arXiv preprint arXiv …, 2020 - arxiv.org
Knowledge distillation (KD) is essential for training non-autoregressive translation (NAT)
models by reducing the complexity of the raw data with an autoregressive teacher model. In …
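
The KD setup mentioned here is commonly realized as sequence-level distillation: an autoregressive teacher re-translates the training sources, and the non-autoregressive student trains on the teacher's (simpler, more deterministic) outputs instead of the raw references. A hedged sketch of that data-preparation step; the teacher checkpoint and function names are illustrative, not from the paper:

```python
from transformers import MarianMTModel, MarianTokenizer

teacher_name = "Helsinki-NLP/opus-mt-de-en"  # example autoregressive teacher
tokenizer = MarianTokenizer.from_pretrained(teacher_name)
teacher = MarianMTModel.from_pretrained(teacher_name)

def distill_targets(sources: list[str]) -> list[str]:
    """Replace raw reference translations with the teacher's beam-search outputs."""
    batch = tokenizer(sources, return_tensors="pt", padding=True)
    out = teacher.generate(**batch, num_beams=5)
    return tokenizer.batch_decode(out, skip_special_tokens=True)

distilled = distill_targets(["Das Haus ist klein.", "Ich trinke Kaffee."])
# The NAT student is then trained on (source, distilled) pairs.
```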

Domain adaptation and multi-domain adaptation for neural machine translation: A survey

D Saunders - Journal of Artificial Intelligence Research, 2022 - jair.org
The development of deep learning techniques has allowed Neural Machine Translation
(NMT) models to become extremely powerful, given sufficient training data and training time …

Self-attentional acoustic models

M Sperber, J Niehues, G Neubig, S Stüker… - arXiv preprint arXiv …, 2018 - arxiv.org
Self-attention is a method of encoding sequences of vectors by relating these vectors to
each other based on pairwise similarities. These models have recently shown promising …
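
The "pairwise similarities" mechanism in this snippet is scaled dot-product attention. A minimal sketch that omits the learned query/key/value projections of a full Transformer layer for brevity:

```python
import torch

def self_attention(x: torch.Tensor) -> torch.Tensor:
    """Encode a sequence by mixing each vector with all others,
    weighted by scaled pairwise dot-product similarity."""
    d = x.size(-1)
    scores = x @ x.transpose(-2, -1) / d ** 0.5  # pairwise similarities
    weights = torch.softmax(scores, dim=-1)      # each row sums to 1
    return weights @ x                           # similarity-weighted mixture

frames = torch.randn(1, 100, 40)  # e.g., 100 acoustic frames, 40-dim features
print(self_attention(frames).shape)  # torch.Size([1, 100, 40])
```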

Variational graph normalized autoencoders

SJ Ahn, MH Kim - Proceedings of the 30th ACM international conference …, 2021 - dl.acm.org
Link prediction is one of the key problems for graph-structured data. With the advancement
of graph neural networks, graph autoencoders (GAEs) and variational graph autoencoders …
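
The GAE family scores links with an inner-product decoder on node embeddings produced by a graph encoder; the variant in this paper changes how those embeddings are normalized, but the decoding step is the same. A minimal sketch of that decoder (the encoder producing Z is assumed, e.g., a GCN):

```python
import torch

def gae_link_scores(Z: torch.Tensor) -> torch.Tensor:
    """Inner-product decoder of a (variational) graph autoencoder:
    P(edge i-j) = sigmoid(z_i . z_j) on learned node embeddings Z."""
    return torch.sigmoid(Z @ Z.T)

Z = torch.randn(5, 8)  # embeddings for 5 nodes, e.g., from a GCN encoder
scores = gae_link_scores(Z)
print(scores[0, 1].item())  # predicted probability of an edge between nodes 0 and 1
```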