Towards efficient generative large language model serving: A survey from algorithms to systems
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …
Deep encoder, shallow decoder: Reevaluating non-autoregressive machine translation
Much recent effort has been invested in non-autoregressive neural machine translation,
which appears to be an efficient alternative to state-of-the-art autoregressive machine …
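The core observation lends itself to a short sketch: encoder depth is paid once per sentence, while decoder depth is paid again at every generated token, so shifting layers from the decoder to the encoder speeds up autoregressive decoding. A minimal PyTorch illustration of a deep-encoder/shallow-decoder configuration follows; the layer counts and dimensions are illustrative, not the paper's exact setup.

```python
import torch
import torch.nn as nn

# Deep-encoder/shallow-decoder sketch: encoder cost is amortized over the
# sentence, decoder cost is paid at every decoding step, so moving layers
# into the encoder cuts per-token latency. Sizes are illustrative.
deep_shallow = nn.Transformer(
    d_model=512,
    nhead=8,
    num_encoder_layers=12,  # deep encoder: run once per sentence
    num_decoder_layers=1,   # shallow decoder: run once per output token
    batch_first=True,
)

src = torch.randn(2, 20, 512)  # (batch, source length, d_model)
tgt = torch.randn(2, 15, 512)  # (batch, target length, d_model)
out = deep_shallow(src, tgt)   # (2, 15, 512)
```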
BLADE: combining vocabulary pruning and intermediate pretraining for scaleable neural CLIR
Learning sparse representations using pretrained language models enhances monolingual
ranking effectiveness. Such representations are sparse vectors in the …
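To make "sparse vectors in the vocabulary space" concrete, here is a minimal SPLADE-style encoder sketch, assuming the common recipe of max-pooling log(1 + ReLU(logits)) over masked-LM token logits; the checkpoint name and pooling choice are illustrative assumptions, not BLADE's released code.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# SPLADE-style sparse encoding sketch: project each token through the
# masked-LM head, then max-pool non-negative term weights over the sequence
# to get one weight per vocabulary entry. Model choice is an example only.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

def sparse_encode(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits        # (1, seq_len, vocab_size)
    weights = torch.log1p(torch.relu(logits))  # non-negative term weights
    mask = inputs["attention_mask"].unsqueeze(-1)
    return (weights * mask).max(dim=1).values.squeeze(0)  # (vocab_size,)

vec = sparse_encode("cross-language information retrieval")
print((vec > 0).sum().item(), "active vocabulary dimensions")
```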
Hard non-monotonic attention for character-level transduction
Character-level string-to-string transduction is an important component of various NLP tasks.
The goal is to map an input string to an output string, where the strings may be of different …
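The underlying idea admits a short sketch: treat the alignment as a latent variable, so each output character is emitted from exactly one encoder position, and marginalize over all positions with a logsumexp instead of sampling. Parameter names and shapes below are illustrative, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

# Exact marginalization over a hard, non-monotonic alignment:
# log p(y) = logsumexp_j [ log p(a = j) + log p(y | h_j, s_t) ].
def hard_attention_log_prob(s_t, enc, W_att, W_out):
    # s_t: (d,) decoder state; enc: (n, d) encoder states
    att_logits = enc @ W_att @ s_t                # (n,) alignment scores
    log_align = F.log_softmax(att_logits, dim=0)  # log p(a = j)
    # per-position output distributions: (n, vocab)
    log_vocab = F.log_softmax(
        torch.cat([enc, s_t.expand_as(enc)], dim=-1) @ W_out, dim=-1)
    return torch.logsumexp(log_align.unsqueeze(1) + log_vocab, dim=0)

d, n, vocab = 64, 10, 30
log_p = hard_attention_log_prob(
    torch.randn(d), torch.randn(n, d),
    torch.randn(d, d), torch.randn(2 * d, vocab))
print(log_p.shape)  # torch.Size([30]): a full distribution over outputs
```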
Multilingual neural machine translation with deep encoder and multiple shallow decoders
Recent work in multilingual translation advances translation quality beyond bilingual
baselines by using deep Transformer models with increased capacity. However, the extra …
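A minimal sketch of this architecture family, assuming routing by a language code and illustrative layer counts: one deep shared encoder feeds a separate shallow decoder per target language.

```python
import torch
import torch.nn as nn

# Shared deep encoder, one shallow decoder per target language (or language
# group). Routing by language code and all sizes are illustrative.
enc_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(enc_layer, num_layers=12)

def shallow_decoder():
    layer = nn.TransformerDecoderLayer(d_model=512, nhead=8, batch_first=True)
    return nn.TransformerDecoder(layer, num_layers=1)

decoders = nn.ModuleDict({lang: shallow_decoder() for lang in ["de", "fr", "ja"]})

src = torch.randn(2, 20, 512)
tgt = torch.randn(2, 15, 512)
memory = encoder(src)               # encoded once, shared across languages
out = decoders["de"](tgt, memory)   # route to the target language's decoder
print(out.shape)                    # torch.Size([2, 15, 512])
```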
Game-theoretic vocabulary selection via the Shapley value and Banzhaf index
The input vocabulary and the representations learned are crucial to the performance of
neural NLP models. Using the full vocabulary results in less explainable and more memory …
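As a concrete illustration of game-theoretic attribution, here is the standard Monte Carlo permutation estimator of Shapley values: each item's value is its average marginal contribution across random orderings. The toy additive utility below is a stand-in for whatever dev-set metric the method would actually score subsets with.

```python
import random

# Monte Carlo Shapley estimation: sample permutations of the vocabulary and
# average each item's marginal contribution to the utility function v.
def shapley_monte_carlo(items, v, num_permutations=1000):
    phi = {i: 0.0 for i in items}
    for _ in range(num_permutations):
        perm = random.sample(items, len(items))
        prefix, prev = set(), v(set())
        for i in perm:
            prefix.add(i)
            cur = v(prefix)
            phi[i] += cur - prev  # marginal contribution of i
            prev = cur
    return {i: s / num_permutations for i, s in phi.items()}

vocab = ["the", "cat", "sat", "zxqv"]
weights = {"the": 3.0, "cat": 2.0, "sat": 1.5, "zxqv": 0.1}
value = lambda subset: sum(weights[w] for w in subset)  # toy utility
print(shapley_monte_carlo(vocab, value))  # recovers the weights exactly
```

For an additive utility like this one the Shapley value equals each item's own weight, which makes the estimator easy to sanity-check.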
Efficient inference for multilingual neural machine translation
A Bérard, D Lee, S Clinchant, K Jung… - arXiv preprint arXiv …, 2021 - arxiv.org
Multilingual NMT has become an attractive solution for MT deployment in production. But to
match bilingual quality, it comes at the cost of larger and slower models. In this work, we …
OpenNMT system description for WNMT 2018: 800 words/sec on a single-core CPU
J Senellart, D Zhang, B Wang, G Klein… - Proceedings of the …, 2018 - aclanthology.org
We present a system description of the OpenNMT Neural Machine Translation entry for the
WNMT 2018 evaluation. In this work, we developed a heavily optimized NMT inference …
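As a hedged illustration of one common ingredient of CPU-optimized inference, here is dynamic int8 quantization of linear layers in PyTorch; this is a generic technique, not the system described in the paper.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import quantize_dynamic

# Dynamic int8 quantization: weights are stored in int8 and matmuls run in
# integer arithmetic on CPU, while activations stay in fp32. A generic
# speed/memory optimization, shown on a toy feed-forward block.
model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
qmodel = quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
out = qmodel(torch.randn(1, 512))
print(out.shape)  # torch.Size([1, 512])
```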
The devil is in the details: on the pitfalls of vocabulary selection in neural machine translation
Vocabulary selection, or lexical shortlisting, is a well-known technique to improve latency of
Neural Machine Translation models by constraining the set of allowed output words during …
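The mechanism itself is simple to sketch: before the output softmax, mask every vocabulary entry outside a (typically source-dependent) shortlist, so the decoder only scores and can only emit allowed words. The shortlist below is arbitrary; how it is chosen is exactly where the paper's pitfalls lie.

```python
import torch

# Lexical shortlisting sketch: add -inf to the logits of all vocabulary
# entries outside the shortlist, so the softmax puts zero mass on them.
vocab_size = 32000
logits = torch.randn(1, vocab_size)            # decoder output for one step

shortlist = torch.tensor([5, 17, 203, 999])    # allowed output ids (example)
mask = torch.full((vocab_size,), float("-inf"))
mask[shortlist] = 0.0

probs = torch.softmax(logits + mask, dim=-1)   # support only on the shortlist
print(probs.nonzero().shape[0])                # 4 nonzero entries
```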
Locality-sensitive hashing for long context neural machine translation
After its introduction, the Transformer architecture quickly became the gold standard for
neural machine translation. A major advantage of the Transformer compared to …
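A minimal sketch of the hashing step behind LSH attention (in the spirit of Reformer): random-hyperplane hashing groups similar query/key vectors into buckets, and attention is then computed within buckets rather than over all n² pairs. The bucket construction and sizes below are illustrative assumptions.

```python
import torch

# Random-hyperplane LSH: the sign pattern of x against a few random
# hyperplanes gives an integer bucket id; nearby vectors tend to collide.
def lsh_buckets(x, num_hashes=4, seed=0):
    torch.manual_seed(seed)
    planes = torch.randn(x.shape[-1], num_hashes)  # random hyperplanes
    bits = (x @ planes > 0).long()                 # (n, num_hashes) sign bits
    powers = 2 ** torch.arange(num_hashes)
    return (bits * powers).sum(-1)                 # bucket id per vector

keys = torch.randn(128, 64)                        # 128 key vectors
buckets = lsh_buckets(keys)
for b in buckets.unique():
    members = (buckets == b).nonzero().squeeze(-1)
    # attention would be restricted to positions sharing a bucket
    print(f"bucket {b.item()}: {members.numel()} positions")
```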