The NiuTrans System for the WNGT 2020 Efficiency Task
This paper describes the submissions of the NiuTrans Team to the WNGT 2020 Efficiency Shared Task. We focus on the efficient implementation of deep Transformer models \cite{wang-etal-2019-learning, li-etal-2019-niutrans} using NiuTensor (https://github.com/NiuTrans/NiuTensor), a flexible toolkit for NLP tasks. We explore the combination of a deep encoder and a shallow decoder in Transformer models via model compression and knowledge distillation. Neural machine translation decoding further benefits from FP16 inference, attention caching, dynamic batching, and batch pruning. Our systems achieve promising results in both translation quality and efficiency; for example, our fastest system translates more than 40,000 tokens per second on an RTX 2080 Ti while maintaining 42.9 BLEU on \textit{newstest2018}. The code, models, and docker images are available at NiuTrans.NMT (https://github.com/NiuTrans/NiuTrans.NMT).
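The decoder-side optimizations listed above are not spelled out in this excerpt. As a minimal, hypothetical sketch of two of them, attention caching and batch pruning during greedy decoding, the snippet below keeps a per-step cache and drops finished sentences from the active batch so that later steps operate on fewer rows. All names here (decode_step, EOS_ID, the placeholder cache) are illustrative assumptions, not NiuTensor's API.

```python
import numpy as np

EOS_ID = 2  # hypothetical end-of-sentence token id


def decode_step(tokens, cache):
    # Stand-in for one Transformer decoder step over the *active* rows only.
    # In a real system this would attend over cached keys/values so that each
    # step recomputes attention only for the newest token.
    vocab_size = 8
    rng = np.random.default_rng(seed=len(cache))
    return rng.random((tokens.shape[0], vocab_size))


def greedy_decode_with_pruning(batch_size, max_len=16):
    active = np.arange(batch_size)               # original indices of unfinished sentences
    tokens = np.zeros((batch_size, 1), dtype=np.int64)  # BOS-like start token
    outputs = [[] for _ in range(batch_size)]
    cache = []                                   # placeholder for per-step cached keys/values

    for _ in range(max_len):
        logits = decode_step(tokens, cache)      # only unfinished sentences are decoded
        next_tok = logits.argmax(axis=-1)
        cache.append(next_tok)                   # placeholder entry; real caches hold K/V tensors

        for idx, tok in zip(active, next_tok):
            outputs[idx].append(int(tok))

        unfinished = next_tok != EOS_ID
        if not unfinished.any():
            break
        active = active[unfinished]              # batch pruning: drop finished sentences...
        tokens = next_tok[unfinished].reshape(-1, 1)
        cache = [c[unfinished] for c in cache]   # ...and prune their cached states too
    return outputs


if __name__ == "__main__":
    print(greedy_decode_with_pruning(batch_size=4))
```

In a real decoder the cache would hold the attention keys and values of every generated token, and the same pruning mask must be applied to those tensors so the batch dimension stays consistent across steps.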