- Academic resource search

Neural machine translation: A review

F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org

The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …

Cited by 409 - Related articles - All 7 versions

[PDF] arxiv.org

Fairseq S2T: Fast speech-to-text modeling with fairseq

C Wang, Y Tang, X Ma, A Wu, S Popuri… - arXiv preprint arXiv …, 2020 - arxiv.org

We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such
as end-to-end speech recognition and speech-to-text translation. It follows fairseq's careful …

Cited by 253 - Related articles - All 5 versions

[PDF] arxiv.org

High fidelity speech synthesis with adversarial networks

M Bińkowski, J Donahue, S Dieleman, A Clark… - arXiv preprint arXiv …, 2019 - arxiv.org

Generative adversarial networks have seen rapid development in recent years and have led
to remarkable improvements in generative modelling of images. However, their application …

Cited by 294 - Related articles - All 4 versions

[PDF] arxiv.org

NeMo: a toolkit for building AI applications using neural modules

O Kuchaiev, J Li, H Nguyen, O Hrinchuk… - arXiv preprint arXiv …, 2019 - arxiv.org

NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications
through re-usability, abstraction, and composition. NeMo is built around neural modules …

Cited by 275 - Related articles - All 3 versions

[PDF] arxiv.org

Jasper: An end-to-end convolutional neural acoustic model

J Li, V Lavrukhin, B Ginsburg, R Leary… - arXiv preprint arXiv …, 2019 - arxiv.org

In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech
recognition models without any external training data. Our model, Jasper, uses only 1D …

Cited by 288 - Related articles - All 8 versions

[PDF] arxiv.org

ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit

T Hayashi, R Yamamoto, K Inoue… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-TTS,
which is an extension of the open-source speech processing toolkit ESPnet. The toolkit …

Cited by 236 - Related articles - All 7 versions

[PDF] arxiv.org

Transformers in speech processing: A survey

S Latif, A Zaidi, H Cuayahuitl, F Shamshad… - arXiv preprint arXiv …, 2023 - arxiv.org

The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …

Cited by 57 - Related articles - All 4 versions

[PDF] arxiv.org

Wav2letter++: A fast open-source speech recognition system

V Pratap, A Hannun, Q Xu, J Cai, J Kahn… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

This paper introduces wav2letter++, a fast open-source deep learning speech recognition
framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for …

Cited by 235 - Related articles - All 8 versions

[PDF] arxiv.org

ESPnet-ST: All-in-one speech translation toolkit

H Inaguma, S Kiyono, K Duh, S Karita… - arXiv preprint arXiv …, 2020 - arxiv.org

We present ESPnet-ST, which is designed for the quick development of speech-to-speech
translation systems in a single framework. ESPnet-ST is a new project inside end-to-end …

Cited by 172 - Related articles - All 6 versions

[PDF] arxiv.org

Learning robust and multilingual speech representations

K Kawakami, L Wang, C Dyer, P Blunsom… - arXiv preprint arXiv …, 2020 - arxiv.org

Unsupervised speech representation learning has shown remarkable success at finding
representations that correlate with phonetic structures and improve downstream speech …

Cited by 104 - Related articles - All 3 versions