Neural machine translation: A review
F Stahlberg - Journal of Artificial Intelligence Research, 2020 - jair.org
The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …
Fairseq S2T: Fast speech-to-text modeling with fairseq
We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such
as end-to-end speech recognition and speech-to-text translation. It follows fairseq's careful …
High fidelity speech synthesis with adversarial networks
Generative adversarial networks have seen rapid development in recent years and have led
to remarkable improvements in generative modelling of images. However, their application …
Nemo: a toolkit for building ai applications using neural modules
NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications
through re-usability, abstraction, and composition. NeMo is built around neural modules …
Jasper: An end-to-end convolutional neural acoustic model
In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech
recognition models without any external training data. Our model, Jasper, uses only 1D …
ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit
T Hayashi, R Yamamoto, K Inoue… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-
TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit …
Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
Wav2letter++: A fast open-source speech recognition system
This paper introduces wav2letter++, a fast open-source deep learning speech recognition
framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for …
ESPnet-ST: All-in-one speech translation toolkit
We present ESPnet-ST, which is designed for the quick development of speech-to-speech
translation systems in a single framework. ESPnet-ST is a new project inside end-to-end …
Learning robust and multilingual speech representations
Unsupervised speech representation learning has shown remarkable success at finding
representations that correlate with phonetic structures and improve downstream speech …