Transformer: A general framework from machine translation to others

Y Zhao, J Zhang, C Zong - Machine Intelligence Research, 2023 - Springer
Machine translation is an important and challenging task that aims at automatically
translating natural language sentences from one language into another. Recently …

The Multilingual TEDx corpus for speech recognition and translation

E Salesky, M Wiesner, J Bremerman, R Cattoni… - arXiv preprint arXiv …, 2021 - arxiv.org
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and
speech translation (ST) research across many non-English source languages. The corpus is …

Improving speech translation by understanding and learning from the auxiliary text translation task

Y Tang, J Pino, X Li, C Wang, D Genzel - arXiv preprint arXiv:2107.05782, 2021 - arxiv.org
Pretraining and multitask learning are widely used to improve the speech to text translation
performance. In this study, we are interested in training a speech to text translation model …

A general multi-task learning framework to leverage text data for speech to text tasks

Y Tang, J Pino, C Wang, X Ma… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Attention-based sequence-to-sequence modeling provides a powerful and elegant solution
for applications that need to map one sequence to a different sequence. Its success heavily …

Listen, understand and translate: Triple supervision decouples end-to-end speech-to-text translation

Q Dong, R Ye, M Wang, H Zhou, S Xu, B Xu… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
An end-to-end speech-to-text translation (ST) takes audio in a source language and outputs
the text in a target language. Existing methods are limited by the amount of parallel corpus …

CTC-based compression for direct speech translation

M Gaido, M Cettolo, M Negri, M Turchi - arXiv preprint arXiv:2102.01578, 2021 - arxiv.org
Previous studies demonstrated that a dynamic phone-informed compression of the input
audio is beneficial for speech translation (ST). However, they required a dedicated model for …

Consecutive decoding for speech-to-text translation

Q Dong, M Wang, H Zhou, S Xu, B Xu… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Speech-to-text translation (ST), which directly translates the source language speech to the
target language text, has attracted intensive attention recently. However, the combination of …

RealTranS: End-to-end simultaneous speech translation with convolutional weighted-shrinking transformer

X Zeng, L Li, Q Liu - arXiv preprint arXiv:2106.04833, 2021 - arxiv.org
End-to-end simultaneous speech translation (SST), which directly translates speech in one
language into text in another language in real-time, is useful in many scenarios but has not …

M-Adapter: Modality adaptation for end-to-end speech-to-text translation

J Zhao, H Yang, E Shareghi, G Haffari - arXiv preprint arXiv:2207.00952, 2022 - arxiv.org
End-to-end speech-to-text translation models are often initialized with pre-trained speech
encoder and pre-trained text decoder. This leads to a significant training gap between pre …

Orthros: Non-autoregressive end-to-end speech translation with dual-decoder

H Inaguma, Y Higuchi, K Duh… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Fast inference speed is an important goal towards real-world deployment of speech
translation (ST) systems. End-to-end (E2E) models based on the encoder-decoder …