SegAugment: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

I Tsiamas, JAR Fonollosa, MR Costa-jussà - arXiv preprint arXiv …, 2022 - arxiv.org
End-to-end Speech Translation is hindered by a lack of available data resources. While
most of them are based on documents, a sentence-level version is available, which is …

Revamping the SLTev Tool for Evaluation of Spoken Language Translation

M Elizabeth, O Bojar - The Prague Bulletin of Mathematical …, 2024 - ufal.mff.cuni.cz
This article describes recent improvements of SLTev, a tool for automatic evaluation of
machine translation, speech recognition and speech translation systems. The changes …

Manipulating Data Representations for Neural Machine Translation

C Amrhein - 2023 - zora.uzh.ch
In natural language processing, much current research focuses on training larger and larger
models on more and more data. In this thesis, we argue that how data is represented can …

How “Real” is Your Real-Time Simultaneous Speech-to-Text Translation System?

S Papi, P Polák, D Macháček, O Bojar - arxiv.org
Simultaneous speech-to-text translation (SimulST) translates source-language speech into
target-language text concurrently with the speaker's speech, ensuring low latency for better …