Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
Information-transport-based policy for simultaneous translation
S Zhang, Y Feng - arXiv preprint arXiv:2210.12357, 2022 - arxiv.org
Simultaneous translation (ST) outputs translation while receiving the source inputs, and
hence requires a policy to determine whether to translate a target token or wait for the next …
Unified segment-to-segment framework for simultaneous sequence generation
S Zhang, Y Feng - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Simultaneous sequence generation is a pivotal task for real-time scenarios, such as
streaming speech recognition, simultaneous machine translation and simultaneous speech …
End-to-End Speech-to-Text Translation: A Survey
N Sethiya, CK Maurya - arXiv preprint arXiv:2312.01053, 2023 - arxiv.org
Speech-to-text translation pertains to the task of converting speech signals in a language to
text in another language. It finds its application in various domains, such as hands-free …
Attention as a guide for simultaneous speech translation
The study of the attention mechanism has sparked interest in many fields, such as language
modeling and machine translation. Although its patterns have been exploited to perform …
Over-generation cannot be rewarded: Length-adaptive average lagging for simultaneous speech translation
Simultaneous speech translation (SimulST) systems aim at generating their output with the
lowest possible latency, which is normally computed in terms of Average Lagging (AL). In …
Learning when to translate for streaming speech
How to find proper moments to generate partial sentence translation given a streaming
speech input? Existing approaches waiting-and-translating for a fixed duration often break …
Learning adaptive segmentation policy for end-to-end simultaneous translation
End-to-end simultaneous speech-to-text translation aims to directly perform translation from
streaming source speech to target text with high translation quality and low latency. A typical …
AlignAtt: Using attention-based audio-translation alignments as a guide for simultaneous speech translation
Attention is the core mechanism of today's most used architectures for natural language
processing and has been analyzed from many perspectives, including its effectiveness for …
Recent Advances in End-to-End Simultaneous Speech Translation
Simultaneous speech translation (SimulST) is a demanding task that involves generating
translations in real-time while continuously processing speech input. This paper offers a …