On Speaker Attribution with SURT
D Raj, M Wiesner, M Maciejewski… - arXiv preprint arXiv …, 2024 - arxiv.org
The Streaming Unmixing and Recognition Transducer (SURT) has recently become a
popular framework for continuous, streaming, multi-talker speech recognition (ASR). With …
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization
The evolving speech processing landscape is increasingly focused on complex scenarios
like meetings or cocktail parties with multiple simultaneous speakers and far-field conditions …
SA-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR
Joint modeling of multi-speaker ASR and speaker diarization has recently shown promising
results in speaker-attributed automatic speech recognition (SA-ASR). Although being able to …
Advancing Multi-talker ASR Performance with Large Language Models
Recognizing overlapping speech from multiple speakers in conversational scenarios is one
of the most challenging problems for automatic speech recognition (ASR). Serialized output …
AG-LSEC: Audio Grounded Lexical Speaker Error Correction
Speaker Diarization (SD) systems are typically audio-based and operate independently of
the ASR system in traditional speech transcription pipelines and can have speaker errors …