Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation
Self-supervised speech pre-training methods have developed rapidly in recent years, which
show to be very effective for many near-field single-channel speech tasks. However, far-field …
Self-attention channel combinator frontend for end-to-end multichannel far-field speech recognition
When a sufficiently large far-field training data is presented, jointly optimizing a multichannel
frontend and an end-to-end (E2E) Automatic Speech Recognition (ASR) backend shows …
Channel-combination algorithms for robust distant voice activity and overlapped speech detection
Voice Activity Detection (VAD) and Overlapped Speech Detection (OSD) are key pre-
processing tasks for speaker diarization. In the meeting context, it is often easier to capture …
Microphone array channel combination algorithms for overlapped speech detection
Overlapped speech occurs when multiple speakers are simultaneously active. This may
lead to severe performance degradation in automatic speech processing systems such as …
Far-field speech recognition based on complex-valued neural networks and inter-frame similarity difference method
Far-field automatic speech recognition (ASR) is a challenging task due to the background
noise and reverberation. To address this issue, we introduce a novel end-to-end multi …
Multi-channel multi-speaker transformer for speech recognition
G Yifan, T Yao, S Hongbin, W Yulong - Proc. INTERSPEECH 2023, 2023 - isca-archive.org
With the development of teleconferencing and in-vehicle voice assistants, far-field multi-
speaker speech recognition has become a hot research topic. Recently, a multi-channel …
Traitement automatique de la parole en réunion par dissémination de capteurs
T Mariotte - 2024 - theses.hal.science
This thesis work focuses on automatic speech processing, and more
specifically on speaker diarization. This task requires segmenting the …
ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization
M Gaudesi, F Weninger, D Sharma… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org
End-to-end (E2E) multi-channel ASR systems show state-of-the-art performance in far-field
ASR tasks by joint training of a multi-channel front-end along with the ASR model. The main …