Robust multi-channel speech recognition using frequency aligned network

Q Zhu, J Zhang, Y Gu, Y Hu, L Dai - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

Self-supervised speech pre-training methods have developed rapidly in recent years, which
show to be very effective for many near-field single-channel speech tasks. However, far-field …

Cited by 1 · Related articles · All 3 versions

[PDF] arxiv.org

Self-attention channel combinator frontend for end-to-end multichannel far-field speech recognition

R Gong, C Quillen, D Sharma, A Goderre… - arXiv preprint arXiv …, 2021 - arxiv.org

When a sufficiently large far-field training data is presented, jointly optimizing a multichannel
frontend and an end-to-end (E2E) Automatic Speech Recognition (ASR) backend shows …

Cited by 15 · Related articles · All 8 versions

[PDF] arxiv.org

Channel-combination algorithms for robust distant voice activity and overlapped speech detection

T Mariotte, A Larcher, S Montrésor… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org

Voice Activity Detection (VAD) and Overlapped Speech Detection (OSD) are key pre-
processing tasks for speaker diarization. In the meeting context, it is often easier to capture …

Cited by 1 · Related articles · All 7 versions

[PDF] hal.science

Microphone array channel combination algorithms for overlapped speech detection

T Mariotte, A Larcher, S Montrésor… - … 2022 Human and …, 2022 - univ-lemans.hal.science

Overlapped speech occurs when multiple speakers are simultaneously active. This may
lead to severe performance degradation in automatic speech processing systems such as …

Cited by 5 · Related articles · All 9 versions

Far-field speech recognition based on complex-valued neural networks and inter-frame similarity difference method

Y Guo, Y Chen, G Cheng, P Zhang… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

Far-field automatic speech recognition (ASR) is a challenging task due to the background
noise and reverberation. To address this issue, we introduce a novel end-to-end multi …

Cited by 5 · Related articles

[PDF] isca-archive.org

[PDF] Multi-channel multi-speaker transformer for speech recognition

G Yifan, T Yao, S Hongbin, W Yulong - Proc. INTERSPEECH 2023, 2023 - isca-archive.org

With the development of teleconferencing and in-vehicle voice assistants, far-field multi-
speaker speech recognition has become a hot research topic. Recently, a multi-channel …

Traitement automatique de la parole en réunion par dissémination de capteurs

T Mariotte - 2024 - theses.hal.science

This thesis focuses on automatic speech processing, and more specifically on
speaker diarization. This task requires segmenting the …

ChannelAugment: Improving generalization of multi-channel ASR by training with input channel randomization

M Gaudesi, F Weninger, D Sharma… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

End-to-end (E2E) multi-channel ASR systems show state-of-the-art performance in far-field
ASR tasks by joint training of a multi-channel front-end along with the ASR model. The main …