On Multi-Domain Training and Adaptation of End-to-End RNN Acoustic Models for Distant Speech...

A Narayanan, A Misra, KC Sim… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org

Current state-of-the-art automatic speech recognition systems are trained to work in
specific 'domains', defined based on factors like application, sampling rate and codec. When …

Cited by 113 · Related articles · All 5 versions

[PDF] arxiv.org

A conformer-based ASR frontend for joint acoustic echo cancellation, speech enhancement and speech separation

T O'Malley, A Narayanan, Q Wang… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

We present a frontend for improving robustness of automatic speech recognition (ASR), that
jointly implements three modules within a single model: acoustic echo cancellation, speech …

Cited by 28 · Related articles · All 3 versions

[PDF] arxiv.org

Cross-attention conformer for context modeling in speech enhancement for ASR

A Narayanan, CC Chiu, T O'Malley… - 2021 IEEE Automatic …, 2021 - ieeexplore.ieee.org

This work introduces cross-attention conformer, an attention-based architecture for context
modeling in speech enhancement. Given that the context information can often be …

Cited by 14 · Related articles · All 3 versions

[PDF] arxiv.org

Leveraging native language information for improved accented speech recognition

S Ghorbani, JHL Hansen - arXiv preprint arXiv:1904.09038, 2019 - arxiv.org

Recognition of accented speech is a long-standing challenge for automatic speech
recognition (ASR) systems, given the increasing worldwide population of bi-lingual speakers …

Cited by 30 · Related articles · All 8 versions

[PDF] arxiv.org

Speaker adaptation for end-to-end CTC models

K Li, J Li, Y Zhao, K Kumar… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org

We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech
recognition systems. One is Kullback-Leibler divergence (KLD) regularization and the other …

Cited by 27 · Related articles · All 4 versions

[PDF] arxiv.org

Updating only encoders prevents catastrophic forgetting of end-to-end ASR models

Y Takashima, S Horiguchi, S Watanabe… - arXiv preprint arXiv …, 2022 - arxiv.org

In this paper, we present an incremental domain adaptation technique to prevent
catastrophic forgetting for an end-to-end automatic speech recognition (ASR) model …

Cited by 6 · Related articles · All 8 versions

[PDF] sciencedirect.com

Multi-domain adversarial training of neural network acoustic models for distant speech recognition

S Mirsamadi, JHL Hansen - Speech Communication, 2019 - Elsevier

Building deep neural network acoustic models directly based on far-field speech from
multiple recording environments with different acoustic properties is an increasingly popular …

Cited by 25 · Related articles · All 2 versions

[PDF] academia.edu

Domain adaptation of end-to-end speech recognition in low-resource settings

L Samarakoon, B Mak, AYS Lam - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org

End-to-end automatic speech recognition (ASR) has simplified the traditional ASR system
building pipeline by eliminating the need to have multiple components and also the …

Cited by 22 · Related articles · All 4 versions

[PDF] arxiv.org

Advancing multi-accented LSTM-CTC speech recognition using a domain specific student-teacher learning paradigm

S Ghorbani, AE Bulut… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org

Non-native speech causes automatic speech recognition systems to degrade in
performance. Past strategies to address this challenge have considered model adaptation …

Cited by 21 · Related articles · All 4 versions

Conditional conformer: Improving speaker modulation for single and multi-user speech enhancement

T O'Malley, S Ding, A Narayanan… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Recently, Feature-wise Linear Modulation (FiLM) has been shown to outperform other
approaches to incorporate speaker embedding into speech separation and VoiceFilter …

Cited by 3 · Related articles