Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition

J Deng, FR Gutierrez, S Hu, M Geng, X Xie, Z Ye… - Interspeech, 2021 - se.cuhk.edu.hk
Automatic recognition of elderly and disordered speech remains a highly challenging task to
date. Such data is not only difficult to collect in large quantities, but also exhibits a significant …

Bayesian learning of LF-MMI trained time delay neural networks for speech recognition

S Hu, X Xie, S Liu, J Yu, Z Ye, M Geng… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org
Discriminative training techniques define state-of-the-art performance for automatic speech
recognition systems. However, they are inherently prone to overfitting, leading to poor …

Domain adaptation of lattice-free MMI based TDNN models for speech recognition

Y Long, Y Li, H Ye, H Mao - International Journal of Speech Technology, 2017 - Springer
The recent proposed time-delay deep neural network (TDNN) acoustic models trained with
lattice-free maximum mutual information (LF-MMI) criterion have been shown to give …

Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition

S Hu, X Xie, M Geng, Z Jin, J Deng, G Li… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Self-supervised learning (SSL) based speech foundation models have been applied to a
wide range of ASR tasks. However, their application to dysarthric and elderly speech via …

Development of the CUHK elderly speech recognition system for neurocognitive disorder detection using the DementiaBank corpus

Z Ye, S Hu, J Li, X Xie, M Geng, J Yu… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Early diagnosis of Neurocognitive Disorder (NCD) is crucial in facilitating preventive care
and timely treatment to delay further progression. This paper presents the development of a …

Exploring self-supervised pre-trained ASR models for dysarthric and elderly speech recognition

S Hu, X Xie, Z Jin, M Geng, Y Wang… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Automatic recognition of disordered and elderly speech remains a highly challenging task to
date due to the difficulty in collecting such data in large quantities. This paper explores a …

Neural architecture search for LF-MMI trained time delay neural networks

S Hu, X Xie, M Cui, J Deng, S Liu, J Yu… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
State-of-the-art automatic speech recognition (ASR) system development is data and
computation intensive. The optimal design of deep neural networks (DNNs) for these …

Speaker adaptation using spectro-temporal deep features for dysarthric and elderly speech recognition

M Geng, X Xie, Z Ye, T Wang, G Li, S Hu… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org
Despite the rapid progress of automatic speech recognition (ASR) technologies targeting
normal speech in recent decades, accurate recognition of dysarthric and elderly speech …

Bayesian transformer language models for speech recognition

B Xue, J Yu, J Xu, S Liu, S Hu, Z Ye… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
State-of-the-art neural language models (LMs) represented by Transformers are highly
complex. Their use of fixed, deterministic parameter estimates fail to account for model …

Fast DNN Acoustic Model Speaker Adaptation by Learning Hidden Unit Contribution Features

X Xie, X Liu, T Lee, L Wang - INTERSPEECH, 2019 - isca-archive.org
Speaker adaptation techniques play a key role in reducing the mismatch between automatic
speech recognition (ASR) systems and target users. Deep neural network (DNN) acoustic …