Environmentally robust ASR front-end for deep neural network acoustic models

T Yoshioka, MJF Gales - Computer Speech & Language, 2015 - Elsevier
This paper examines the individual and combined impacts of various front-end approaches
on the performance of deep neural network (DNN) based speech recognition systems in …

Investigation of unsupervised adaptation of DNN acoustic models with filter bank input

T Yoshioka, A Ragni, MJF Gales - 2014 IEEE International …, 2014 - ieeexplore.ieee.org
Adaptation to speaker variations is an essential component of speech recognition systems.
One common approach to adapting deep neural network (DNN) acoustic models is to …

Study of statistical robust closed set speaker identification with feature and score-based fusion

MTS Al-Kaltakchi, WL Woo, SS Dlay… - 2016 IEEE statistical …, 2016 - ieeexplore.ieee.org
In this paper, the statistical combination of Power Normalization Cepstral Coefficient (PNCC)
and Mel Frequency Cepstral Coefficient (MFCC) features in robust closed set speaker …

Spectro-temporal power spectrum features for noise robust ASR

H Riazati Seresht, SM Ahadi, S Seyedin - Circuits, Systems, and Signal …, 2017 - Springer
In this paper, we present a new technique to extract a noise robust representation of speech
signals called spectro-temporal power spectrum. This technique is based on applying a …

A compact CNN-based speech enhancement with adaptive filter design using Gabor function and region-aware convolution

S Abdullah, M Zamani, A Demosthenous - IEEE Access, 2022 - ieeexplore.ieee.org
Speech enhancement (SE) is used in many applications, such as hearing devices, to
improve speech intelligibility and quality. Convolutional neural network-based (CNN-based) …

Kernel power flow orientation coefficients for noise-robust speech recognition

B Gerazov, Z Ivanovski - IEEE/ACM Transactions on Audio …, 2014 - ieeexplore.ieee.org
Noise-robustness has become a crucial parameter in Automatic Speech Recognition (ASR)
systems today with their increased use in noise-filled real-world environments. One way to …

On the importance of modeling and robustness for deep neural network feature

SY Chang, S Wegmann - 2015 IEEE International Conference …, 2015 - ieeexplore.ieee.org
A large body of research has shown that acoustic features for speech recognition can be
learned from data using neural networks with multiple hidden layers (DNNs) and that these …

Comparing time-frequency representations for directional derivative features.

J Gibson, M Van Segbroeck, SS Narayanan - INTERSPEECH, 2014 - ict.usc.edu
We compare the performance of Directional Derivatives features for automatic speech
recognition when extracted from different time-frequency representations. Specifically, we …

Robust Features in Deep Neural Networks for Transcoded Speech Recognition DSR and AMR-NB

L Bouchakour, M Debyeche… - 2024 8th International …, 2024 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR) performance in mobile communications degrades
significantly if the environment includes many sources of variability, such as when the test …

Time-Frequency Coherence for Periodic-Aperiodic Decomposition of Speech Signals.

K Vijayan, JK Dhiman, CS Seelamantula - INTERSPEECH, 2017 - researchgate.net
Decomposing speech signals into periodic and aperiodic components is an important task,
finding applications in speech synthesis, coding, denoising, etc. In this paper, we construct a …