ASR-based speech intelligibility prediction: A review

M Karbasi, D Kolossa - Hearing Research, 2022 - Elsevier
Various types of methods and approaches are available to predict the intelligibility of speech
signals, but many of these still suffer from two major problems: first, their required prior …

An overview of FIR filter design in future multicarrier communication systems

L Jiang, H Zhang, S Cheng, H Lv, P Li - Electronics, 2020 - mdpi.com
Future wireless communication systems are facing many challenges due to their
complexity and diversification. Orthogonal frequency division multiplexing (OFDM) in 4G …

Speech emotion recognition using 3D convolutions and attention-based sliding recurrent networks with auditory front-ends

Z Peng, X Li, Z Zhu, M Unoki, J Dang, M Akagi - IEEE Access, 2020 - ieeexplore.ieee.org
Emotion information from speech can effectively help robots understand a speaker's intentions
in natural human-robot interaction. The human auditory system can easily track temporal …

Multi-resolution modulation-filtered cochleagram feature for LSTM-based dimensional emotion recognition from speech

Z Peng, J Dang, M Unoki, M Akagi - Neural Networks, 2021 - Elsevier
Continuous dimensional emotion recognition from speech helps robots or virtual agents
capture the temporal dynamics of a speaker's emotional state in natural human–robot …

The hearing-aid speech perception index (HASPI) version 2

JM Kates, KH Arehart - Speech Communication, 2021 - Elsevier
This paper presents a revised version of the Hearing-Aid Speech Perception Index (HASPI).
The index is based on a model of the auditory periphery that incorporates changes due to …

Predicting speech intelligibility with deep neural networks

C Spille, SD Ewert, B Kollmeier, BT Meyer - Computer Speech & Language, 2018 - Elsevier
An accurate objective prediction of human speech intelligibility is of interest for many
applications such as the evaluation of signal processing algorithms. To predict the speech …

Joint estimation of reverberation time and early-to-late reverberation ratio from single-channel speech signals

F Xiong, S Goetze, B Kollmeier… - IEEE/ACM Transactions …, 2018 - ieeexplore.ieee.org
The reverberation time (RT) and the early-to-late reverberation ratio (ELR) are two key
parameters commonly used to characterize acoustic room environments. In contrast to …

Deep neural network model of hearing-impaired speech-in-noise perception

S Haro, CJ Smalt, GA Ciccarelli… - Frontiers in Neuroscience, 2020 - frontiersin.org
Many individuals struggle to understand speech in listening scenarios that include
reverberation and background noise. An individual's ability to understand speech arises …

A model of speech recognition for hearing-impaired listeners based on deep learning

J Roßbach, B Kollmeier, BT Meyer - … Journal of the Acoustical Society of …, 2022 - pubs.aip.org
Automatic speech recognition (ASR) has made major progress based on deep machine
learning, which motivated the use of deep neural networks (DNNs) as perception models …

Prediction of speech intelligibility with DNN-based performance measures

AMC Martinez, C Spille, J Roßbach, B Kollmeier… - Computer Speech & …, 2022 - Elsevier
This paper presents a speech intelligibility model based on automatic speech recognition
(ASR), combining phoneme probabilities from deep neural networks (DNN) and a …