Role of linear, mel and inverse-mel filterbanks in automatic recognition of speech from high-pitc...

A formant modification method for improved ASR of children's speech

HK Kathania, SR Kadiri, P Alku, M Kurimo - Speech Communication, 2022 - Elsevier

Differences in acoustic characteristics between children's and adults' speech degrade
performance of automatic speech recognition systems when systems trained using adults' …

Cited by 25

Spectro-temporal representation of speech for intelligibility assessment of dysarthria

HM Chandrashekar, V Karjigi… - IEEE Journal of Selected …, 2019 - ieeexplore.ieee.org

Recently, spectro-temporal representation of speech has been used in many fields of
speech processing. Owing to this, we explore the use of spectro-temporal representation for …

Cited by 51

Effective preservation of higher-frequency contents in the context of short utterance based children's speaker verification system

S Aziz, S Shahnawazuddin - Applied Acoustics, 2023 - Elsevier

Developing an automatic speaker verification (ASV) system for children is extremely
challenging due to the unavailability of children's speech corpora. The challenges are …

Cited by 8

Multimodal urban sound tagging with spatiotemporal context

J Bai, J Chen, M Wang - IEEE Transactions on Cognitive and …, 2022 - ieeexplore.ieee.org

Noise pollution significantly affects our daily life and urban development. Urban sound
tagging (UST) has attracted much attention recently, which aims to analyze and monitor …

Cited by 13

Using data augmentation and time-scale modification to improve ASR of children's speech in noisy environments

HK Kathania, SR Kadiri, P Alku, M Kurimo - Applied Sciences, 2021 - mdpi.com

Current ASR systems show poor performance in recognition of children's speech in noisy
environments because recognizers are typically trained with clean adults' speech and …

Cited by 10

Classification of phonation modes in classical singing using modulation power spectral features

M Brandner, PA Bereuter, SR Kadiri… - IEEE Access, 2023 - ieeexplore.ieee.org

In singing, the perceptual term “voice quality” is used to describe expressed emotions and
singing styles. In voice physiology research, specific voice qualities are discussed using the …

Cited by 5

SDFIE-NET–A self-learning dual-feature fusion information capture expression method for birdsong recognition

Q Zhang, S Hu, L Tang, R Deng, C Yang, G Zhou… - Applied Acoustics, 2024 - Elsevier

Bird recognition is important for the monitoring of bird populations and the protection of
ecosystems. Identifying birds through image forms can be difficult due to the complexity of …

Cited by 2

Experimental studies for improving the performance of children's speaker verification system using short utterances

S Aziz, S Shahnawazuddin - Applied Acoustics, 2024 - Elsevier

The task of developing an automatic speaker verification (ASV) system for children's speech
is a formidable one due to a number of reasons. The dearth of domain-specific data is one …

Cited by 2

Improving the performance of ASR system by building acoustic models using spectro-temporal and phase-based features

A Dutta, G Ashishkumar, CVR Rao - Circuits, Systems, and Signal …, 2022 - Springer

State-of-the-art spectral or temporal features of speech do not provide adequate attributes
for automatic speech recognition (ASR) system in noisy environments. Recently, phase …

Cited by 4

Role of Data Augmentation and Effective Conservation of High-Frequency Contents in the Context of Children's Speaker Verification System

S Aziz, S Shahnawazuddin - Circuits, Systems, and Signal Processing, 2024 - Springer

Developing an automatic speaker verification (ASV) system for children's speech presents
significant challenges. One major obstacle is the scarcity of domain-specific data. This issue …

Cited by 1