Elastic spectral distortion for low resource speech recognition with deep neural networks

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE Access, 2019 - ieeexplore.ieee.org

Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Cited by 1185 · Related articles · All 9 versions

Recent progress in the CUHK dysarthric speech recognition system

S Liu, M Geng, S Hu, X Xie, M Cui, J Yu… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org

Despite the rapid progress of automatic speech recognition (ASR) technologies in the past
few decades, recognition of disordered speech remains a highly challenging task to date …

Cited by 54 · Related articles · All 8 versions

RandAugment: Practical automated data augmentation with a reduced search space

ED Cubuk, B Zoph, J Shlens… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com

Recent work on automated augmentation strategies has led to state-of-the-art results in
image classification and object detection. An obstacle to a large-scale adoption of these …

Cited by 3392 · Related articles · All 12 versions

SpecAugment: A simple data augmentation method for automatic speech recognition

DS Park, W Chan, Y Zhang, CC Chiu, B Zoph… - arXiv preprint arXiv …, 2019 - arxiv.org

We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …

Cited by 3890 · Related articles · All 8 versions

End-to-end speech recognition: A survey

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 80 · Related articles · All 6 versions

Deep learning for audio signal processing

H Purwins, B Li, T Virtanen, J Schlüter… - IEEE Journal of …, 2019 - ieeexplore.ieee.org

Given the recent surge in developments of deep learning, this paper provides a review of the
state-of-the-art deep learning techniques for audio signal processing. Speech, music, and …

Cited by 800 · Related articles · All 7 versions

Audio augmentation for speech recognition

T Ko, V Peddinti, D Povey, S Khudanpur - Interspeech, 2015 - isca-archive.org

Data augmentation is a common strategy adopted to increase the quantity of training data,
avoid overfitting and improve robustness of the models. In this paper, we investigate audio …

Cited by 1352 · Related articles · All 10 versions

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

E Vincent, S Watanabe, AA Nugraha, J Barker… - Computer Speech & …, 2017 - Elsevier

Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …

Cited by 410 · Related articles · All 16 versions

Data augmentation for deep neural network acoustic modeling

X Cui, V Goel, B Kingsbury - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org

This paper investigates data augmentation for deep neural network acoustic modeling
based on label-preserving transformations to deal with data sparsity. Two data …

Cited by 529 · Related articles · All 11 versions

Data augmenting contrastive learning of speech representations in the time domain

E Kharitonov, M Rivière, G Synnaeve… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org

Contrastive Predictive Coding (CPC), based on predicting future segments of speech from
past segments, is emerging as a powerful algorithm for representation learning of speech …

Cited by 119 · Related articles · All 7 versions