On the limit of English conversational speech recognition

R Prabhavalkar, T Hori, TN Sainath… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …

Cited by 112 · Related articles · All 6 versions

BigSSL: Exploring the frontier of large-scale semi-supervised learning for automatic speech recognition

Y Zhang, DS Park, W Han, J Qin… - IEEE Journal of …, 2022 - ieeexplore.ieee.org

We summarize the results of a host of efforts using giant automatic speech recognition (ASR)
models pre-trained using large, diverse unlabeled datasets containing approximately a …

Cited by 175 · Related articles · All 4 versions

An accurate and rapidly calibrating speech neuroprosthesis

NS Card, M Wairagkar, C Iacobacci… - … England Journal of …, 2024 - Mass Medical Soc

Background Brain–computer interfaces can enable communication for people with paralysis
by transforming cortical activity associated with attempted speech into text on a computer …

Cited by 8 · Related articles · All 5 versions

Diagonal state space augmented transformers for speech recognition

G Saon, A Gupta, X Cui - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

We improve on the popular conformer architecture by replacing the depthwise temporal
convolutions with diagonal state space (DSS) models. DSS is a recently introduced variant …

Cited by 25 · Related articles · All 4 versions

Speaker adaptation using spectro-temporal deep features for dysarthric and elderly speech recognition

M Geng, X Xie, Z Ye, T Wang, G Li, S Hu… - … on Audio, Speech …, 2022 - ieeexplore.ieee.org

Despite the rapid progress of automatic speech recognition (ASR) technologies targeting
normal speech in recent decades, accurate recognition of dysarthric and elderly speech …

Cited by 29 · Related articles · All 7 versions

VarArray: Array-geometry-agnostic continuous speech separation

T Yoshioka, X Wang, D Wang, M Tang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Continuous speech separation using a microphone array was shown to be promising in
dealing with the speech overlap problem in natural conversation transcription. This paper …

Cited by 29 · Related articles · All 4 versions

Bayesian neural network language modeling for speech recognition

B Xue, S Hu, J Xu, M Geng, X Liu… - IEEE/ACM Transactions …, 2022 - ieeexplore.ieee.org

State-of-the-art neural network language models (NNLMs) represented by long short term
memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly …

Cited by 18 · Related articles · All 5 versions

Modular domain adaptation for Conformer-based streaming ASR

Q Li, B Li, D Hwang, TN Sainath… - arXiv preprint arXiv …, 2023 - arxiv.org

Speech data from different domains has distinct acoustic and linguistic characteristics. It is
common to train a single multidomain model such as a Conformer transducer for speech …

Cited by 12 · Related articles · All 7 versions

Confidence score based speaker adaptation of Conformer speech recognition systems

J Deng, X Xie, T Wang, M Cui, B Xue… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org

Speaker adaptation techniques provide a powerful solution to customise automatic speech
recognition (ASR) systems for individual users. Practical application of unsupervised model …

Cited by 8 · Related articles · All 5 versions

Efficient training of neural transducer for speech recognition

W Zhou, W Michel, R Schlüter, H Ney - arXiv preprint arXiv:2204.10586, 2022 - arxiv.org

As one of the most popular sequence-to-sequence modeling approaches for speech
recognition, the RNN-Transducer has achieved evolving performance with more and more …

Cited by 19 · Related articles · All 9 versions