Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase...

A Koenecke, A Nam, E Lake, J Nudell… - Proceedings of the …, 2020 - National Acad Sciences

Automated speech recognition (ASR) systems, which use sophisticated machine-learning
algorithms to convert spoken language to text, have become increasingly widespread …

Cited by 741 · Related articles · All 17 versions

The listening talker: A review of human and algorithmic context-induced modifications of speech

M Cooke, S King, M Garnier, V Aubanel - Computer Speech & Language, 2014 - Elsevier

Speech output technology is finding widespread application, including in scenarios where
intelligibility might be compromised–at least for some listeners–by adverse conditions …

Cited by 157 · Related articles · All 15 versions

Gender and dialect bias in YouTube's automatic captions

R Tatman - Proceedings of the first ACL workshop on ethics in …, 2017 - aclanthology.org

This project evaluates the accuracy of YouTube's automatically-generated captions across
two genders and five dialect groups. Speakers' dialect and gender was controlled for by …

Cited by 498 · Related articles · All 6 versions

Quantifying bias in automatic speech recognition

S Feng, O Kudina, BM Halpern… - arXiv preprint arXiv …, 2021 - arxiv.org

Automatic speech recognition (ASR) systems promise to deliver objective interpretation of
human speech. Practice and recent evidence suggests that the state-of-the-art (SotA) ASRs …

Cited by 143 · Related articles · All 3 versions

[Book] Teaching and researching: Listening

M Rost - 2013 - taylorfrancis.com

Teaching and Researching Listening provides a focused, state-of-the-art treatment of the
linguistic, psycholinguistic and pragmatic processes that are involved in oral language use …

Cited by 3740 · Related articles · All 14 versions

Deconstructing comprehensibility: Identifying the linguistic influences on listeners' L2 comprehensibility ratings

T Isaacs, P Trofimovich - Studies in Second Language Acquisition, 2012 - cambridge.org

Comprehensibility, a major concept in second language (L2) pronunciation research that
denotes listeners' perceptions of how easily they understand L2 speech, is central to …

Cited by 426 · Related articles · All 7 versions

[HTML] Towards inclusive automatic speech recognition

S Feng, BM Halpern, O Kudina… - Computer Speech & …, 2024 - Elsevier

Practice and recent evidence show that state-of-the-art (SotA) automatic speech recognition
(ASR) systems do not perform equally well for all speaker groups. Many factors can cause …

Cited by 53 · Related articles · All 7 versions

[PDF] Effects of Talker Dialect, Gender & Race on Accuracy of Bing Speech and YouTube Automatic Captions.

R Tatman, C Kasten - Interspeech, 2017 - drive.google.com

This project compares the accuracy of two automatic speech recognition (ASR) systems–
Bing Speech and YouTube's automatic captions–across gender, race and four dialects of …

Cited by 177 · Related articles · All 4 versions

[PDF] Lexicon-free conversational speech recognition with neural networks

A Maas, Z Xie, D Jurafsky, AY Ng - … of the 2015 Conference of the …, 2015 - aclanthology.org

We present an approach to speech recognition that uses only a neural network to map
acoustic input to characters, a character-level language model, and a beam search …

Cited by 194 · Related articles · All 15 versions

Understanding automatic speech recognition

D O'Shaughnessy - Computer Speech & Language, 2023 - Elsevier

This paper discusses how automatic speech recognition systems are and could be
designed, in order to best exploit the discriminative information encoded in human speech …

Cited by 5 · Related articles