How might we create better benchmarks for speech recognition?
A Aksënova, D van Esch, J Flynn… - Proceedings of the 1st …, 2021 - aclanthology.org
The applications of automatic speech recognition (ASR) systems are proliferating, in part
due to recent significant quality improvements. However, as recent work indicates, even …
Automatic disfluency detection from untranscribed speech
A Romana, K Koishida… - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Speech disfluencies, such as filled pauses or repetitions, are disruptions in the typical flow of
speech. All speakers experience disfluencies at times, and the rate at which we produce …
Analysis and tuning of a voice assistant system for dysfluent speech
Dysfluencies and variations in speech pronunciation can severely degrade speech
recognition performance, and for many individuals with moderate-to-severe speech …
Enhancing ASR for stuttered speech with limited data using detect and pass
O Shonibare, X Tong, V Ravichandran - arXiv preprint arXiv:2202.05396, 2022 - arxiv.org
It is estimated that around 70 million people worldwide are affected by a speech disorder
called stuttering. With recent advances in Automatic Speech Recognition (ASR), voice …
End-to-End Spontaneous Speech Recognition Using Disfluency Labeling.
Spontaneous speech often contains disfluent acoustic features such as fillers and
hesitations, which are major causes of errors during automatic speech recognition (ASR). In …
Whister: Using Whisper's representations for stuttering detection
V Changawala, F Rudzicz - Interspeech, 2024 - isca-archive.org
In this paper, we empirically investigate the influence of different factors on the performance
of dysfluency detection. Specifically, we examine the impact of data splits, data quality, and …
Frame-Level Stutter Detection.
Previous studies on the detection of stuttered speech have focused on classification at the
utterance level (e.g., for speech therapy applications), and on the correct insertion of stutter …
On disfluency and non-lexical sound labeling for end-to-end automatic speech recognition
Spontaneous speech contains a significant amount of disfluencies and non-lexical sounds
(e.g., backchannels, filled pauses), which are often difficult to transcribe. Disfluency labeling …
Survey: Exploring disfluencies for speech-to-speech machine translation
R Kundu, P Jyothi, P Bhattacharyya - 2022 - cfilt.iitb.ac.in
Disfluencies that appear in the transcriptions from automatic speech recognition systems
tend to impair the performance of downstream NLP tasks like machine translation …
Efficient Recognition and Classification of Stuttered Word from Speech Signal using Deep Learning Technique
K Murugan, NK Cherukuri… - 2022 IEEE World …, 2022 - ieeexplore.ieee.org
Fluency is a metric that assesses how well a speaker communicates with another person
while presenting the information. Stuttering is one of the fluency problems that have a …