Corpus phonetics
MY Liberman - Annual Review of Linguistics, 2019 - annualreviews.org
Semiautomatic analysis of digital speech collections is transforming the science of
phonetics. Convenient search and analysis of large published bodies of recordings …
WhisperX: Time-accurate speech transcription of long-form audio
Large-scale, weakly-supervised speech recognition models, such as Whisper, have
demonstrated impressive results on speech recognition across domains and languages …
A review: Automatic speech segmentation
Automated segmentation of speech signals has been under research for over 30 years.
Many speech processing systems require segmentation of the speech waveform into principal …
ASR-aware end-to-end neural diarization
We present a Conformer-based end-to-end neural diarization (EEND) model that uses both
acoustic input and features derived from an automatic speech recognition (ASR) model. Two …
Phoneme boundary detection using deep bidirectional LSTMs
In this paper we investigate the automatic detection of phoneme boundaries in audio
recordings with the help of deep bidirectional LSTMs. This work is motivated by the needs of …
Phoneme mispronunciation detection by jointly learning to align
B Lin, L Wang - … 2022-2022 IEEE International Conference on …, 2022 - ieeexplore.ieee.org
Phoneme mispronunciation detection plays an important role in Computer-Assisted
Pronunciation Training. Traditional methods either rely on phone recognition, which has the …
Phonetic Error Analysis Beyond Phone Error Rate
In this article, we analyse the performance of the TIMIT-based phone recognition systems
beyond the overall phone error rate (PER) metric. We consider three broad phonetic classes …
A retrieval algorithm of encrypted speech based on syllable-level perceptual hashing
S He, H Zhao - Computer Science and Information Systems, 2017 - doiserbia.nb.rs
To retrieve voice information in a fast and accurate manner over encrypted speech, this
study proposes a retrieval algorithm based on syllable-level perceptual hashing. It …
The Mason-Alberta Phonetic Segmenter: a forced alignment system based on deep neural networks and interpolation
Given an orthographic transcription, forced alignment systems automatically determine
boundaries between segments in speech, facilitating the use of large corpora. In the present …
A zero-resourced indigenous language phones occurrence and durations analysis for an automatic speech recognition system
This research illustrates phone occurrence analysis for an automatic speech recognition
(ASR) model of 'Adi.' 'Adi' is a low-resourced endangered tribal language of Arunachal …