Automatic Pronunciation Assessment--A Review
Pronunciation assessment and its application in computer-aided pronunciation training
(CAPT) have seen impressive progress in recent years. With the rapid growth in language …
Improving non-native word-level pronunciation scoring with phone-level mixup data augmentation and multi-source information
Deep learning-based pronunciation scoring models highly rely on the availability of the
annotated non-native data, which is costly and has scalability issues. To deal with the data …
Distilling knowledge from Gaussian process teacher to neural network student
Abstract Neural Networks (NN) and Gaussian Processes (GP) are different modelling
approaches. The former stores characteristics of the training data in its many parameters …
Using Fluency Representation Learned from Sequential Raw Features for Improving Non-native Fluency Scoring
Automatic non-native fluency scoring is a challenging task which relies heavily on the
effectiveness of the handcrafted fluency features used for predicting fluency scores. In this …
An ASR-free fluency scoring approach with self-supervised learning
A typical fluency scoring system generally relies on an automatic speech recognition (ASR)
system to obtain time stamps in input speech for the subsequent calculation of fluency …
Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring
Recent studies on pronunciation scoring have explored the effect of introducing phone
embeddings as reference pronunciation, but mostly in an implicit manner, i.e., addition or …
Variations of multi-task learning for spoken language assessment
Automatic spoken language assessment often operates within a regime whereby only a
limited quantity of training data is available. In other low-resourced tasks, such as in speech …
Multi-lingual pronunciation assessment with unified phoneme set and language-specific embeddings
B Lin, L Wang - … 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
Automatic pronunciation assessment is commonly trained and applied for a specific
language, which is not practical in multi-lingual or low-resource scenarios. In this paper, we …
A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning
H Ryu, S Kim, M Chung - INTERSPEECH, 2023 - isca-archive.org
Empirical studies report a strong correlation between pronunciation proficiency scores and
phonetic errors in non-native speech assessments of human evaluators. However, the …
Automatic Fluency Assessment Method for Spontaneous Speech without Reference Text
J Liu, A Wumaier, C Fan, S Guo - Electronics, 2023 - mdpi.com
The automatic fluency assessment of spontaneous speech without reference text is a
challenging task that heavily depends on the accuracy of automatic speech recognition …