Automatic Pronunciation Assessment--A Review

YE Kheir, A Ali, SA Chowdhury - arXiv preprint arXiv:2310.13974, 2023 - arxiv.org
Pronunciation assessment and its application in computer-aided pronunciation training
(CAPT) have seen impressive progress in recent years. With the rapid growth in language …

Improving non-native word-level pronunciation scoring with phone-level mixup data augmentation and multi-source information

K Fu, S Gao, K Wang, W Li, X Tian, Z Ma - arXiv preprint arXiv:2203.01826, 2022 - arxiv.org
Deep learning-based pronunciation scoring models highly rely on the availability of the
annotated non-native data, which is costly and has scalability issues. To deal with the data …

Distilling knowledge from Gaussian process teacher to neural network student

JHM Wong, H Zhang, NF Chen - Interspeech, 2023 - isca-archive.org
Neural Networks (NN) and Gaussian Processes (GP) are different modelling
approaches. The former stores characteristics of the training data in its many parameters …

Using Fluency Representation Learned from Sequential Raw Features for Improving Non-native Fluency Scoring

K Fu, S Gao, X Tian, W Li, Z Ma - INTERSPEECH, 2022 - researchgate.net
Automatic non-native fluency scoring is a challenging task which relies heavily on the
effectiveness of the handcrafted fluency features used for predicting fluency scores. In this …

An ASR-free fluency scoring approach with self-supervised learning

W Liu, K Fu, X Tian, S Shi, W Li, Z Ma… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
A typical fluency scoring system generally relies on an automatic speech recognition (ASR)
system to obtain time stamps in input speech for the subsequent calculation of fluency …

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

W Liu, K Fu, X Tian, S Shi, W Li, Z Ma… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Recent studies on pronunciation scoring have explored the effect of introducing phone
embeddings as reference pronunciation, but mostly in an implicit manner, i.e., addition or …

Variations of multi-task learning for spoken language assessment

JHM Wong, H Zhang, NF Chen - INTERSPEECH, 2022 - isca-archive.org
Automatic spoken language assessment often operates within a regime whereby only a
limited quantity of training data is available. In other low-resourced tasks, such as in speech …

Multi-lingual pronunciation assessment with unified phoneme set and language-specific embeddings

B Lin, L Wang - … 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
Automatic pronunciation assessment is commonly trained and applied for a specific
language, which is not practical in multi-lingual or low-resource scenarios. In this paper, we …

A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning

H Ryu, S Kim, M Chung - INTERSPEECH, 2023 - isca-archive.org
Empirical studies report a strong correlation between pronunciation proficiency scores and
phonetic errors in non-native speech assessments of human evaluators. However, the …

Automatic Fluency Assessment Method for Spontaneous Speech without Reference Text

J Liu, A Wumaier, C Fan, S Guo - Electronics, 2023 - mdpi.com
The automatic fluency assessment of spontaneous speech without reference text is a
challenging task that heavily depends on the accuracy of automatic speech recognition …