Automatic Pronunciation Assessment--A Review

YE Kheir, A Ali, SA Chowdhury - arXiv preprint arXiv:2310.13974, 2023 - arxiv.org
Pronunciation assessment and its application in computer-aided pronunciation training
(CAPT) have seen impressive progress in recent years. With the rapid growth in language …

Improving mispronunciation detection using speech reconstruction

A Das, R Gutierrez-Osuna - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org
Training related machine learning tasks simultaneously can lead to improved performance
on both tasks. Text-to-speech (TTS) and mispronunciation detection and diagnosis (MDD) …

Audio features from the Wav2Vec 2.0 embeddings for the ACM multimedia 2022 stuttering challenge

C Montacié, MJ Caraty, N Lackovic - Proceedings of the 30th ACM …, 2022 - dl.acm.org
The ACM Multimedia 2022 Stuttering Challenge is to determine the stuttering-related class
of a speech segment. There are seven stuttering-related classes and an eighth garbage …

What can an accent identifier learn? Probing phonetic and prosodic information in a wav2vec2-based accent identification model

M Yang, RCMC Shekar, O Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
This study is focused on understanding and quantifying the change in phoneme and
prosody information encoded in the Self-Supervised Learning (SSL) model, brought by an …

Automatic speech recognition (ASR) for the diagnosis of pronunciation of speech sound disorders in Korean children

T Ahn, Y Hong, Y Im, DH Kim, D Kang… - Clinical Linguistics & …, 2024 - Taylor & Francis
This study presents a model of automatic speech recognition (ASR) that is designed to
diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace …

An ensemble-based framework for mispronunciation detection of Arabic phonemes

SS Calık, A Kucukmanisa, ZH Kilimci - Applied Acoustics, 2023 - Elsevier
Determination of mispronunciations and ensuring feedback to users are maintained by
computer-assisted language learning (CALL) systems. In this work, we introduce an …

A Joint Model for Pronunciation Assessment and Mispronunciation Detection and Diagnosis with Multi-task Learning

H Ryu, S Kim, M Chung - INTERSPEECH, 2023 - isca-archive.org
Empirical studies report a strong correlation between pronunciation proficiency scores and
phonetic errors in non-native speech assessments of human evaluators. However, the …

Machine Learning (ML) tools for measuring second language (L2) intelligibility

K Hirschi, O Kang - Routledge Handbook of Technological …, 2024 - taylorfrancis.com
Recent advances in Machine Learning, including Deep Neural Networks (DNNs), have
resulted in Automated Speech Recognition (ASR) systems with highly accurate transcription …

Multi-view multi-task representation learning for mispronunciation detection

YE Kheir, SA Chowdhury, A Ali - arXiv preprint arXiv:2306.01845, 2023 - arxiv.org
The disparity in phonology between learner's native (L1) and target (L2) language poses a
significant challenge for mispronunciation detection and diagnosis (MDD) systems. This …

Assessment of non-native speech intelligibility using wav2vec2-based mispronunciation detection and multi-level goodness of pronunciation transformer

RC Shekar, M Yang, K Hirschi, S Looney… - ISCA INTERSPEECH …, 2023 - par.nsf.gov
Automatic pronunciation assessment (APA) plays an important role in providing feedback for
self-directed language learners in computer-assisted pronunciation training (CAPT). Several …