The influence of dataset partitioning on dysfluency detection systems
This paper empirically investigates the influence of different data splits and splitting
strategies on the performance of dysfluency detection systems. For this, we perform …
Summary of the DISPLACE challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments
In multi-lingual societies, where multiple languages are spoken in a small geographic
vicinity, informal conversations often involve a mix of languages. Existing speech technologies …
HATS: An open data set integrating human perception applied to the evaluation of automatic speech recognition metrics
Conventionally, Automatic Speech Recognition (ASR) systems are evaluated on
their ability to correctly recognize each word contained in a speech signal. In this context, the …
Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples
Recent works on automatic speech recognition (ASR) systems have shown that the
underlying neural networks are vulnerable to so-called adversarial examples. In order to …
CRDNN-BiLSTM Knowledge Distillation Model Towards Enhancing the Automatic Speech Recognition
L Ashok Kumar, D Karthika Renuka, KS Naveena… - SN Computer …, 2024 - Springer
Numerous automatic speech recognition (ASR) models have been developed in recent
years, but they suffer from the drawback of being large models that take more time to train …
Discovering Authentic Self: Coaching Agent for Job-Hunting Students
E Hashimoto, K Nagira, T Mizumoto… - … Conference on Human …, 2024 - Springer
This study addresses the social issue in Japan where the unique recruitment system often
leads to a mismatch between students' vague career goals and job requirements, impacting …
On the Impact of FFP2 Face Masks on Speaker Verification for Mobile Device Authentication
D Sedlak, RD Findling - International Conference on Advances in Mobile …, 2023 - Springer
Voice-based authentication can allow for straightforward and unobtrusive authentication
with mobile devices. With COVID-19, wearing face masks has become common in many …
Disentanglement Learning for Text-Free Voice Conversion
M Chen - 2023 - etheses.whiterose.ac.uk
Voice conversion (VC) aims to change the perceived speaker identity of a speech signal
from one to another, while preserving the linguistic content. Recent state-of-the-art VC …
Neural Speech Processing for Whale Call Detection
E Fourie, MH Davel, J Versfeld - Southern African Conference for Artificial …, 2022 - Springer
Passive acoustic monitoring with hydrophones makes it possible to detect the presence of
marine animals over large areas. For monitoring to be cost-effective, this process should be …
A Paradigm for Interpreting Metrics and Measuring Error Severity in Automatic Speech Recognition
The evaluation of automatic speech transcriptions relies heavily on metrics such as Word
Error Rate (WER) and Character Error Rate (CER). However, these metrics have faced …