SpeechBrain: a general-purpose speech toolkit (2021)

SP Bayerl, D Wagner, E Nöth, T Bocklet… - … Conference on Text …, 2022 - Springer

This paper empirically investigates the influence of different data splits and splitting
strategies on the performance of dysfluency detection systems. For this, we perform …

Cited by 19

Summary of the DISPLACE challenge 2023-DIarization of SPeaker and LAnguage in Conversational Environments

S Baghel, S Ramoji, S Jain, PR Chowdhuri… - Speech …, 2024 - Elsevier

In multi-lingual societies, where multiple languages are spoken in a small geographic
vicinity, informal conversations often involve mix of languages. Existing speech technologies …

Cited by 5

Hats: An open data set integrating human perception applied to the evaluation of automatic speech recognition metrics

T Bañeras-Roux, J Wottawa, M Rouvier… - … Conference on Text …, 2023 - Springer

Conventionally, Automatic Speech Recognition (ASR) systems are evaluated on
their ability to correctly recognize each word contained in a speech signal. In this context, the …

Cited by 3

Comparing Unsupervised Detection Algorithms for Audio Adversarial Examples

S Choosaksakunwiboon, K Pizzi, CY Kao - International Conference on …, 2022 - Springer

Recent works on automatic speech recognition (ASR) systems have shown that the
underlying neural networks are vulnerable to so-called adversarial examples. In order to …

Cited by 1

CRDNN-BiLSTM Knowledge Distillation Model Towards Enhancing the Automatic Speech Recognition

L Ashok Kumar, D Karthika Renuka, KS Naveena… - SN Computer …, 2024 - Springer

Numerous automatic speech recognition (ASR) models have been developed in recent
years, but they suffer from the drawback of being large models that take more time to train …

Discovering Authentic Self: Coaching Agent for Job-Hunting Students

E Hashimoto, K Nagira, T Mizumoto… - … Conference on Human …, 2024 - Springer

This study addresses the social issue in Japan where the unique recruitment system often
leads to a mismatch between students' vague career goals and job requirements, impacting …

On the Impact of FFP2 Face Masks on Speaker Verification for Mobile Device Authentication

D Sedlak, RD Findling - International Conference on Advances in Mobile …, 2023 - Springer

Voice-based authentication can allow for straightforward and unobtrusive authentication
with mobile devices. With COVID-19, wearing face masks has become common in many …

Disentanglement Learning for Text-Free Voice Conversion

M Chen - 2023 - etheses.whiterose.ac.uk

Voice conversion (VC) aims to change the perceived speaker identity of a speech signal
from one to another, while preserving the linguistic content. Recent state-of-the-art VC …

Neural Speech Processing for Whale Call Detection

E Fourie, MH Davel, J Versfeld - Southern African Conference for Artificial …, 2022 - Springer

Passive acoustic monitoring with hydrophones makes it possible to detect the presence of
marine animals over large areas. For monitoring to be cost-effective, this process should be …

A Paradigm for Interpreting Metrics and Measuring Error Severity in Automatic Speech Recognition

TB Roux, M Rouvier, J Wottawa, R Dufour - Text, Speech and Dialogue, 2024 - hal.science

The evaluation of automatic speech transcriptions relies heavily on metrics such as Word
Error Rate (WER) and Character Error Rate (CER). However, these metrics have faced …