SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - arXiv preprint arXiv …, 2021 - arxiv.org
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …

A fine-tuned wav2vec 2.0/HuBERT benchmark for speech emotion recognition, speaker verification and spoken language understanding

Y Wang, A Boumadane, A Heba - arXiv preprint arXiv:2111.02735, 2021 - arxiv.org
Speech self-supervised models such as wav2vec 2.0 and HuBERT are making revolutionary
progress in Automatic Speech Recognition (ASR). However, they have not been totally …

SLUE phase-2: A benchmark suite of diverse spoken language understanding tasks

S Shon, S Arora, CJ Lin, A Pasad, F Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Spoken language understanding (SLU) tasks have been studied for many decades in the
speech research community, but have not received as much attention as lower-level tasks …

Match to win: Analysing sequences lengths for efficient self-supervised learning in speech and audio

Y Gao, J Fernandez-Marques… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
Self-supervised learning (SSL) has proven vital in speech and audio-related applications.
The paradigm trains a general model on unlabeled data that can later be used to solve …

Finstreder: simple and fast spoken language understanding with finite state transducers using modern speech-to-text models

D Bermuth, A Poeppel, W Reif - arXiv preprint arXiv:2206.14589, 2022 - arxiv.org
In Spoken Language Understanding (SLU) the task is to extract important information from
audio commands, like the intent of what a user wants the system to do and special entities …

TARIC-SLU: A Tunisian Benchmark Dataset For Spoken Language Understanding

S Mdhaffar, F Bougares, R De Mori… - Proceedings of the …, 2024 - aclanthology.org
In recent years, there has been a significant increase in interest in developing Spoken
Language Understanding (SLU) systems. SLU involves extracting a list of semantic …

MSNER: A Multilingual Speech Dataset for Named Entity Recognition

Q Meeus, MF Moens - arXiv preprint arXiv:2405.11519, 2024 - arxiv.org
While extensively explored in text-based tasks, Named Entity Recognition (NER) remains
largely neglected in spoken language understanding. Existing resources are limited to a …

Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding

S Kim, J Hwang, HY Jung - arXiv preprint arXiv:2405.15097, 2024 - arxiv.org
Recently, deep end-to-end learning has been studied for intent classification in Spoken
Language Understanding (SLU). However, end-to-end models require a large amount of …

Digits micro-model for accurate and secure transactions

C Chhablani, N Sharma, J Hosier… - arXiv preprint arXiv …, 2024 - arxiv.org
Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance
the caller experience by enabling natural language understanding and facilitating efficient …

Deep neural networks for voice control

L Lugosch - 2023 - escholarship.mcgill.ca
Voice control systems enable people to control their computers by speaking to them. After a
review of the state-of-the-art in sequence modeling, speech recognition, and language …