A virtual simulation-pilot agent for training of air traffic controllers

J Zuluaga-Gomez, A Prasad, I Nigmatulina, P Motlicek… - Aerospace, 2023 - mdpi.com
In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic
controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI) …

Automatic speech recognition benchmark for air-traffic communications

J Zuluaga-Gomez, P Motlicek, Q Zhan, K Vesely… - arXiv preprint arXiv …, 2020 - arxiv.org
Advances in Automatic Speech Recognition (ASR) over the last decade opened new areas
of speech-based automation such as in Air-Traffic Control (ATC) environment. Currently …

Are disentangled representations all you need to build speaker anonymization systems?

P Champion, D Jouvet, A Larcher - arXiv preprint arXiv:2208.10497, 2022 - arxiv.org
Speech signals contain a lot of sensitive information, such as the speaker's identity, which
raises privacy concerns when speech data get collected. Speaker anonymization aims to …

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model

A Vyas, S Madikeri, H Bourlard - arXiv preprint arXiv:2104.02558, 2021 - arxiv.org
In this work, we investigate if the wav2vec 2.0 self-supervised pretraining helps mitigate the
overfitting issues with connectionist temporal classification (CTC) training to reduce its …

Lattice-free MMI adaptation of self-supervised pretrained acoustic models

A Vyas, S Madikeri, H Bourlard - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-
supervised pretrained acoustic model. We pretrain a Transformer model on thousand hours …

Parameter-Efficient Tuning with Adaptive Bottlenecks for Automatic Speech Recognition

G Vanderreydt, A Prasad, D Khalil… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Transfer learning from large multilingual pretrained models, like XLSR, has become the new
paradigm for Automatic Speech Recognition (ASR). Considering their ever-increasing size …

Anonymizing speech: Evaluating and designing speaker anonymization techniques

P Champion - arXiv preprint arXiv:2308.04455, 2023 - arxiv.org
The growing use of voice user interfaces has led to a surge in the collection and storage of
speech data. While data collection allows for the development of efficient tools powering …

Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint

A Prasad, A Carofilis, G Vanderreydt… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Self-supervised models trained with high linguistic diversity, such as the XLS-R model, can
be effectively fine-tuned for the language recognition task. Typically, a back-end classifier …

Effectiveness of text, acoustic, and lattice-based representations in spoken language understanding tasks

E Villatoro-Tello, S Madikeri… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this paper, we perform an exhaustive evaluation of different representations to address
the intent classification problem in a Spoken Language Understanding (SLU) setup. We …

Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages.

SR Madikeri, P Motlicek, H Bourlard - Interspeech, 2021 - isca-archive.org
In this paper, we develop Automatic Speech Recognition (ASR) systems for multi-genre
speech recognition of low-resource languages where training data is predominantly …