Pkwrap: a PyTorch package for LF-MMI training of acoustic models

J Zuluaga-Gomez, A Prasad, I Nigmatulina, P Motlicek… - Aerospace, 2023 - mdpi.com

In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic
controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI) …

Cited by 12 - Related articles - All 14 versions

[PDF] arxiv.org

Automatic speech recognition benchmark for air-traffic communications

J Zuluaga-Gomez, P Motlicek, Q Zhan, K Vesely… - arXiv preprint arXiv …, 2020 - arxiv.org

Advances in Automatic Speech Recognition (ASR) over the last decade opened new areas
of speech-based automation such as in Air-Traffic Control (ATC) environment. Currently …

Cited by 36 - Related articles - All 13 versions

[PDF] arxiv.org

Are disentangled representations all you need to build speaker anonymization systems?

P Champion, D Jouvet, A Larcher - arXiv preprint arXiv:2208.10497, 2022 - arxiv.org

Speech signals contain a lot of sensitive information, such as the speaker's identity, which
raises privacy concerns when speech data get collected. Speaker anonymization aims to …

Cited by 14 - Related articles - All 18 versions

[PDF] arxiv.org

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model

A Vyas, S Madikeri, H Bourlard - arXiv preprint arXiv:2104.02558, 2021 - arxiv.org

In this work, we investigate if the wav2vec 2.0 self-supervised pretraining helps mitigate the
overfitting issues with connectionist temporal classification (CTC) training to reduce its …

Cited by 19 - Related articles - All 10 versions

[PDF] arxiv.org

Lattice-free MMI adaptation of self-supervised pretrained acoustic models

A Vyas, S Madikeri, H Bourlard - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of
self-supervised pretrained acoustic model. We pretrain a Transformer model on thousand hours …

Cited by 14 - Related articles - All 4 versions

[PDF] idiap.ch

Parameter-Efficient Tuning with Adaptive Bottlenecks for Automatic Speech Recognition

G Vanderreydt, A Prasad, D Khalil… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

Transfer learning from large multilingual pretrained models, like XLSR, has become the new
paradigm for Automatic Speech Recognition (ASR). Considering their ever-increasing size …

Cited by 2 - Related articles - All 5 versions

[PDF] arxiv.org

Anonymizing speech: Evaluating and designing speaker anonymization techniques

P Champion - arXiv preprint arXiv:2308.04455, 2023 - arxiv.org

The growing use of voice user interfaces has led to a surge in the collection and storage of
speech data. While data collection allows for the development of efficient tools powering …

Cited by 4 - Related articles - All 15 versions

[PDF] idiap.ch

Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint

A Prasad, A Carofilis, G Vanderreydt… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Self-supervised models trained with high linguistic diversity, such as the XLS-R model, can
be effectively fine-tuned for the language recognition task. Typically, a back-end classifier …

Cited by 1 - Related articles - All 2 versions

[PDF] arxiv.org

Effectiveness of text, acoustic, and lattice-based representations in spoken language understanding tasks

E Villatoro-Tello, S Madikeri… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

In this paper, we perform an exhaustive evaluation of different representations to address
the intent classification problem in a Spoken Language Understanding (SLU) setup. We …

Cited by 3 - Related articles - All 9 versions

[PDF] isca-archive.org

[PDF] Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages

SR Madikeri, P Motlicek, H Bourlard - Interspeech, 2021 - isca-archive.org

In this paper, we develop Automatic Speech Recognition (ASR) systems for multi-genre
speech recognition of low-resource languages where training data is predominantly …

Cited by 6 - Related articles - All 7 versions