Exploring wav2vec 2.0 on speaker verification and language identification
Z Fan, M Li, S Zhou, B Xu - arXiv preprint arXiv:2012.06185, 2020 - arxiv.org
Wav2vec 2.0 is a recently proposed self-supervised framework for speech representation
learning. It follows a two-stage training process of pre-training and fine-tuning, and performs …
An overview of Indian spoken language recognition from machine learning perspective
Automatic spoken language identification (LID) is a very important research field in the era of
multilingual voice-command-based human-computer interaction. A front-end LID module …
End-to-end language diarization for bilingual code-switching speech
We propose two end-to-end neural configurations for language diarization on bilingual code-
switching speech. The first, a BLSTM-E2E architecture, includes a set of stacked …
Towards relevance and sequence modeling in language recognition
The task of automatic language identification (LID) involving multiple dialects of the same
language family in the presence of noise is a challenging problem. In these scenarios, the …
Multi-domain attention fusion network for language recognition
Attention-based convolutional neural network models are increasingly adopted for language
recognition tasks. In this paper, based on the self-attention mechanism, we solve the study of …
Improving language identification for multilingual speakers
Spoken language identification (LID) technologies have improved in recent years from
discriminating largely distinct languages to discriminating highly similar languages or even …
Cross-corpora language recognition: A preliminary investigation with Indian languages
In this paper, we conduct one of the very first studies for cross-corpora performance
evaluation in the spoken language identification (LID) problem. Cross-corpora evaluation …
Study on the effect of emotional speech on language identification
P Jain, K Gurugubelli… - 2020 national conference …, 2020 - ieeexplore.ieee.org
Identifying language information from speech utterance is referred to as spoken language
identification. Language Identification (LID) is essential in multilingual speech systems. The …
Boosting Character-based Mandarin ASR via Chinese Pinyin Representation
L Li, Y Long, D Xu, Y Li - International Journal of Speech Technology, 2023 - Springer
Current end-to-end automatic speech recognition (ASR) models have achieved good results
in phonetic language such as English and French. However, Chinese character is a typical …
Universal and accent-discriminative encoders for conformer-based accent-invariant speech recognition
X Wang, Y Long, D Xu - International Journal of Speech Technology, 2022 - Springer
Accent-variation is a challenging issue, either for traditional hybrid or current end-to-end
(E2E) automatic speech recognition (ASR). Building an accent-invariant and high quality …