Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …
learning has dramatically revolutionized speaker recognition. However, there is lack of …
Deep representation learning in speech processing: Challenges, recent advances, and future trends
Research on speech processing has traditionally considered the task of designing hand-
engineered acoustic features (feature engineering) as a separate distinct problem from the …
engineered acoustic features (feature engineering) as a separate distinct problem from the …
Two-stream collaborative learning with spatial-temporal attention for video classification
Video classification is highly important and has widespread applications, such as video
search and intelligent surveillance. Video naturally contains both static and motion …
search and intelligent surveillance. Video naturally contains both static and motion …
Leveraging asr pretrained conformers for speaker verification through transfer learning and knowledge distillation
This paper focuses on the application of Conformers in speaker verification. Conformers,
initially designed for Automatic Speech Recognition (ASR), excel at modeling both local and …
initially designed for Automatic Speech Recognition (ASR), excel at modeling both local and …
GPRI2Net: A deep-neural-network-based ground penetrating radar data inversion and object identification framework for consecutive and long survey lines
Ground penetrating radar (GPR) enables infrastructure inspection using consecutive and
long survey lines. However, the existing GPR data processing methods may lead to …
long survey lines. However, the existing GPR data processing methods may lead to …
Speaker embedding extraction with phonetic information
Speaker embeddings achieve promising results on many speaker verification tasks.
Phonetic information, as an important component of speech, is rarely considered in the …
Phonetic information, as an important component of speech, is rarely considered in the …
A semantic-aware strategy for automatic speech recognition incorporating deep learning models
A Santhanavijayan, D Naresh Kumar… - Intelligent System Design …, 2021 - Springer
Abstract Automatic Speech Recognition (ASR) is trending in the age of the Internet of Things
and Machine Intelligence. It plays a pivotal role in several applications. Conventional …
and Machine Intelligence. It plays a pivotal role in several applications. Conventional …
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition
Accents pose significant challenges for speech recognition systems. Although joint
automatic speech recognition (ASR) and accent recognition (AR) training has been proven …
automatic speech recognition (ASR) and accent recognition (AR) training has been proven …
Multi-task twin bounded support vector machine and its safe screening rule
R An, Y Xu, X Liu - Applied Soft Computing, 2023 - Elsevier
Direct multi-task twin support vector machine (DMTSVM) obtains great performance in
dealing with correlated tasks. However, DMTSVM only considers the empirical risk …
dealing with correlated tasks. However, DMTSVM only considers the empirical risk …
Phoneme-unit-specific time-delay neural network for speaker verification
X Chen, C Bao - IEEE/ACM Transactions on Audio, Speech …, 2021 - ieeexplore.ieee.org
Variations of speech content increase the difficulty of speaker verification. In this paper, to
alleviate the negative effect of the variations, phoneme-unit-specific time-delay neural …
alleviate the negative effect of the variations, phoneme-unit-specific time-delay neural …