Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …
Multimodal intelligence: Representation learning, information fusion, and applications
Deep learning methods have revolutionized speech recognition, image recognition, and
natural language processing since 2010. Each of these tasks involves a single modality in …
ECAPA-TDNN: Emphasized channel attention, propagation and aggregation in TDNN based speaker verification
Current speaker verification techniques rely on a neural network to extract speaker
representations. The successful x-vector architecture is a Time Delay Neural Network …
Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network
The whole sentence representation reasoning process simultaneously comprises a
sentence representation module and a semantic reasoning module. This paper combines …
A deep fusion matching network semantic reasoning model
As the vital technology of natural language understanding, sentence representation
reasoning technology mainly focuses on sentence representation methods and reasoning …
Attention, please! A survey of neural attention models in deep learning
A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer
In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …
A comparative study on recent neural spoofing countermeasures for synthetic speech detection
X Wang, J Yamagishi - arXiv preprint arXiv:2103.11326, 2021 - arxiv.org
A great deal of recent research effort on speech spoofing countermeasures has been
invested into back-end neural networks and training criteria. We contribute to this effort with …
Sentence representation method based on multi-layer semantic network
With the development of artificial intelligence, more and more people hope that computers
can understand human language through natural language technology, learn to think like …
Large-scale self-supervised speech representation learning for automatic speaker verification
The speech representations learned from large-scale unlabeled data have shown better
generalizability than those from supervised learning and thus attract a lot of interest to be …
End-to-end neural speaker diarization with self-attention
Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …