Self-attentive speaker embeddings for text-independent speaker verification.

Z Bai, XL Zhang - Neural Networks, 2021 - Elsevier

Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …

Cited by 425 Related articles All 9 versions

[PDF] arxiv.org

Multimodal intelligence: Representation learning, information fusion, and applications

C Zhang, Z Yang, X He, L Deng - IEEE Journal of Selected …, 2020 - ieeexplore.ieee.org

Deep learning methods have revolutionized speech recognition, image recognition, and
natural language processing since 2010. Each of these tasks involves a single modality in …

Cited by 420 Related articles All 3 versions

[PDF] arxiv.org

ECAPA-TDNN: Emphasized channel attention, propagation and aggregation in TDNN based speaker verification

B Desplanques, J Thienpondt, K Demuynck - arXiv preprint arXiv …, 2020 - arxiv.org

Current speaker verification techniques rely on a neural network to extract speaker
representations. The successful x-vector architecture is a Time Delay Neural Network …

Cited by 1613 Related articles All 15 versions

[PDF] peerj.com

Characterization inference based on joint-optimization of multi-layer semantics and deep fusion matching network

W Zheng, L Yin - PeerJ Computer Science, 2022 - peerj.com

The whole sentence representation reasoning process simultaneously comprises a
sentence representation module and a semantic reasoning module. This paper combines …

Cited by 117 Related articles All 8 versions

[PDF] mdpi.com

A deep fusion matching network semantic reasoning model

W Zheng, Y Zhou, S Liu, J Tian, B Yang, L Yin - Applied Sciences, 2022 - mdpi.com

As the vital technology of natural language understanding, sentence representation
reasoning technology mainly focuses on sentence representation methods and reasoning …

Cited by 127 Related articles All 7 versions

[PDF] researchgate.net

Attention, please! A survey of neural attention models in deep learning

A de Santana Correia, EL Colombini - Artificial Intelligence Review, 2022 - Springer

In humans, Attention is a core property of all perceptual and cognitive operations. Given our
limited ability to process competing sources, attention mechanisms select, modulate, and …

Cited by 221 Related articles All 8 versions

[PDF] arxiv.org

A comparative study on recent neural spoofing countermeasures for synthetic speech detection

X Wang, J Yamagishi - arXiv preprint arXiv:2103.11326, 2021 - arxiv.org

A great deal of recent research effort on speech spoofing countermeasures has been
invested into back-end neural networks and training criteria. We contribute to this effort with …

Cited by 190 Related articles All 7 versions

[PDF] mdpi.com

Sentence representation method based on multi-layer semantic network

W Zheng, X Liu, L Yin - Applied Sciences, 2021 - mdpi.com

With the development of artificial intelligence, more and more people hope that computers
can understand human language through natural language technology, learn to think like …

Cited by 152 Related articles All 9 versions

[PDF] arxiv.org

Large-scale self-supervised speech representation learning for automatic speaker verification

Z Chen, S Chen, Y Wu, Y Qian, C Wang… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

The speech representations learned from large-scale unlabeled data have shown better
generalizability than those from supervised learning and thus attract a lot of interest to be …

Cited by 135 Related articles All 3 versions

[PDF] arxiv.org

End-to-end neural speaker diarization with self-attention

Y Fujita, N Kanda, S Horiguchi, Y Xue… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

Speaker diarization has been mainly developed based on the clustering of speaker
embeddings. However, the clustering-based approach has two major problems; i.e., (i) it is not …

Cited by 286 Related articles All 7 versions