Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Self-supervised text-independent speaker verification using prototypical momentum contrastive learning

W Xia, C Zhang, C Weng, M Yu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
In this study, we investigate self-supervised representation learning for speaker verification
(SV). First, we examine a simple contrastive learning approach (SimCLR) with a momentum …

C3-DINO: Joint contrastive and non-contrastive self-supervised learning for speaker verification

C Zhang, D Yu - IEEE Journal of Selected Topics in Signal …, 2022 - ieeexplore.ieee.org
Self-supervised learning (SSL) has drawn an increased attention in the field of speech
processing. Recent studies have demonstrated that contrastive learning is able to learn …

DurIAN-SC: Duration informed attention network based singing voice conversion system

L Zhang, C Yu, H Lu, C Weng, C Zhang, Y Wu… - arXiv preprint arXiv …, 2020 - arxiv.org
Singing voice conversion is converting the timbre in the source singing to the target
speaker's voice while keeping singing content the same. However, singing data for target …

NeuralEcho: A self-attentive recurrent neural network for unified acoustic echo suppression and speech enhancement

M Yu, Y Xu, C Zhang, SX Zhang, D Yu - arXiv preprint arXiv:2205.10401, 2022 - arxiv.org
Acoustic echo cancellation (AEC) plays an important role in the full-duplex speech
communication as well as the front-end speech enhancement for recognition in the …

Towards robust speaker verification with target speaker enhancement

C Zhang, M Yu, C Weng, D Yu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
This paper proposes the target speaker enhancement based speaker verification network
(TASE-SVNet), an all neural model that couples target speaker enhancement and speaker …

SEDENOSS: SEparating and DENOising Seismic Signals with dual-path recurrent neural network architecture

A Novoselov, P Balazs… - Journal of Geophysical …, 2022 - Wiley Online Library
Seismologists have to deal with overlapping and noisy signals. Techniques such as source
separation can be used to solve this problem. Over the past few decades, signal processing …

Discriminative speaker embedding with serialized multi-layer multi-head attention

H Zhu, KA Lee, H Li - Speech Communication, 2022 - Elsevier
In this paper, a serialized multi-layer multi-head attention is proposed for extracting neural
speaker embedding in text-independent speaker verification task. The majority of the recent …

Quantitative evidence on overlooked aspects of enrollment speaker embeddings for target speaker separation

X Liu, X Li, J Serrà - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Single channel target speaker separation (TSS) aims at extracting a speaker's voice from a
mixture of multiple talkers given an enrollment utterance of that speaker. A typical deep …

Monaural speech separation using speaker embedding from preliminary separation

J Byun, JW Shin - IEEE/ACM Transactions on Audio, Speech …, 2021 - ieeexplore.ieee.org
In speech separation, the identities of the speakers may be an important cue to discriminate
speeches in the mixture and separate them better. A few recent researches used the …