SpeechBrain: A general-purpose speech toolkit
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …
Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks
The recently proposed VBx diarization method uses a Bayesian hidden Markov model to
find speaker clusters in a sequence of x-vectors. In this work we perform an extensive …
Deep speaker recognition: Process, progress, and challenges
Speaker recognition is related to human biometrics dealing with the identification of
speakers from their speech. Speaker recognition is an active research area and being …
TitaNet: Neural model for speaker representation with 1D depth-wise separable convolutions and global context
In this paper, we propose TitaNet, a novel neural network architecture for extracting speaker
representations. We employ 1D depth-wise separable convolutions with Squeeze-and …
ECAPA-TDNN embeddings for speaker diarization
Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural
networks can accurately capture speaker discriminative characteristics and popular deep …
A survey of data augmentation methods for sequence data.
葛轶洲, 许翔, 杨锁荣, 周青… - Journal of Frontiers of …, 2021 - search.ebscohost.com
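In pursuit of accuracy, deep learning model architectures have become increasingly complex and networks increasingly deep. A larger number of parameters means that training
a model requires more data. However, manually labeling data is expensive and, constrained by objective factors, …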
Meta-generalization for domain-invariant speaker verification
Automatic speaker verification (ASV) exhibits unsatisfactory performance under domain
mismatch conditions owing to intrinsic and extrinsic factors, such as variations in speaking …
Combination of deep speaker embeddings for diarisation
Significant progress has recently been made in speaker diarisation after the introduction of
d-vectors as speaker embeddings extracted from neural network (NN) speaker classifiers for …
Optimized speaker change detection approach for speaker segmentation towards speaker diarization based on deep learning
K VijayKumar - Data & Knowledge Engineering, 2023 - Elsevier
Speaker diarization is the partitioning of an audio source stream into homogeneous
segments according to the speaker's identity. It can improve the readability of an automatic …
U-vectors: Generating clusterable speaker embedding from unlabeled data
Speaker recognition deals with recognizing speakers by their speech. Most speaker
recognition systems are built upon two stages, the first stage extracts low dimensional …