A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

CN-Celeb: multi-genre speaker recognition

L Li, R Liu, J Kang, Y Fan, H Cui, Y Cai, R Vipperla… - Speech …, 2022 - Elsevier
Research on speaker recognition is extending to address the vulnerability in the wild
conditions, among which genre mismatch is perhaps the most challenging, for instance …

Domain generalization with relaxed instance frequency-wise normalization for multi-device acoustic scene classification

B Kim, S Yang, J Kim, H Park, J Lee… - arXiv preprint arXiv …, 2022 - arxiv.org
While using two-dimensional convolutional neural networks (2D-CNNs) in image
processing, it is possible to manipulate domain information using channel statistics, and …

Playing a part: Speaker verification at the movies

A Brown, J Huh, A Nagrani, JS Chung… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
The goal of this work is to investigate the performance of popular speaker recognition
models on speech segments from movies, where often actors intentionally disguise their …

A model-agnostic meta-baseline method for few-shot fault diagnosis of wind turbines

X Liu, W Teng, Y Liu - Sensors, 2022 - mdpi.com
The technology of fault diagnosis is helpful to improve the reliability of wind turbines, and
further reduce the operation and maintenance cost at wind farms. However, in reality, wind …

Meta-generalization for domain-invariant speaker verification

H Zhang, L Wang, KA Lee, M Liu… - … /ACM Transactions on …, 2023 - ieeexplore.ieee.org
Automatic speaker verification (ASV) exhibits unsatisfactory performance under domain
mismatch conditions owing to intrinsic and extrinsic factors, such as variations in speaking …

Domain agnostic few-shot learning for speaker verification

S Yang, D Das, J Cho, H Park, S Yun - arXiv preprint arXiv:2206.13700, 2022 - arxiv.org
Deep learning models for verification systems often fail to generalize to new users and new
environments, even though they learn highly discriminative features. To address this …

Model-Agnostic Meta-Learning for Fast Text-Dependent Speaker Embedding Adaptation

W Lin, MW Mak - IEEE/ACM Transactions on Audio, Speech …, 2023 - ieeexplore.ieee.org
By constraining the lexical content of input speech, text-dependent speaker verification
(TD-SV) offers more reliable performance than text-independent speaker verification (TI-SV) …

Learning domain-invariant transformation for speaker verification

H Zhang, L Wang, KA Lee, M Liu… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Automatic speaker verification (ASV) faces domain shift caused by the mismatch of intrinsic
and extrinsic factors such as recording device and speaking style in real-world applications …

Improving Generalization Ability of Countermeasures for New Mismatch Scenario by Combining Multiple Advanced Regularization Terms

C Zeng, X Wang, X Miao, E Cooper… - arXiv preprint arXiv …, 2023 - arxiv.org
The ability of countermeasure models to generalize from seen speech synthesis methods to
unseen ones has been investigated in the ASVspoof challenge. However, a new mismatch …