A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

SUPERB @ SLT 2022: Challenge on generalization and efficiency of self-supervised speech representation learning

T Feng, A Dong, CF Yeh, S Yang, TQ Lin… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised
speech representation for better performance, generalization, and efficiency. The challenge …

QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning

H Guo, F Xie, J Kang, Y Xiao, X Wu… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
This paper proposes a novel semi-supervised TTS framework, QS-TTS, to improve TTS
quality with lower supervised data requirements via Vector-Quantized Self-Supervised …

Improving self-supervised learning model for audio spoofing detection with layer-conditioned embedding fusion

S Sinha, S Dey, G Saha - Computer Speech & Language, 2024 - Elsevier
The application of voice recognition systems has increased by a great deal with technology.
This has allowed adversaries to falsely claim access to these systems by spoofing the …

CTCBERT: Advancing hidden-unit BERT with CTC objectives

R Fan, Y Wang, Y Gaur, J Li - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
In this work, we present a simple but effective method, CTCBERT, for advancing hidden-unit
BERT (HuBERT). HuBERT applies a frame-level cross-entropy (CE) loss, which is similar to …