A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Music source separation with band-split RNN

Y Luo, J Yu - IEEE/ACM Transactions on Audio, Speech, and …, 2023 - ieeexplore.ieee.org
The performance of music source separation (MSS) models has been greatly improved in
recent years thanks to the development of novel neural network architectures and training …

Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural network (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement

S Leglaive, L Borne, E Tzinis, M Sadeghi… - arXiv preprint arXiv …, 2023 - arxiv.org
Supervised speech enhancement models are trained using artificially generated mixtures of
clean speech and noise signals, which may not match real-world recording conditions at test …

The Intel neuromorphic DNS challenge

J Timcheck, SB Shrestha, DBD Rubin… - Neuromorphic …, 2023 - iopscience.iop.org
A critical enabler for progress in neuromorphic computing research is the ability to
transparently evaluate different neuromorphic solutions on important tasks and to compare …

UNSSOR: unsupervised neural speech separation by leveraging over-determined training mixtures

ZQ Wang, S Watanabe - Advances in Neural Information …, 2024 - proceedings.neurips.cc
In reverberant conditions with multiple concurrent speakers, each microphone acquires a
mixture signal of multiple speakers at a different location. In over-determined conditions …

Exploring WavLM on speech enhancement

H Song, S Chen, Z Chen, Y Wu… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
There is a surge in interest in self-supervised learning approaches for end-to-end speech
encoding in recent years as they have achieved great success. Especially, WavLM showed …

Efficient monaural speech enhancement with universal sample rate band-split RNN

J Yu, Y Luo - … 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
While recent developments on the design of neural networks have greatly advanced the
state-of-the-art of speech enhancement and separation systems, practical applications of …

TokenSplit: Using discrete speech representations for direct, refined, and transcript-conditioned speech separation and recognition

H Erdogan, S Wisdom, X Chang, Z Borsos… - arXiv preprint arXiv …, 2023 - arxiv.org
We present TokenSplit, a speech separation model that acts on discrete token sequences.
The model is trained on multiple tasks simultaneously: separate and transcribe each speech …

Speech separation with large-scale self-supervised learning

Z Chen, N Kanda, J Wu, Y Wu, X Wang… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Self-supervised learning (SSL) methods such as WavLM have shown promising speech
separation (SS) results in small-scale simulation-based experiments. In this work, we extend …