Attention is all you need in speech separation
Recurrent Neural Networks (RNNs) have long been the dominant architecture in sequence-
to-sequence learning. RNNs, however, are inherently sequential models that do not allow …
Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis
P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …
Two heads are better than one: A two-stage complex spectral mapping approach for monaural speech enhancement
For challenging acoustic scenarios such as low signal-to-noise ratios, current speech
enhancement systems usually suffer from a performance bottleneck in extracting the target …
Glance and gaze: A collaborative learning framework for single-channel speech enhancement
The capability of the human to pay attention to both coarse and fine-grained regions has
been applied to computer vision tasks. Motivated by that, we propose a collaborative …
TF-GridNet: Integrating full- and sub-band modeling for speech separation
We propose TF-GridNet for speech separation. The model is a novel deep neural network
(DNN) integrating full- and sub-band modeling in the time-frequency (TF) domain. It stacks …
TF-GridNet: Making time-frequency domain models great again for monaural speaker separation
We propose TF-GridNet, a novel multi-path deep neural network (DNN) operating in the time-
frequency (TF) domain, for monaural talker-independent speaker separation in anechoic …
RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing
We present RemixIT, a simple yet effective self-supervised method for training speech
enhancement without the need for a single isolated in-domain speech or noise waveform …
An efficient encoder-decoder architecture with top-down attention for speech separation
Deep neural networks have shown excellent prospects in speech separation tasks.
However, obtaining good results while keeping a low model complexity remains challenging …
Speech separation using an asynchronous fully recurrent convolutional neural network
Recent advances in the design of neural network architectures, in particular those
specialized in modeling sequences, have provided significant improvements in speech …
MossFormer: Pushing the performance limit of monaural speech separation using gated single-head transformer with convolution-augmented joint self-attentions
S Zhao, B Ma - … 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
Transformer based models have provided significant performance improvements in
monaural speech separation. However, there is still a performance gap compared to a …