A time-frequency attention module for neural speech enhancement

N Saleem, TS Gunawan, M Kartiwi, BS Nugroho… - IEEE …, 2023 - ieeexplore.ieee.org

Speech enhancement (SE) is a critical aspect of various speech-processing applications.
Recent research in this field focuses on identifying effective ways to capture the long-term …

Cited by 7

Restoring Speaking Lips from Occlusion for Audio-Visual Speech Recognition

J Wang, Z Pan, M Zhang, RT Tan, H Li - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Prior studies on audio-visual speech recognition typically assume the visibility of speaking
lips, ignoring the fact that visual occlusion occurs in real-world videos, thus adversely …

Cited by 1

Multi-attention bottleneck for gated convolutional encoder-decoder-based speech enhancement

N Saleem, TS Gunawan, M Shafi, S Bourouis… - IEEE …, 2023 - ieeexplore.ieee.org

Convolutional encoder-decoder (CED) has emerged as a powerful architecture, particularly
in speech enhancement (SE), which aims to improve the intelligibility and quality and …

Cited by 5

An empirical study on the impact of positional encoding in transformer-based monaural speech enhancement

Q Zhang, M Ge, H Zhu, E Ambikairajah… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Transformer architecture has enabled recent progress in speech enhancement. Since
Transformers are position-agnostic, positional encoding is the de facto standard component …

Cited by 3

Multi-stage progressive learning-based speech enhancement using time–frequency attentive squeezed temporal convolutional networks

C Jannu, SD Vanambathina - Circuits, Systems, and Signal Processing, 2023 - Springer

Speech enhancement is an important method for improving speech quality and intelligibility
in noisy environments. An effective speech enhancement model depends on precise …

Cited by 4

Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection

C Fan, M Ding, J Tao, R Fu, J Yi… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Most research in synthetic speech detection (SSD) focuses on improving performance on
standard noise-free datasets. However, in actual situations, noise interference is usually …

Cited by 1

Ripple sparse self-attention for monaural speech enhancement

Q Zhang, H Zhu, Q Song, X Qian… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

The use of Transformer represents a recent success in speech enhancement. However, as
its core component, self-attention suffers from quadratic complexity, which is computationally …

Cited by 4

Mamba in Speech: Towards an Alternative to Self-Attention

X Zhang, Q Zhang, H Liu, T Xiao, X Qian… - arXiv preprint arXiv …, 2024 - arxiv.org

Transformer and its derivatives have achieved success in diverse tasks across computer
vision, natural language processing, and speech processing. To reduce the complexity of …

Cited by 1

A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement

BJ Borgström, MS Brandstein - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org

Neural network approaches to single-channel speech enhancement have received much
recent attention. In particular, mask-based architectures have achieved significant …

Cited by 1

Monaural speech separation method based on recurrent attention with parallel branches

X Yang, C Bao, X Zhang, X Chen - Proc. Interspeech, 2023 - drive.google.com

In many speech separation methods, the contextual information contained in the feature
sequence is mainly modeled by recurrent layer and/or self-attention mechanism. However …

Cited by 3