Personalized PercepNet: Real-time, low-complexity target voice separation and enhancement

R Giri, S Venkataramani, JM Valin, U Isik… - arXiv preprint arXiv …, 2021 - arxiv.org
The presence of multiple talkers in the surrounding environment poses a difficult challenge
for real-time speech communication systems considering the constraints on network size …

Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation

UH Shin, S Lee, T Kim, HM Park - arXiv preprint arXiv:2406.05983, 2024 - arxiv.org
Since the success of a time-domain speech separation, further improvements have been
made by expanding the length and channel of a feature sequence to increase the amount of …

Boosting the Performance of SpEx+ by Attention and Contextual Mechanism

C Li, Z Wu, W Rao, Y Wang… - 2022 13th International …, 2022 - ieeexplore.ieee.org
Target speaker extraction (TSE) aims to mimic human selective attention to extracting our
interested voice from the multi-talker environment. Time-domain methods represented by …

Directional and Qualitative Feature Classification for Speaker Diarization with Dual Microphone Arrays.

S Astapov, D Popov, V Kabarov - MICSECS, 2020 - ceur-ws.org
Automatic meeting transcription has long been one of the common applications for natural
language processing methods. The quality of automatic meeting transcription for the cases …