Mixing audiovisual speech processing and blind source separation for the extraction of speech...

D Michelsanti, ZH Tan, SX Zhang, Y Xu… - … on Audio, Speech …, 2021 - ieeexplore.ieee.org

Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …

Cited by 291 - Related articles - All 6 versions

[PDF] thecvf.com

Learning to separate object sounds by watching unlabeled video

R Gao, R Feris, K Grauman - Proceedings of the European …, 2018 - openaccess.thecvf.com

Perceiving a scene most fully requires all the senses. Yet modeling how objects look and
sound is challenging: most natural scenes and events contain multiple objects, and the …

Cited by 320 - Related articles - All 14 versions

[PDF] arxiv.org

Audio-visual speech enhancement using multimodal deep convolutional neural networks

JC Hou, SS Wang, YH Lai, Y Tsao… - … on Emerging Topics …, 2018 - ieeexplore.ieee.org

Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques
focus only on addressing audio information. In this paper, inspired by multimodal learning …

Cited by 268 - Related articles - All 12 versions

[PDF] hal.science

Audiovisual speech source separation: An overview of key methodologies

B Rivet, W Wang, SM Naqvi… - IEEE Signal Processing …, 2014 - ieeexplore.ieee.org

The separation of speech signals measured at multiple microphones in noisy and
reverberant environments using only the audio modality has limitations because there is …

Cited by 89 - Related articles - All 13 versions

[PDF] psu.edu

Audiovisual fusion: Challenges and new approaches

AK Katsaggelos, S Bahaadini… - Proceedings of the …, 2015 - ieeexplore.ieee.org

In this paper, we review recent results on audiovisual (AV) fusion. We also discuss some of
the challenges and report on approaches to address them. One important issue in AV fusion …

Cited by 148 - Related articles - All 8 versions

[PDF] arxiv.org

Audio-visual speaker diarization based on spatiotemporal Bayesian fusion

ID Gebru, S Ba, X Li, R Horaud - IEEE transactions on pattern …, 2017 - ieeexplore.ieee.org

Speaker diarization consists of assigning speech signals to people engaged in a dialogue.
An audio-visual spatiotemporal diarization model is proposed. The model is well suited for …

Cited by 127 - Related articles - All 12 versions

[PDF] ieee.org

Audio–visual deep clustering for speech separation

R Lu, Z Duan, C Zhang - IEEE/ACM Transactions on Audio …, 2019 - ieeexplore.ieee.org

Speech separation aims to separate individual voices from an audio mixture of multiple
simultaneous talkers. Audio-only approaches show unsatisfactory performance when the …

Cited by 55 - Related articles - All 7 versions

[PDF] sagepub.com

Dynamic key-updating: Privacy-preserving authentication for RFID systems

L Lu, J Han, L Hu, LM Ni - International Journal of …, 2012 - journals.sagepub.com

The objective of private authentication for Radio Frequency Identification (RFID) systems is
to allow valid readers to explicitly authenticate their dominated tags without leaking the …

Cited by 164 - Related articles - All 21 versions

[PDF] ieee.org

Listen and look: Audio–visual matching assisted speech source separation

R Lu, Z Duan, C Zhang - IEEE Signal Processing Letters, 2018 - ieeexplore.ieee.org

Source permutation, i.e., assigning separated signal snippets to wrong sources over time, is a
major issue in the state-of-the-art speaker-independent speech source separation methods …

Cited by 56 - Related articles - All 6 versions

[PDF] hal.science

Blind audiovisual source separation based on sparse redundant representations

AL Casanovas, G Monaci… - IEEE Transactions …, 2010 - ieeexplore.ieee.org

In this paper, we propose a novel method which is able to detect and separate audiovisual
sources present in a scene. Our method exploits the correlation between the video signal …

Cited by 91 - Related articles - All 19 versions