Is someone speaking? exploring long-term temporal features for audio-visual active speaker detection R Tao, Z Pan, RK Das, X Qian, MZ Shou, H Li Proceedings of the 29th ACM international conference on multimedia, 3927-3935, 2021 | 151 | 2021 |
Multi-speaker tracking from an audio–visual sensing device X Qian, A Brutti, O Lanz, M Omologo, A Cavallaro IEEE Transactions on Multimedia 21 (10), 2576-2588, 2019 | 55 | 2019 |
3D audio-visual speaker tracking with an adaptive particle filter X Qian, A Brutti, M Omologo, A Cavallaro 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017 | 40 | 2017 |
Seeing what you said: Talking face generation guided by a lip reading expert J Wang, X Qian, M Zhang, RT Tan, H Li Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 38 | 2023 |
Multi-target DoA estimation with an audio-visual fusion mechanism X Qian, M Madhavi, Z Pan, J Wang, H Li ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 37 | 2021 |
3D mouth tracking from a compact microphone array co-located with a camera X Qian, A Xompero, A Cavallaro, A Brutti, O Lanz, M Omologo 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 24 | 2018 |
Audio-visual tracking of concurrent speakers X Qian, A Brutti, O Lanz, M Omologo, A Cavallaro IEEE Transactions on Multimedia 24, 942-954, 2021 | 23 | 2021 |
A time-frequency attention module for neural speech enhancement Q Zhang, X Qian, Z Ni, A Nicolson, E Ambikairajah, H Li IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 462-475, 2022 | 21 | 2022 |
Audio-visual cross-attention network for robotic speaker tracking X Qian, Z Wang, J Wang, G Guan, H Li IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 550-562, 2022 | 18 | 2022 |
Speaker extraction with co-speech gestures cue Z Pan, X Qian, H Li IEEE Signal Processing Letters 29, 1467-1471, 2022 | 17 | 2022 |
GCC-PHAT with speech-oriented attention for robotic sound source localization J Wang, X Qian, Z Pan, M Zhang, H Li 2021 IEEE International Conference on Robotics and Automation (ICRA), 5876-5883, 2021 | 13 | 2021 |
Deep audio-visual beamforming for speaker localization X Qian, Q Zhang, G Guan, W Xue IEEE Signal Processing Letters 29, 1132-1136, 2022 | 12 | 2022 |
L³ F-TOUCH: A Wireless GelSight with Decoupled Tactile and Three-axis Force Sensing W Li, M Wang, J Li, Y Su, DK Jha, X Qian, K Althoefer, H Liu IEEE Robotics and Automation Letters, 2023 | 10 | 2023 |
Predict-and-update network: Audio-visual speech recognition inspired by human speech perception J Wang, X Qian, H Li arXiv preprint arXiv:2209.01768, 2022 | 10 | 2022 |
Speech-oriented sparse attention denoising for voice user interface toward industry 5.0 H Zhu, Q Zhang, P Gao, X Qian IEEE Transactions on Industrial Informatics 19 (2), 2151-2160, 2022 | 8 | 2022 |
LOCATA challenge: speaker localization with a planar array X Qian, A Cavallaro, A Brutti, M Omologo arXiv preprint arXiv:1901.08983, 2019 | 7 | 2019 |
Is someone speaking R Tao, Z Pan, RK Das, X Qian, MZ Shou, H Li Proceedings of the 29th ACM International Conference on Multimedia, Oct, 2021 | 6 | 2021 |
Three-dimensional speaker localization: audio-refined visual scaling factor estimation X Qian, Q Liu, J Wang, H Li IEEE Signal Processing Letters 28, 1405-1409, 2021 | 6 | 2021 |
Audio-Visual Multi-Speaker Tracking Based on the GLMB Framework. S Lin, X Qian INTERSPEECH, 3082-3086, 2020 | 6 | 2020 |
Neural-free attention for monaural speech enhancement towards voice user interface for consumer electronics M Chen, Q Zhang, Q Song, X Qian, R Guo, M Wang, D Chen IEEE Transactions on Consumer Electronics, 2023 | 5 | 2023 |