Audio-visual deep neural network for robust person verification

Y Qian, Z Chen, S Wang - IEEE/ACM Transactions on Audio …, 2021 - ieeexplore.ieee.org
Voice and face are two most popular biometrics for person verification, usually used in
speaker verification and face verification tasks. It has already been observed that simply …

Validations of an alpha version of the E3 Forensic Speech Science System (E3FS3) core software tools

P Weber, E Enzinger, B Labrador… - Forensic Science …, 2022 - Elsevier
This paper reports on validations of an alpha version of the E3 Forensic Speech Science
System (E3FS3) core software tools. This is an open-code human-supervised-automatic …

A Method of Audio-Visual Person Verification by Mining Connections between Time Series

P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - Proc …, 2023 - isca-archive.org
It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. But the relationship of keyframes in time series between …

Validation of an ECAPA-TDNN system for Forensic Automatic Speaker Recognition under case work conditions

F Sigona, M Grimaldi - Speech Communication, 2024 - Elsevier
In this work, we tested different variants of a Forensic Automatic Speaker Recognition
(FASR) system based on Emphasized Channel Attention, Propagation and Aggregation in …

Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention

RG Praveen, J Alam - arXiv preprint arXiv:2403.04654, 2024 - arxiv.org
Person or identity verification has been recently gaining a lot of attention using audio-visual
fusion as faces and voices share close associations with each other. Conventional …

CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition

L Li, X Li, H Jiang, C Chen, R Hou, D Wang - arXiv preprint arXiv …, 2023 - arxiv.org
Audio-visual person recognition (AVPR) has received extensive attention. However, most
datasets used for AVPR research so far are collected in constrained environments, and thus …

Audio–Visual Fusion Based on Interactive Attention for Person Verification

X Jing, L He, Z Song, S Wang - Sensors, 2023 - mdpi.com
With the rapid development of multimedia technology, personnel verification systems have
become increasingly important in the security field and identity verification. However …

Learning Audio-Visual embedding for Person Verification in the Wild

P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. Here, we proposed a novel audio-visual strategy that …

Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization

R Tao, Z Shi, Y Jiang, DT Truong, ES Chng… - arXiv preprint arXiv …, 2024 - arxiv.org
The human brain has the capability to associate the unknown person's voice and face by
leveraging their general relationship, referred to as "cross-modal speaker verification". This …

Dynamic Cross Attention for Audio-Visual Person Verification

RG Praveen, J Alam - arXiv preprint arXiv:2403.04661, 2024 - arxiv.org
Although person or identity verification has been predominantly explored using individual
modalities such as face and voice, audio-visual fusion has recently shown immense …