Audio-visual deep neural network for robust person verification
Voice and face are two most popular biometrics for person verification, usually used in
speaker verification and face verification tasks. It has already been observed that simply …
Validations of an alpha version of the E3 Forensic Speech Science System (E3FS3) core software tools
This paper reports on validations of an alpha version of the E3 Forensic Speech Science
System (E3FS3) core software tools. This is an open-code human-supervised-automatic …
A Method of Audio-Visual Person Verification by Mining Connections between Time Series
P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - Proc …, 2023 - isca-archive.org
It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. But the relationship of keyframes in time series between …
Validation of an ECAPA-TDNN system for Forensic Automatic Speaker Recognition under case work conditions
F Sigona, M Grimaldi - Speech Communication, 2024 - Elsevier
In this work, we tested different variants of a Forensic Automatic Speaker Recognition
(FASR) system based on Emphasized Channel Attention, Propagation and Aggregation in …
Audio-Visual Person Verification based on Recursive Fusion of Joint Cross-Attention
RG Praveen, J Alam - arXiv preprint arXiv:2403.04654, 2024 - arxiv.org
Person or identity verification has been recently gaining a lot of attention using audio-visual
fusion as faces and voices share close associations with each other. Conventional …
CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition
Audio-visual person recognition (AVPR) has received extensive attention. However, most
datasets used for AVPR research so far are collected in constrained environments, and thus …
Audio–Visual Fusion Based on Interactive Attention for Person Verification
With the rapid development of multimedia technology, personnel verification systems have
become increasingly important in the security field and identity verification. However …
Learning Audio-Visual embedding for Person Verification in the Wild
P Sun, S Zhang, Z Liu, Y Yuan, T Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
It has already been observed that audio-visual embedding is more robust than uni-modality
embedding for person verification. Here, we proposed a novel audio-visual strategy that …
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization
The human brain has the capability to associate the unknown person's voice and face by
leveraging their general relationship, referred to as "cross-modal speaker verification". This …
Dynamic Cross Attention for Audio-Visual Person Verification
RG Praveen, J Alam - arXiv preprint arXiv:2403.04661, 2024 - arxiv.org
Although person or identity verification has been predominantly explored using individual
modalities such as face and voice, audio-visual fusion has recently shown immense …