Speech2Face: Learning the face behind a voice
How much can we infer about a person's looks from the way they speak? In this paper, we
study the task of reconstructing a facial image of a person from a short audio recording of …
Voice-face homogeneity tells deepfake
Detecting forgery videos is highly desirable due to the abuse of deepfake. Existing detection
approaches contribute to exploring the specific artifacts in deepfake videos and fit well on …
Audio-visual deep neural network for robust person verification
Voice and face are two most popular biometrics for person verification, usually used in
speaker verification and face verification tasks. It has already been observed that simply …
EmoMV: Affective music-video correspondence learning datasets for classification and retrieval
HTP Thao, G Roig, D Herremans - Information Fusion, 2023 - Elsevier
Studies in affective audio–visual correspondence learning require ground-truth data to train,
validate, and test models. The number of available datasets together with benchmarks …
Recent advances and challenges in deep audio-visual correlation learning
Audio-visual correlation learning aims to capture essential correspondences and
understand natural phenomena between audio and video. With the rapid growth of deep …
Disentangled representation learning for cross-modal biometric matching
Cross-modal biometric matching (CMBM) aims to determine the corresponding voice from a
face, or identify the corresponding face from a voice. Recently, many CMBM methods have …
Seeking the shape of sound: An adaptive framework for learning voice-face association
Nowadays, we have witnessed the early progress on learning the association between
voice and face automatically, which brings a new wave of studies to the computer vision …
Audio-visual speaker recognition with a cross-modal discriminative network
Audio-visual speaker recognition is one of the tasks in the recent 2019 NIST speaker
recognition evaluation (SRE). Studies in neuroscience and computer science all point to the …
Noise-tolerant audio-visual online person verification using an attention-based neural network fusion
In this paper, we present a multi-modal online person verification system using both speech
and visual signals. Inspired by neuroscientific findings on the association of voice and face …
Cross-modal speaker verification and recognition: A multilingual perspective
Recent years have seen a surge in finding association between faces and voices within a
cross-modal biometric application along with speaker recognition. Inspired from this, we …