WASD: A Wilder Active Speaker Detection Dataset

T Roxo, JC Costa, PRM Inácio… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

State-of-the-art Active Speaker Detection (ASD) approaches heavily rely on audio and facial
features to perform, which is not a sustainable approach in wild scenarios. Although these …

Audio-Visual Speaker Diarization: Current Databases, Approaches and Challenges

V Mingote, A Ortega, A Miguel, E Lleida - arXiv preprint arXiv:2409.05659, 2024 - arxiv.org

Nowadays, the large amount of audio-visual content available has fostered the need to
develop new robust automatic speaker diarization systems to analyse and characterise it …

FabuLight-ASD: unveiling speech activity via body language

H Carneiro, S Wermter - Neural Computing and Applications, 2024 - Springer

Active speaker detection (ASD) in multimodal environments is crucial for various
applications, from video conferencing to human-robot interaction. This paper introduces …

ASDnB: Merging Face with Body Cues For Robust Active Speaker Detection

T Roxo, JC Costa, P Inácio, H Proença - arXiv preprint arXiv:2412.08594, 2024 - arxiv.org

State-of-the-art Active Speaker Detection (ASD) approaches mainly use audio and facial
features as input. However, the main hypothesis in this paper is that body dynamics is also …

How to Squeeze An Explanation Out of Your Model

T Roxo, JC Costa, PRM Inácio, H Proença - arXiv preprint arXiv …, 2024 - arxiv.org

Deep learning models are widely used nowadays for their reliability in performing various
tasks. However, they do not typically provide the reasoning behind their decision, which is a …

Detección de actividad del habla en vídeos

JM Acosta Triana - 2023 - riunet.upv.es

Speech activity detection in videos consists of identifying the face of the
person who is speaking at each moment of the scene. This challenge has various …