Active Speaker Detection using Audio, Visual and Depth Modalities: A Survey

SNAM Robi, MAZM Ariffin, MAM Izhar, N Ahmad… - IEEE …, 2024 - ieeexplore.ieee.org
The rapid progress of multimodal signal processing in recent years has cleared the way for
novel applications in human-computer interaction, surveillance, and telecommunication …

Multi-Modal Speaker Tracking Using Audio-Assisted Azure Kinect Body Tracking

MAZBM Ariffin, SNABM Robi… - … on Smart Sensors …, 2024 - ieeexplore.ieee.org
Equipped with various sensors, such as a color camera, a depth camera, and microphone
arrays, the Azure Kinect can potentially be used in multi-modal speaker tracking. By …