Multimodal multi-channel on-line speaker diarization using sensor fusion through SVM

VP Minotto, CR Jung, B Lee - IEEE Transactions on Multimedia, 2015 - ieeexplore.ieee.org
Speaker diarization (SD) is the process of assigning speech segments of an audio stream to
their corresponding speakers, thus comprising the problem of voice activity detection (VAD) …

Structured sparsity models for reverberant speech separation

A Asaei, M Golbabaee, H Bourlard… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org
We tackle the speech separation problem through modeling the acoustics of the reverberant
chambers. Our approach exploits structured sparsity models to perform speech recovery and …

Simultaneous-speaker voice activity detection and localization using mid-fusion of SVM and HMMs

VP Minotto, CR Jung, B Lee - IEEE Transactions on Multimedia, 2014 - ieeexplore.ieee.org
Humans can extract speech signals that they need to understand from a mixture of
background noise, interfering sound sources, and reverberation for effective communication …

Ad hoc microphone array calibration: Euclidean distance matrix completion algorithm and theoretical guarantees

MJ Taghizadeh, R Parhizkar, PN Garner, H Bourlard… - Signal Processing, 2015 - Elsevier
This paper addresses the problem of ad hoc microphone array calibration where only partial
information about the distances between microphones is available. We construct a matrix …

Steered Response Power for Sound Source Localization: A Tutorial Review

E Grinstein, E Tengan, B Çakmak, T Dietzen… - arXiv preprint arXiv …, 2024 - arxiv.org
In the last three decades, the Steered Response Power (SRP) method has been widely
used for the task of Sound Source Localization (SSL), due to its satisfactory localization …

Novel GCC-PHAT model in diffuse sound field for microphone array pairwise distance based calibration

J Velasco, MJ Taghizadeh, A Asaei… - … , Speech and Signal …, 2015 - ieeexplore.ieee.org
We propose a novel formulation of the generalized cross correlation with phase transform
(GCC-PHAT) for a pair of microphones in diffuse sound field. This formulation elucidates the …

Model-based sparse component analysis for reverberant speech localization

A Asaei, H Bourlard, MJ Taghizadeh… - … on Acoustics, Speech …, 2014 - ieeexplore.ieee.org
In this paper, the problem of multiple speaker localization via speech separation based on
model-based sparse recovery is studied. We compare and contrast computational sparse …

Detection of activity and position of speakers by using deep neural networks and acoustic data augmentation

P Vecchiotti, G Pepe, E Principi, S Squartini - Expert Systems with …, 2019 - Elsevier
The task of Speaker LOCalization (SLOC) has been the focus of numerous works in
the research field, where SLOC is performed on pure speech data, requiring the presence of …

Multi-source direction-of-arrival estimation using steered response power and group-sparse optimization

E Tengan, T Dietzen, F Elvander… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org
In this paper, a method is proposed for estimating the direction of arrival (DOA) of multiple
broadband sound sources. This is achieved by solving a group-sparse optimization …

Robust distributed multi-speaker voice activity detection using stability selection for sparse non-negative feature extraction

LK Hamaidi, M Muma, AM Zoubir - 2017 25th European signal …, 2017 - ieeexplore.ieee.org
In this paper, we propose a robust multi-speaker voice activity detection approach for
wireless acoustic sensor networks (WASN). Each node of the WASN receives a mixture of …