An integrated framework for multi-channel multi-source localization and voice activity detection

VP Minotto, CR Jung, B Lee - IEEE Transactions on Multimedia, 2015 - ieeexplore.ieee.org

Speaker diarization (SD) is the process of assigning speech segments of an audio stream to
their corresponding speakers, thus comprising the problem of voice activity detection (VAD) …

Cited by 48 Related articles All 4 versions

Structured sparsity models for reverberant speech separation

A Asaei, M Golbabaee, H Bourlard… - IEEE/ACM Transactions …, 2014 - ieeexplore.ieee.org

We tackle the speech separation problem through modeling the acoustics of the reverberant
chambers. Our approach exploits structured sparsity models to perform speech recovery and …

Cited by 61 Related articles All 7 versions

Simultaneous-speaker voice activity detection and localization using mid-fusion of SVM and HMMs

VP Minotto, CR Jung, B Lee - IEEE Transactions on Multimedia, 2014 - ieeexplore.ieee.org

Humans can extract speech signals that they need to understand from a mixture of
background noise, interfering sound sources, and reverberation for effective communication …

Cited by 42 Related articles All 8 versions

Ad hoc microphone array calibration: Euclidean distance matrix completion algorithm and theoretical guarantees

MJ Taghizadeh, R Parhizkar, PN Garner, H Bourlard… - Signal Processing, 2015 - Elsevier

This paper addresses the problem of ad hoc microphone array calibration where only partial
information about the distances between microphones is available. We construct a matrix …

Cited by 36 Related articles All 15 versions

Steered Response Power for Sound Source Localization: A Tutorial Review

E Grinstein, E Tengan, B Çakmak, T Dietzen… - arXiv preprint arXiv …, 2024 - arxiv.org

In the last three decades, the Steered Response Power (SRP) method has been widely
used for the task of Sound Source Localization (SSL), due to its satisfactory localization …

Novel GCC-PHAT model in diffuse sound field for microphone array pairwise distance based calibration

J Velasco, MJ Taghizadeh, A Asaei… - … , Speech and Signal …, 2015 - ieeexplore.ieee.org

We propose a novel formulation of the generalized cross correlation with phase transform
(GCC-PHAT) for a pair of microphones in diffuse sound field. This formulation elucidates the …

Cited by 28 Related articles All 11 versions

Model-based sparse component analysis for reverberant speech localization

A Asaei, H Bourlard, MJ Taghizadeh… - … on Acoustics, Speech …, 2014 - ieeexplore.ieee.org

In this paper, the problem of multiple speaker localization via speech separation based on
model-based sparse recovery is studied. We compare and contrast computational sparse …

Cited by 36 Related articles All 12 versions

Detection of activity and position of speakers by using deep neural networks and acoustic data augmentation

P Vecchiotti, G Pepe, E Principi, S Squartini - Expert Systems with …, 2019 - Elsevier

The task of Speaker LOCalization (SLOC) has been the focus of numerous works in
the research field, where SLOC is performed on pure speech data, requiring the presence of …

Cited by 19 Related articles All 7 versions

Multi-source direction-of-arrival estimation using steered response power and group-sparse optimization

E Tengan, T Dietzen, F Elvander… - … /ACM Transactions on …, 2024 - ieeexplore.ieee.org

In this paper, a method is proposed for estimating the direction of arrival (DOA) of multiple
broadband sound sources. This is achieved by solving a group-sparse optimization …

Robust distributed multi-speaker voice activity detection using stability selection for sparse non-negative feature extraction

LK Hamaidi, M Muma, AM Zoubir - 2017 25th European signal …, 2017 - ieeexplore.ieee.org

In this paper, we propose a robust multi-speaker voice activity detection approach for
wireless acoustic sensor networks (WASN). Each node of the WASN receives a mixture of …

Cited by 16 Related articles All 5 versions