The cone of silence: Speech separation by localization

T Jenrungrot, V Jayaram, S Seitz… - Advances in …, 2020 - proceedings.neurips.cc
Given a multi-microphone recording of an unknown number of speakers talking
concurrently, we simultaneously localize the sources and separate the individual speakers …

Unsupervised speech enhancement based on multichannel NMF-informed beamforming for noise-robust automatic speech recognition

K Shimada, Y Bando, M Mimura… - … on Audio, Speech …, 2019 - ieeexplore.ieee.org
This paper describes multichannel speech enhancement for improving automatic speech
recognition (ASR) in noisy environments. Recently, the minimum variance distortionless …

Semi-supervised multichannel speech enhancement with a deep speech prior

K Sekiguchi, Y Bando, AA Nugraha… - … on Audio, Speech …, 2019 - ieeexplore.ieee.org
This paper describes a semi-supervised multichannel speech enhancement method that
uses clean speech data for prior training. Although multichannel nonnegative matrix …

Bayesian multichannel speech enhancement with a deep speech prior

K Sekiguchi, Y Bando, K Yoshii… - 2018 Asia-Pacific …, 2018 - ieeexplore.ieee.org
This paper describes statistical multichannel speech enhancement based on a deep
generative model of speech spectra. Recently, deep neural networks (DNNs) have widely …

Minimum-volume multichannel nonnegative matrix factorization for blind audio source separation

J Wang, S Guan, S Liu, XL Zhang - IEEE/ACM Transactions on …, 2021 - ieeexplore.ieee.org
Multichannel blind audio source separation aims to recover the latent sources from their
multichannel mixtures without supervised information. One state-of-the-art blind audio …

Sound source localization using relative harmonic coefficients in modal domain

Y Hu, PN Samarasinghe… - 2019 IEEE Workshop on …, 2019 - ieeexplore.ieee.org
This paper proposes a data-driven source localization approach under a noisy and
reverberant environment, using a newly defined feature named relative harmonic …

ICA and IVA bounded multivariate generalized Gaussian mixture based hidden Markov models

AH Al-gumaei, M Azam, M Amayri… - Engineering Applications of …, 2023 - Elsevier
Machine learning (ML), a branch of artificial intelligence (AI), is an area of
computational science that is concerned with the analysis and interpretation of patterns and …

Deep ad-hoc beamforming based on speaker extraction for target-dependent speech separation

Z Yang, S Guan, XL Zhang - Speech Communication, 2022 - Elsevier
Recently, the research on ad-hoc microphone arrays with deep learning has drawn much
attention, especially in speech enhancement and separation. An ad-hoc microphone array …

Synchronization of microphones based on rank minimization of warped spectrum for asynchronous distributed recording

K Itoyama, K Nakadai - 2020 IEEE/RSJ International …, 2020 - ieeexplore.ieee.org
This paper describes a new method for synchronizing microphones based on spectral
warping in an asynchronous microphone array. In an audio signal observed by an …

Single channel multi-speaker speech separation based on quantized ratio mask and residual network

S Ke, R Hu, X Wang, T Wu, G Li, Z Wang - Multimedia Tools and …, 2020 - Springer
The recently-proposed deep clustering-based algorithms represent a fundamental advance
towards the single-channel multi-speaker speech separation problem. These methods use …