An overview of speech endpoint detection algorithms

T Zhang, Y Shao, Y Wu, Y Geng, L Fan - Applied Acoustics, 2020 - Elsevier
Speech endpoint detection is an important part of modern speech information processing
technology. The success of endpoint detection directly improves the performance and …

Voice activity detection based on multiple statistical models

JH Chang, NS Kim, SK Mitra - IEEE Transactions on Signal …, 2006 - ieeexplore.ieee.org
One of the key issues in practical speech processing is to achieve robust voice activity
detection (VAD) against the background noise. Most of the statistical model-based …

Voice activity detection in nonstationary noise

SG Tanyer, H Ozer - IEEE Transactions on speech and audio …, 2000 - ieeexplore.ieee.org
A new fusion method for voice activity detection in additive nonstationary noise is suggested.
A performance study of the methods: fusion, the geometrically adaptive energy level …

Robust endpoint detection and energy normalization for real-time speech and speaker recognition

Q Li, J Zheng, A Tsai, Q Zhou - IEEE Transactions on Speech …, 2002 - ieeexplore.ieee.org
When automatic speech recognition (ASR) and speaker verification (SV) are applied in
adverse acoustic environments, endpoint detection and energy normalization can be crucial …

Voice activity detection in the wild: A data-driven approach using teacher-student training

H Dinkel, S Wang, X Xu, M Wu… - IEEE/ACM Transactions …, 2021 - ieeexplore.ieee.org
Voice activity detection is an essential pre-processing component for speech-related tasks
such as automatic speech recognition (ASR). Traditional supervised VAD systems obtain …

Enhanced blind source separation algorithm for highly correlated mixtures

S Wang, D Ramakrishnan, S Gupta… - US Patent 8,223,988, 2012 - Google Patents
An enhanced blind source separation technique is provided to improve separation of highly
correlated signal mixtures. A beamforming algorithm is used to precondition correlated first …

Audio replay spoof attack detection by joint segment-based linear filter bank feature extraction and attention-enhanced DenseNet-BiLSTM network

L Huang, CM Pun - IEEE/ACM Transactions on Audio, Speech …, 2020 - ieeexplore.ieee.org
Most automatic speaker verification (ASV) systems are vulnerable to various spoofing
attacks. In recent years, many methods have been proposed for detecting …

Apparatus and method of noise and echo reduction in multiple microphone audio systems

S Wang, SK Gupta, ELT Choy - US Patent 8,175,871, 2012 - Google Patents
… variety of noise suppression techniques … that can be
selectively applied to signals received using multiple microphones …

A comparative study of robustness of deep learning approaches for VAD

S Tong, H Gu, K Yu - 2016 IEEE International Conference on …, 2016 - ieeexplore.ieee.org
Voice activity detection (VAD) is an important step for real-world automatic speech
recognition (ASR) systems. Deep learning approaches, such as DNN, RNN or CNN, have …

An end-to-end architecture for keyword spotting and voice activity detection

C Lengerich, A Hannun - arXiv preprint arXiv:1611.09405, 2016 - arxiv.org
We propose a single neural network architecture for two tasks: on-line keyword spotting and
voice activity detection. We develop novel inference algorithms for an end-to-end Recurrent …