Convolutional neural networks for small-footprint keyword spotting.

TN Sainath, C Parada - Interspeech, 2015 - isca-archive.org
We explore using Convolutional Neural Networks (CNNs) for a small-footprint
keyword spotting (KWS) task. CNNs are attractive for KWS since they have been shown to …

Deep neural network approaches to speaker and language recognition

F Richardson, D Reynolds… - IEEE signal processing …, 2015 - ieeexplore.ieee.org
The impressive gains in performance obtained using deep neural networks (DNNs) for
automatic speech recognition (ASR) have motivated the application of DNNs to other …

A review on deep learning approaches in speaker identification

SS Tirumala, SR Shahamiri - … of the 8th international conference on …, 2016 - dl.acm.org
Deep learning (DL) is becoming an increasingly interesting and powerful machine learning
method with successful applications in many domains, such as natural language …

The language-independent bottleneck features

K Veselý, M Karafiát, F Grézl, M Janda… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org
In this paper we present a novel language-independent bottleneck (BN) feature extraction
framework. In our experiments we have used Multilingual Artificial Neural Network (ANN) …

A unified deep neural network for speaker and language recognition

F Richardson, D Reynolds, N Dehak - arXiv preprint arXiv:1504.00923, 2015 - arxiv.org
Learned feature representations and sub-phoneme posteriors from Deep Neural Networks
(DNNs) have been used separately to produce significant performance gains for speaker …

Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard

Z Tüske, G Saon, K Audhkhasi, B Kingsbury - arXiv preprint arXiv …, 2020 - arxiv.org
It is generally believed that direct sequence-to-sequence (seq2seq) speech recognition
models are competitive with hybrid models only when a large amount of data, at least a …

Neural Network Bottleneck Features for Language Identification.

P Matejka, L Zhang, T Ng, O Glembek, JZ Ma… - Odyssey, 2014 - isca-archive.org
This paper presents the application of Neural Network Bottleneck (BN) features in Language
Identification (LID). BN features are generally used for Large Vocabulary Speech …

Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model

S Shon, H Tang, J Glass - 2018 IEEE Spoken Language …, 2018 - ieeexplore.ieee.org
In this paper, we propose a Convolutional Neural Network (CNN) based speaker recognition
model for extracting robust speaker embeddings. The embedding can be extracted …

A near real-time automatic speaker recognition architecture for voice-based user interface

P Dhakal, P Damacharla, AY Javaid… - Machine learning and …, 2019 - mdpi.com
In this paper, we present a novel pipelined near real-time speaker recognition architecture
that enhances the performance of speaker recognition by exploiting the advantages of …

Multilingually trained bottleneck features in spoken language recognition

R Fer, P Matějka, F Grézl, O Plchot, K Veselý… - Computer Speech & …, 2017 - Elsevier
Multilingual training of neural networks has proven to be a simple yet effective way to deal with
multilingual training corpora. It allows several resources to be used to jointly train a language …