Convolutive bottleneck network features for LVCSR

TN Sainath, C Parada - Interspeech, 2015 - isca-archive.org

Abstract We explore using Convolutional Neural Networks (CNNs) for a small-footprint
keyword spotting (KWS) task. CNNs are attractive for KWS since they have been shown to …

Cited by 662 Related articles All 9 versions

[PDF] mit.edu

Deep neural network approaches to speaker and language recognition

F Richardson, D Reynolds… - IEEE signal processing …, 2015 - ieeexplore.ieee.org

The impressive gains in performance obtained using deep neural networks (DNNs) for
automatic speech recognition (ASR) have motivated the application of DNNs to other …

Cited by 564 Related articles All 6 versions

[PDF] researchgate.net

A review on deep learning approaches in speaker identification

SS Tirumala, SR Shahamiri - … of the 8th international conference on …, 2016 - dl.acm.org

Deep learning (DL) is becoming an increasingly interesting and powerful machine learning
method with successful applications in many domains, such as natural language …

Cited by 64 Related articles All 2 versions

[PDF] academia.edu

The language-independent bottleneck features

K Veselý, M Karafiát, F Grézl, M Janda… - 2012 IEEE Spoken …, 2012 - ieeexplore.ieee.org

In this paper we present novel language-independent bottleneck (BN) feature extraction
framework. In our experiments we have used Multilingual Artificial Neural Network (ANN) …

Cited by 255 Related articles All 8 versions

[PDF] arxiv.org

A unified deep neural network for speaker and language recognition

F Richardson, D Reynolds, N Dehak - arXiv preprint arXiv:1504.00923, 2015 - arxiv.org

Learned feature representations and sub-phoneme posteriors from Deep Neural Networks
(DNNs) have been used separately to produce significant performance gains for speaker …

Cited by 209 Related articles All 11 versions

[PDF] arxiv.org

Single headed attention based sequence-to-sequence model for state-of-the-art results on switchboard

Z Tüske, G Saon, K Audhkhasi, B Kingsbury - arXiv preprint arXiv …, 2020 - arxiv.org

It is generally believed that direct sequence-to-sequence (seq2seq) speech recognition
models are competitive with hybrid models only when a large amount of data, at least a …

Cited by 82 Related articles All 8 versions

[PDF] isca-archive.org

[PDF] Neural Network Bottleneck Features for Language Identification.

P Matejka, Le Zhang, T Ng, O Glembek, JZ Ma… - Odyssey, 2014 - isca-archive.org

This paper presents the application of Neural Network Bottleneck (BN) features in Language
Identification (LID). BN features are generally used for Large Vocabulary Speech …

Cited by 174 Related articles All 10 versions

[PDF] arxiv.org

Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model

S Shon, H Tang, J Glass - 2018 IEEE Spoken Language …, 2018 - ieeexplore.ieee.org

In this paper, we propose a Convolutional Neural Network (CNN) based speaker recognition
model for extracting robust speaker embeddings. The embedding can be extracted …

Cited by 103 Related articles All 11 versions

[PDF] mdpi.com

A near real-time automatic speaker recognition architecture for voice-based user interface

P Dhakal, P Damacharla, AY Javaid… - Machine learning and …, 2019 - mdpi.com

In this paper, we present a novel pipelined near real-time speaker recognition architecture
that enhances the performance of speaker recognition by exploiting the advantages of …

Cited by 88 Related articles All 4 versions

[PDF] vut.cz

Multilingually trained bottleneck features in spoken language recognition

R Fer, P Matějka, F Grézl, O Plchot, K Veselý… - Computer Speech & …, 2017 - Elsevier

Multilingual training of neural networks has proven to be simple yet effective way to deal with
multilingual training corpora. It allows to use several resources to jointly train a language …

Cited by 86 Related articles All 4 versions