Sparse coding for speech recognition

Authors

Garimella SVS Sivaram, Sridhar Krishna Nemala, Mounya Elhilali, Trac D Tran, Hynek Hermansky

Publication date

2010/3/14

Conference

2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Pages

4346-4349

Publisher

IEEE

Description

This paper proposes a novel feature extraction technique for speech recognition based on the principles of sparse coding. The idea is to express a spectro-temporal pattern of speech as a linear combination of an overcomplete set of basis functions such that the weights of the linear combination are sparse. These weights (features) are subsequently used for acoustic modeling. We learn a set of overcomplete basis functions (dictionary) from the training set by adopting a previously proposed algorithm which iteratively minimizes the reconstruction error and maximizes the sparsity of weights. Furthermore, features are derived using the learned basis functions by applying the well established principles of compressive sensing. Phoneme recognition experiments show that the proposed features outperform the conventional features in both clean and noisy conditions.
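The core idea in the description, writing a pattern as a sparse linear combination of an overcomplete set of basis functions, can be illustrated with a minimal sketch. This is not the paper's dictionary-learning or feature-extraction procedure. It assumes a small hand-picked dictionary `D`, an L1-penalized least-squares objective, and the iterative shrinkage-thresholding algorithm (ISTA) as the solver; the names `sparse_code`, `soft_threshold`, `lam`, and `n_iter` are illustrative and do not come from the paper.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, D, lam=0.1, n_iter=500):
    """Approximately minimize 0.5 * ||x - D w||^2 + lam * ||w||_1 with ISTA.

    x: list of n floats (the pattern to encode).
    D: list of n rows with k floats each; columns are basis functions
       (k > n makes the dictionary overcomplete).
    """
    n, k = len(D), len(D[0])
    # Assumed step size: 1 / (squared Frobenius norm of D), which never
    # exceeds 1 / (Lipschitz constant of the gradient), so ISTA is stable.
    step = 1.0 / sum(D[i][j] ** 2 for i in range(n) for j in range(k))
    w = [0.0] * k
    for _ in range(n_iter):
        # Residual r = D w - x
        r = [sum(D[i][j] * w[j] for j in range(k)) - x[i] for i in range(n)]
        # Gradient of the quadratic term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n)) for j in range(k)]
        # Gradient step followed by shrinkage, which produces exact zeros
        w = [soft_threshold(w[j] - step * g[j], step * lam) for j in range(k)]
    return w


# Toy overcomplete dictionary: 2-dimensional patterns, 3 unit-norm basis functions.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
x = [0.6, 0.8]  # equal to the third basis function
w = sparse_code(x, D)
print([round(v, 4) for v in w])
```

With this toy input, the optimality conditions of the assumed objective place all the weight on the third basis function (about 0.9 for `lam=0.1`) and leave the other two weights at zero, which is the kind of sparse weight vector the description refers to as features.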

Total citations

Cited by 94

[Bar chart: citations per year, 2010 to 2024]
