Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN

Authors

Zengwei Yao, Zihao Wang, Weihuang Liu, Yaqian Liu, Jiahui Pan

Publication date

2020/6/1

Journal

Speech Communication

Volume

120

Pages

11-19

Publisher

North-Holland

Description

Speech emotion recognition plays an increasingly important role in emotional computing and remains a challenging task due to its complexity. In this study, we developed a framework integrating three distinct classifiers: a deep neural network (DNN), a convolutional neural network (CNN), and a recurrent neural network (RNN). The framework was used for categorical recognition of four discrete emotions (angry, happy, neutral, and sad). Frame-level low-level descriptors (LLDs), segment-level mel-spectrograms (MS), and utterance-level outputs of high-level statistical functions (HSFs) on LLDs were passed to the RNN, CNN, and DNN, respectively, yielding three individual models: LLD-RNN, MS-CNN, and HSF-DNN. In the MS-CNN and LLD-RNN models, an attention-based weighted-pooling method was used to aggregate the CNN and RNN outputs. To effectively utilize the …
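The attention-based weighted pooling mentioned in the description can be illustrated with a short sketch. The abstract does not give the paper's exact formulation, so this assumes a common variant: each frame output is scored by a dot product with a learned vector, the scores are normalized with a softmax over time, and the frames are summed with those weights. The function name `attention_weighted_pooling`, the vector `u`, and the toy inputs are hypothetical and chosen only for illustration.

```python
import math


def attention_weighted_pooling(frames, u):
    """Pool T frame vectors into one utterance-level vector.

    frames: list of T vectors (each a list of D floats), standing in for
            per-frame RNN outputs or per-segment CNN outputs.
    u:      hypothetical learned attention vector of length D (an assumption;
            the abstract does not specify the scoring function).
    """
    # Score each frame by its dot product with the attention vector.
    scores = [sum(h_d * u_d for h_d, u_d in zip(h, u)) for h in frames]

    # Softmax over the time axis; subtracting the max keeps exp() stable.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frame vectors, one output value per dimension.
    dim = len(frames[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, frames)) for d in range(dim)]
    return pooled, weights


# Toy input: three 2-dimensional frame outputs.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
u = [0.5, 0.5]
pooled, weights = attention_weighted_pooling(frames, u)
```

With an all-zero `u` every frame gets the same weight and the result reduces to plain mean pooling, which is one way to see what the attention weights add: frames that score higher against `u` contribute more to the utterance-level representation.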

Total citations

Cited by 138

Citations per year: 2020: 6, 2021: 32, 2022: 33, 2023: 54, 2024: 13
