Feature Learning in Dynamic Environments: Modeling the Acoustic Structure of Musical Emotion.

Speech emotion recognition using deep 1D & 2D CNN LSTM networks

J Zhao, X Mao, L Chen - Biomedical signal processing and control, 2019 - Elsevier

We aimed at learning deep emotion features to recognize speech emotion. Two
convolutional neural network and long short-term memory (CNN LSTM) networks, one 1D …

Cited by 1043 Related articles All 3 versions

Learning deep features to recognise speech emotion using merged deep CNN

J Zhao, X Mao, L Chen - IET Signal Processing, 2018 - Wiley Online Library

This study aims at learning deep features from different data to recognise speech emotion.
The authors designed a merged convolutional neural network (CNN), which had two …

Cited by 114 Related articles All 3 versions

Exploiting the potentialities of features for speech emotion recognition

D Li, Y Zhou, Z Wang, D Gao - Information Sciences, 2021 - Elsevier

In recent years, studies on speech signals have increasingly paid attention to emotional
information. The most challenging aspect in speech emotion recognition (SER) is choosing …

Cited by 46 Related articles

Discriminatively trained recurrent neural networks for continuous dimensional emotion recognition from audio

F Weninger, F Ringeval, E Marchi, B Schuller - 2016 - opus.bibliothek.uni-augsburg.de

Continuous dimensional emotion recognition from audio is a sequential regression problem,
where the goal is to maximize correlation between sequences of regression outputs and …

Cited by 78 Related articles All 7 versions

On-line continuous-time music mood regression with deep recurrent neural networks

F Weninger, F Eyben, B Schuller - 2014 IEEE international …, 2014 - ieeexplore.ieee.org

This paper proposes a novel machine learning approach for the task of on-line
continuous-time music mood regression, i.e., low-latency prediction of the time-varying arousal and …

Cited by 93 Related articles All 9 versions

Musical audio synthesis using autoencoding neural nets

A Sarroff, MA Casey - … of the International Society for Music …, 2014 - research.gold.ac.uk

With an optimal network topology and tuning of hyperparameters, artificial neural networks
(ANNs) may be trained to learn a mapping from low level audio features to one or more …

Cited by 59 Related articles All 11 versions

A systematic evaluation of the bag-of-frames representation for music information retrieval

L Su, CCM Yeh, JY Liu, JC Wang… - IEEE Transactions on …, 2014 - ieeexplore.ieee.org

There has been an increasing attention on learning feature representations from the
complex, high-dimensional audio data applied in various music information retrieval (MIR) …

Cited by 67 Related articles All 7 versions

Understanding affective content of music videos through learned representations

E Acar, F Hopfgartner, S Albayrak - … MMM 2014, Dublin, Ireland, January 6 …, 2014 - Springer

In consideration of the ever-growing available multimedia data, annotating multimedia
content automatically with feeling(s) expected to arise in users is a challenging problem. In …

Cited by 62 Related articles All 5 versions

Emotion recognition from thermal infrared images using deep Boltzmann machine

S Wang, M He, Z Gao, S He, Q Ji - Frontiers of Computer Science, 2014 - Springer

Facial expression and emotion recognition from thermal infrared images has attracted more
and more attentions in recent years. However, the features adopted in current work are …

Cited by 53 Related articles All 7 versions

Recognizing semi-natural and spontaneous speech emotions using deep neural networks

A Amjad, L Khan, N Ashraf, MB Mahmood… - IEEE …, 2022 - ieeexplore.ieee.org

We needed to find deep emotional features to identify emotions from audio signals.
Identifying emotions in spontaneous speech is a novel and challenging subject of research …

Cited by 13 Related articles All 6 versions