Deep learning for robust feature generation in audiovisual emotion recognition

A comprehensive survey on support vector machine classification: Applications, challenges and trends

J Cervantes, F Garcia-Lamont, L Rodríguez-Mazahua… - Neurocomputing, 2020 - Elsevier

In recent years, an enormous amount of research has been carried out on support vector
machines (SVMs) and their application in several fields of science. SVMs are one of the …

Cited by 1478 · Related articles · All 3 versions

Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review

J Zhang, Z Yin, P Chen, S Nichele - Information Fusion, 2020 - Elsevier

In recent years, the rapid advances in machine learning (ML) and information fusion has
made it possible to endow machines/computers with the ability of emotion understanding …

Cited by 581 · Related articles · All 4 versions

Attention bottlenecks for multimodal fusion

A Nagrani, S Yang, A Arnab, A Jansen… - Advances in neural …, 2021 - proceedings.neurips.cc

Humans perceive the world by concurrently processing and fusing high-dimensional inputs
from multiple modalities such as vision and audio. Machine perception models, in stark …

Cited by 519 · Related articles · All 8 versions

Robust speech emotion recognition using CNN+LSTM based on stochastic fractal search optimization algorithm

AA Abdelhamid, ESM El-Kenawy, B Alotaibi… - IEEE …, 2022 - ieeexplore.ieee.org

One of the main challenges facing the current approaches of speech emotion recognition is
the lack of a dataset large enough to train the currently available deep learning models …

Cited by 111 · Related articles · All 7 versions

Speech emotion recognition with deep convolutional neural networks

D Issa, MF Demirci, A Yazici - Biomedical Signal Processing and Control, 2020 - Elsevier

The speech emotion recognition (or, classification) is one of the most challenging topics in
data science. In this work, we introduce a new architecture, which extracts mel-frequency …

Cited by 458 · Related articles · All 5 versions

Speech recognition using deep neural networks: A systematic review

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE Access, 2019 - ieeexplore.ieee.org

Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Cited by 1185 · Related articles · All 9 versions

Speech emotion recognition using deep 1D & 2D CNN LSTM networks

J Zhao, X Mao, L Chen - Biomedical Signal Processing and Control, 2019 - Elsevier

We aimed at learning deep emotion features to recognize speech emotion. Two
convolutional neural network and long short-term memory (CNN LSTM) networks, one 1D …

Cited by 1001 · Related articles · All 3 versions

MAViL: Masked audio-video learners

PY Huang, V Sharma, H Xu, C Ryali… - Advances in …, 2024 - proceedings.neurips.cc

We present Masked Audio-Video Learners (MAViL) to learn audio-visual
representations with three complementary forms of self-supervision: (1) reconstructing …

Cited by 41 · Related articles · All 6 versions

Deep learning for human affect recognition: Insights and new developments

PV Rouast, MTP Adam, R Chiong - IEEE Transactions on …, 2019 - ieeexplore.ieee.org

Automatic human affect recognition is a key step towards more natural human-computer
interaction. Recent trends include recognition in the wild using a fusion of audiovisual and …

Cited by 599 · Related articles · All 6 versions

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org

Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …

Cited by 441 · Related articles · All 4 versions