A comprehensive survey on support vector machine classification: Applications, challenges and trends

J Cervantes, F Garcia-Lamont, L Rodríguez-Mazahua… - Neurocomputing, 2020 - Elsevier
In recent years, an enormous amount of research has been carried out on support vector
machines (SVMs) and their application in several fields of science. SVMs are one of the …

Emotion recognition using multi-modal data and machine learning techniques: A tutorial and review

J Zhang, Z Yin, P Chen, S Nichele - Information Fusion, 2020 - Elsevier
In recent years, the rapid advances in machine learning (ML) and information fusion have
made it possible to endow machines/computers with the ability of emotion understanding …

Attention bottlenecks for multimodal fusion

A Nagrani, S Yang, A Arnab, A Jansen… - Advances in neural …, 2021 - proceedings.neurips.cc
Humans perceive the world by concurrently processing and fusing high-dimensional inputs
from multiple modalities such as vision and audio. Machine perception models, in stark …

Robust speech emotion recognition using CNN+LSTM based on stochastic fractal search optimization algorithm

AA Abdelhamid, ESM El-Kenawy, B Alotaibi… - IEEE …, 2022 - ieeexplore.ieee.org
One of the main challenges facing the current approaches of speech emotion recognition is
the lack of a dataset large enough to train the currently available deep learning models …

Speech emotion recognition with deep convolutional neural networks

D Issa, MF Demirci, A Yazici - Biomedical Signal Processing and Control, 2020 - Elsevier
The speech emotion recognition (or, classification) is one of the most challenging topics in
data science. In this work, we introduce a new architecture, which extracts mel-frequency …

Speech recognition using deep neural networks: A systematic review

AB Nassif, I Shahin, I Attili, M Azzeh, K Shaalan - IEEE Access, 2019 - ieeexplore.ieee.org
Over the past decades, a tremendous amount of research has been done on the use of
machine learning for speech processing applications, especially speech recognition …

Speech emotion recognition using deep 1D & 2D CNN LSTM networks

J Zhao, X Mao, L Chen - Biomedical Signal Processing and Control, 2019 - Elsevier
We aimed at learning deep emotion features to recognize speech emotion. Two
convolutional neural network and long short-term memory (CNN LSTM) networks, one 1D …

MAViL: Masked audio-video learners

PY Huang, V Sharma, H Xu, C Ryali… - Advances in …, 2024 - proceedings.neurips.cc
We present Masked Audio-Video Learners (MAViL) to learn audio-visual
representations with three complementary forms of self-supervision: (1) reconstructing …

Deep learning for human affect recognition: Insights and new developments

PV Rouast, MTP Adam, R Chiong - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Automatic human affect recognition is a key step towards more natural human-computer
interaction. Recent trends include recognition in the wild using a fusion of audiovisual and …

Deep multimodal representation learning: A survey

W Guo, J Wang, S Wang - IEEE Access, 2019 - ieeexplore.ieee.org
Multimodal representation learning, which aims to narrow the heterogeneity gap among
different modalities, plays an indispensable role in the utilization of ubiquitous multimodal …