Multimodal machine learning: A survey and taxonomy
T Baltrušaitis, C Ahuja… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Our experience of the world is multimodal: we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …
A review of affective computing: From unimodal analysis to multimodal fusion
Affective computing is an emerging interdisciplinary research field bringing together
researchers and practitioners from various fields, ranging from artificial intelligence, natural …
Trends in audio signal feature extraction methods
G Sharma, K Umapathy, S Krishnan - Applied Acoustics, 2020 - Elsevier
Audio signal processing algorithms generally involve analysis of signal, extracting its
properties, predicting its behaviour, recognizing if any pattern is present in the signal, and …
CTNet: Conversational transformer network for emotion recognition
Emotion recognition in conversation is a crucial topic for its widespread applications in the
field of human-computer interactions. Unlike vanilla emotion recognition of individual …
Tensor fusion network for multimodal sentiment analysis
Multimodal sentiment analysis is an increasingly popular research area, which extends the
conventional language-based definition of sentiment analysis to a multimodal setup where …
Multimodal co-learning: Challenges, applications with datasets, recent advances and future directions
Multimodal deep learning systems that employ multiple modalities like text, image, audio,
video, etc., are showing better performance than individual modalities (i.e., unimodal) …
Automatic analysis of facial affect: A survey of registration, representation, and recognition
E Sariyanidi, H Gunes… - IEEE transactions on …, 2014 - ieeexplore.ieee.org
Automatic affect analysis has attracted great interest in various contexts including the
recognition of action units and basic or non-basic emotions. In spite of major efforts, there …
A review and meta-analysis of multimodal affect detection systems
SK D'mello, J Kory - ACM computing surveys (CSUR), 2015 - dl.acm.org
Affect detection is an important pattern recognition problem that has inspired researchers
from several areas. The field is in need of a systematic review due to the recent influx of …
Learning affective features with a hybrid deep model for audio–visual emotion recognition
Emotion recognition is challenging due to the emotional gap between emotions and audio-
visual features. Motivated by the powerful feature learning ability of deep neural networks …
A survey of speech emotion recognition in natural environment
While speech emotion recognition (SER) has been an active research field for the last
three decades, the techniques that deal with the natural environment have only emerged in …