CTNet: Conversational transformer network for emotion recognition
Emotion recognition in conversation is a crucial topic for its widespread applications in the
field of human-computer interactions. Unlike vanilla emotion recognition of individual …
Multimodal transformer fusion for continuous emotion recognition
Multimodal fusion increases the performance of emotion recognition because of the
complementarity of different modalities. Compared with decision level and feature level …
Robust multimodal emotion recognition from conversation with transformer-based crossmodality fusion
Decades of scientific research have been conducted on developing and evaluating methods
for automated emotion recognition. With exponentially growing technology, there is a wide …
Head fusion: Improving the accuracy and robustness of speech emotion recognition on the IEMOCAP and RAVDESS dataset
Speech Emotion Recognition (SER) refers to the use of machines to recognize the emotions
of a speaker from his (or her) speech. SER benefits Human-Computer Interaction (HCI). But …
Multimodal multi-task learning for dimensional and continuous emotion recognition
Automatic emotion recognition is a challenging task which can make great impact on
improving natural human computer interactions. In this paper, we present our effort for the …
Do deepfakes feel emotions? A semantic approach to detecting deepfakes via emotional inconsistencies
Recent advances in deep learning and computer vision have spawned a new class of media
forgeries known as deepfakes, which typically consist of artificially generated human faces …
Data augmentation for audio-visual emotion recognition with an efficient multimodal conditional GAN
Audio-visual emotion recognition is the research of identifying human emotional states by
combining the audio modality and the visual modality simultaneously, which plays an …
Multi-modal continuous dimensional emotion recognition using recurrent neural network and self-attention mechanism
Automatic perception and understanding of human emotion or sentiment has a wide range
of applications and has attracted increasing attention nowadays. The Multimodal Sentiment …
Applied Affective Computing
Affective computing is a nascent field situated at the intersection of artificial intelligence with
social and behavioral science. It studies how human emotions are perceived and …
Attention-based multi-modal sentiment analysis and emotion detection in conversation using RNN
The availability of an enormous quantity of multimodal data and its widespread applications,
automatic sentiment analysis and emotion classification in the conversation has become an …