Annotation Confidence vs. Training Sample Size: Trade-Off Solution for Partially-Continuous...

[HTML] A review of recent advances on deep learning methods for audio-visual speech recognition

D Ivanko, D Ryumin, A Karpov - Mathematics, 2023 - mdpi.com

This article provides a detailed review of recent advances in audio-visual speech
recognition (AVSR) methods that have been developed over the last decade (2013–2023) …

Cited by 9 · Related articles · All 5 versions

[HTML] Audio-visual speech and gesture recognition by sensors of mobile devices

D Ryumin, D Ivanko, E Ryumina - Sensors, 2023 - mdpi.com

Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable
speech recognition, particularly when audio is corrupted by noise. Additional visual …

Cited by 44 · Related articles · All 9 versions

In search of a robust facial expressions recognition model: A large-scale visual cross-corpus study

E Ryumina, D Dresvyanskiy, A Karpov - Neurocomputing, 2022 - Elsevier

Many researchers have been seeking robust emotion recognition system for already last two
decades. It would advance computer systems to a new level of interaction, providing much …

Cited by 44 · Related articles · All 4 versions

Zero-Shot Audio-Visual Compound Expression Recognition Method based on Emotion Probability Fusion

E Ryumina, M Markitantov, D Ryumin… - Proceedings of the …, 2024 - openaccess.thecvf.com

A Compound Expression Recognition (CER) as a subfield of affective computing is
a novel task in intelligent human-computer interaction and multimodal user interfaces. We …

Cited by 1 · Related articles

[HTML] Multi-corpus learning for audio–visual emotions and sentiment recognition

E Ryumina, M Markitantov, A Karpov - Mathematics, 2023 - mdpi.com

Recognition of emotions and sentiment (affective states) from human audio–visual
information is widely used in healthcare, education, entertainment, and other fields; …

Cited by 3 · Related articles · All 6 versions

OCEAN-AI framework with EmoFormer cross-hemiface attention approach for personality traits assessment

E Ryumina, M Markitantov, D Ryumin… - Expert Systems with …, 2024 - Elsevier

Psychological and neurological studies earlier suggested that a personality type can be
determined by the whole face as well as by its sides. This article discusses novel research …

Cited by 3 · Related articles · All 2 versions

[PDF] Biometric Russian Audio-Visual Extended MASKS (BRAVE-MASKS) Corpus: Multimodal Mask Type Recognition Task.

M Markitantov, E Ryumina, D Ryumin, A Karpov - INTERSPEECH, 2022 - isca-archive.org

In this paper, we present a new multimodal corpus called Biometric Russian Audio-Visual
Extended MASKS (BRAVE-MASKS), which is designed to analyze voice and facial …

Cited by 9 · Related articles · All 4 versions

Personalized frame-level facial expression recognition in video

AV Savchenko - International Conference on Pattern Recognition and …, 2022 - Springer

In this paper, the personalization of the video-based frame-level facial expression
recognition is studied for multi-user systems if a small amount of short videos are available …

Cited by 9 · Related articles · All 3 versions

Audio-visual continuous recognition of emotional state in a multi-user system based on personalized representation of facial expressions and voice

AV Savchenko, LV Savchenko - Pattern Recognition and Image Analysis, 2022 - Springer

This paper is devoted to tracking dynamics of psycho-emotional state based on analysis of
the user's facial video and voice. We propose a novel technology with personalized acoustic …

Cited by 5 · Related articles · All 3 versions

[PDF] Hybrid Dataset for Speech Emotion Recognition in Russian Language

V Kondratenko, N Karpov, A Sokolov… - Proc. INTERSPEECH …, 2023 - isca-archive.org

We present a new data set for speech emotion recognition (SER) tasks called Dusha. The
corpus contains approximately 350 hours of data, more than 300 000 audio recordings of …

Cited by 1 · Related articles · All 2 versions