Audio-visual speech and gesture recognition by sensors of mobile devices

D Ryumin, D Ivanko, E Ryumina - Sensors, 2023 - mdpi.com
Audio-visual speech recognition (AVSR) is one of the most promising solutions for reliable
speech recognition, particularly when audio is corrupted by noise. Additional visual …

Multi-corpus learning for audio–visual emotions and sentiment recognition

E Ryumina, M Markitantov, A Karpov - Mathematics, 2023 - mdpi.com
Recognition of emotions and sentiment (affective states) from human audio–visual
information is widely used in healthcare, education, entertainment, and other fields; …

Deep learning mask face recognition with annealing mechanism

WC Cheng, HC Hsiao, LH Li - Applied Sciences, 2023 - mdpi.com
Face recognition (FR) has matured with deep learning, but due to the COVID-19 epidemic,
people need to wear masks outside to reduce the risk of infection, making FR a challenge …

[PDF] Multimodal personality traits assessment (MuPTA) corpus: the impact of spontaneous and read speech

E Ryumina, D Ryumin, M Markitantov… - Proceedings of ISCA …, 2023 - isca-archive.org
Automatic personality traits assessment (PTA) provides high-level, intelligible predictive
inputs for subsequent critical downstream tasks, such as job interview recommendations …

Contrastive Learning-based Chaining-Cluster for Multilingual Voice-Face Association

W Chen, Y Sun, K Xu, Y Dou - arXiv preprint arXiv:2408.02025, 2024 - arxiv.org
The innate correlation between a person's face and voice has recently emerged as a
compelling area of study, especially within the context of multilingual environments. This …

Multimodal prediction of profanity based on speech analysis

I Smirnov, A Laushkina - Procedia Computer Science, 2023 - Elsevier
With increasing multimedia content and social activities, moderation problems increase.
There are different approaches to moderation and automation. However, they have …

Analysis of information and mathematical support for recognizing human affective states

АА Двойникова, МВ Маркитантов… - Информатика и …, 2022 - mathnet.ru
The article presents an analytical review of research in the field of affective
computing. This field is a component of artificial intelligence, and …

[PDF] The MASCFLICHT Corpus: Face Mask Type and Coverage Area Recognition from Speech

A Mallol-Ragolta, N Urbach, S Liu, A Batliner… - researchgate.net
We present a novel speech dataset for face mask type and coverage area recognition
collected with a smartphone. The dataset contains 2 h 27 m 55 s of data from 30 German …

Founders: Ministry of Science and Higher Education of the Russian Federation

МА ЛЕТЕНКОВ, РН ЯКОВЛЕВ… - ИЗВЕСТИЯ ВЫСШИХ …, 2022 - elibrary.ru
To solve the problem of automatic face recognition for people wearing personal
protective equipment such as medical masks, the paper proposes and …

[PDF] МА ЛЕТЕНКОВ, РН ЯКОВЛЕВ, МВ МАРКИТАНТОВ

ДА РЮМИН, АА КАРПОВ - Изв. вузов, 2022 - pribor.ifmo.ru
To solve the problem of automatic face recognition for people wearing personal
protective equipment such as medical masks, the paper proposes and …