Using the Fisher vector representation for audio-based emotion recognition

G Gosztolya - Acta Polytechnica Hungarica, 2020 - real.mtak.hu
Automatically determining speaker emotions in human speech is a frequently studied task,
where various techniques have been employed over the years. An efficient method is to …

Language independent automatic speech segmentation into phoneme-like units on the base of acoustic distinctive features

G Kiss, D Sztahó, K Vicsi - 2013 IEEE 4th international …, 2013 - ieeexplore.ieee.org
There are special topics in cognitive infocommunications where the processing of
continuous speech is necessary. These topics often require the segmentation of speech …

Posterior-thresholding feature extraction for paralinguistic speech classification

G Gosztolya - Knowledge-Based Systems, 2019 - Elsevier
The standard approach for handling computational paralinguistic speech tasks is to extract
several thousand utterance-level features from the speech excerpts, and use machine …

Ensemble Bag-of-Audio-Words representation improves paralinguistic classification accuracy

G Gosztolya, R Busa-Fekete - IEEE/ACM Transactions on …, 2020 - ieeexplore.ieee.org
A recently introduced, effective feature extraction technique for computational paralinguistics
is that of Bag-of-Audio-Words (BoAW), where we cluster the frame-level training vectors, and …

Speech activity detection and automatic prosodic processing unit segmentation for emotion recognition

D Sztahó, K Vicsi - Intelligent Decision Technologies, 2014 - content.iospress.com
In speech communication emotions play a great role in expressing information. These
emotions are partly given as reactions to our environment, to our partners during a …

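Emotion recognition from Hungarian-language audio recordings using the bag-of-audio-words feature representation [Érzelmek felismerése magyar nyelvű hangfelvételekből akusztikus szózsák jellemzőreprezentáció alkalmazásával]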

M Vetráb, G Gosztolya - 2019 - inf.u-szeged.hu
Emotion recognition is a currently active research area in speech technology. Within
this task, a number of problems have already been formulated; one of these is the individual …

Using the Bag-of-Audio-Words approach for emotion recognition

M Vetráb, G Gosztolya - Acta Universitatis Sapientiae …, 2022 - intapi.sciendo.com
The problem of varying length recordings is a well-known issue in paralinguistics. We
investigated how to resolve this problem using the bag-of-audio-words feature extraction …

Unsupervised phoneme segmentation based on main energy change for Arabic speech

N Lachachi - Journal of Telecommunications and Information …, 2017 - yadda.icm.edu.pl
In this paper, a new method for segmenting speech at the phoneme level is presented. For
this purpose, author uses the short-time Fourier transform of the speech signal. The goal is …

An automatic motion generating system for humanoid robots with emotions

T Takegoshi, M Hagiwara - Transactions of Japan Society of …, 2018 - jstage.jst.go.jp
This paper proposes a motion generating system that determines the parameters to the
reference motion to generate motions for each emotion. In the proposed system, first the …

Thinking about the present and future of the complex speech recognition

K Vicsi - 2012 IEEE 3rd International Conference on Cognitive …, 2012 - ieeexplore.ieee.org
A critical point of the most cognitive info-communication systems is the state of the
development of speech recognition technology. The paper gives a short introduction of the …