Social signal processing: Survey of an emerging domain

A Vinciarelli, M Pantic, H Bourlard - Image and vision computing, 2009 - Elsevier
The ability to understand and manage social signals of a person we are communicating with
is the core of social intelligence. Social intelligence is a facet of human intelligence that has …

Social signal processing: state-of-the-art and future perspectives of an emerging domain

A Vinciarelli, M Pantic, H Bourlard… - Proceedings of the 16th …, 2008 - dl.acm.org
The ability to understand and manage social signals of a person we are communicating with
is the core of social intelligence. Social intelligence is a facet of human intelligence that has …

Spontaneous speech: how people really talk and why engineers should care.

E Shriberg - INTERSPEECH, 2005 - Citeseer
Spontaneous conversation is optimized for human-human communication, but differs in
some important ways from the types of speech for which human language technology is …

Noisy BiLSTM-Based Models for Disfluency Detection.

N Bach, F Huang - Interspeech, 2019 - isca-archive.org
This paper describes BiLSTM-based models for disfluency detection in speech transcripts
using residual BiLSTM blocks, self-attention, and a noisy training approach. Our best model …

A survey of automatic Arabic diacritization techniques

AM Azmi, RS Almajed - Natural Language Engineering, 2015 - cambridge.org
Modern Standard Arabic texts are typically written without diacritical markings. The
diacritics are important to clarify the sense and meaning of words. Lack of these markings …

Enhancing ASR for stuttered speech with limited data using detect and pass

O Shonibare, X Tong, V Ravichandran - arXiv preprint arXiv:2202.05396, 2022 - arxiv.org
It is estimated that around 70 million people worldwide are affected by a speech disorder
called stuttering. With recent advances in Automatic Speech Recognition (ASR), voice …

Improved robustness to disfluencies in RNN-transducer based speech recognition

V Mendelev, T Raissi, G Camporese… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR) based on Recurrent Neural Network Transducers
(RNN-T) is gaining interest in the speech community. We investigate data selection and …

Sequence labeling to detect stuttering events in read speech

S Alharbi, M Hasan, AJH Simons, S Brumfitt… - Computer Speech & …, 2020 - Elsevier
Stuttering is a speech disorder that, if treated during childhood, may be prevented from
persisting into adolescence. A clinician must first determine the severity of stuttering …

Two-stage hidden Markov model in gesture recognition for human robot interaction

N Nguyen-Duc-Thanh, S Lee… - International Journal of …, 2012 - journals.sagepub.com
Hidden Markov Model (HMM) is very rich in mathematical structure and hence can form the
theoretical basis for use in a wide range of applications including gesture representation …

Automatic evaluation of reading aloud performance in children

J Proença, C Lopes, M Tjalve, A Stolcke… - Speech …, 2017 - Elsevier
Evaluating children's reading aloud proficiency is typically a task done by teachers on an
individual basis, where reading time and wrong words are marked manually. A …