A review of recent advances in visual speech decoding

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and vision computing, 2014 - Elsevier
Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Sub-word level lip reading with visual attention

KR Prajwal, T Afouras… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
The goal of this paper is to learn strong lip reading models that can recognise speech in
silent videos. Most prior works deal with the open-set visual speech recognition problem by …

Detecting offensive language in social media to protect adolescent online safety

Y Chen, Y Zhou, S Zhu, H Xu - … security, risk and trust and 2012 …, 2012 - ieeexplore.ieee.org
Since the textual contents on online social media are highly unstructured, informal, and often
misspelled, existing research on message-level offensive language detection cannot …

The TORGO database of acoustic and articulatory speech from speakers with dysarthria

F Rudzicz, AK Namasivayam, T Wolff - Language resources and …, 2012 - Springer
This paper describes the acquisition of a new database of dysarthric speech in terms of
aligned acoustics and articulatory data. This database currently includes data from seven …

Context-aware driver behavior detection system in intelligent transportation systems

S Al-Sultan, AH Al-Bayatti… - IEEE transactions on …, 2013 - ieeexplore.ieee.org
Vehicular ad hoc networks (VANETs) have emerged as an application of mobile ad hoc
networks (MANETs), which use dedicated short-range communication (DSRC) to allow …

Subword modeling for automatic speech recognition: Past, present, and emerging approaches

K Livescu, E Fosler-Lussier… - IEEE Signal Processing …, 2012 - ieeexplore.ieee.org
Modern automatic speech recognition systems handle large vocabularies of words, making
it infeasible to collect enough repetitions of each word to train individual word models …

Adapt-and-adjust: Overcoming the long-tail problem of multilingual speech recognition

GI Winata, G Wang, C Xiong, S Hoi - arXiv preprint arXiv:2012.01687, 2020 - arxiv.org
One crucial challenge of real-world multilingual speech recognition is the long-tailed
distribution problem, where some resource-rich languages like English have abundant …

SelfTalk: A self-supervised commutative training diagram to comprehend 3D talking faces

Z Peng, Y Luo, Y Shi, H Xu, X Zhu, H Liu, J He… - Proceedings of the 31st …, 2023 - dl.acm.org
Speech-driven 3D face animation technique, extending its applications to various
multimedia fields. Previous research has generated promising realistic lip movements and …

Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains

R Arora, K Livescu - 2013 IEEE International Conference on …, 2013 - ieeexplore.ieee.org
Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised
learning of acoustic features when a second view (e.g., articulatory measurements) is …

Perception and hierarchical dynamics

SJ Kiebel, J Daunizeau, KJ Friston - Frontiers in neuroinformatics, 2009 - frontiersin.org
In this paper, we suggest that perception could be modeled by assuming that sensory input
is generated by a hierarchy of attractors in a dynamic system. We describe a mathematical …