A review of recent advances in visual speech decoding
Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …
Sub-word level lip reading with visual attention
KR Prajwal, T Afouras… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
The goal of this paper is to learn strong lip reading models that can recognise speech in
silent videos. Most prior works deal with the open-set visual speech recognition problem by …
Detecting offensive language in social media to protect adolescent online safety
Since the textual contents on online social media are highly unstructured, informal, and often
misspelled, existing research on message-level offensive language detection cannot …
The TORGO database of acoustic and articulatory speech from speakers with dysarthria
F Rudzicz, AK Namasivayam, T Wolff - Language resources and …, 2012 - Springer
This paper describes the acquisition of a new database of dysarthric speech in terms of
aligned acoustics and articulatory data. This database currently includes data from seven …
Context-aware driver behavior detection system in intelligent transportation systems
S Al-Sultan, AH Al-Bayatti… - IEEE transactions on …, 2013 - ieeexplore.ieee.org
Vehicular ad hoc networks (VANETs) have emerged as an application of mobile ad hoc
networks (MANETs), which use dedicated short-range communication (DSRC) to allow …
Subword modeling for automatic speech recognition: Past, present, and emerging approaches
K Livescu, E Fosler-Lussier… - IEEE Signal Processing …, 2012 - ieeexplore.ieee.org
Modern automatic speech recognition systems handle large vocabularies of words, making
it infeasible to collect enough repetitions of each word to train individual word models …
Adapt-and-adjust: Overcoming the long-tail problem of multilingual speech recognition
One crucial challenge of real-world multilingual speech recognition is the long-tailed
distribution problem, where some resource-rich languages like English have abundant …
Selftalk: A self-supervised commutative training diagram to comprehend 3d talking faces
Speech-driven 3D face animation technique, extending its applications to various
multimedia fields. Previous research has generated promising realistic lip movements and …
Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains
Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised
learning of acoustic features when a second view (e.g., articulatory measurements) is …
Perception and hierarchical dynamics
In this paper, we suggest that perception could be modeled by assuming that sensory input
is generated by a hierarchy of attractors in a dynamic system. We describe a mathematical …