Articulatory feature-based methods for acoustic and audio-visual speech recognition: Summary...

Z Zhou, G Zhao, X Hong, M Pietikäinen - Image and vision computing, 2014 - Elsevier

Visual speech information plays an important role in automatic speech recognition (ASR)
especially when audio is corrupted or even inaccessible. Despite the success of audio …

Cited by 227 Related articles All 4 versions

[PDF] thecvf.com

Sub-word level lip reading with visual attention

KR Prajwal, T Afouras… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

The goal of this paper is to learn strong lip reading models that can recognise speech in
silent videos. Most prior works deal with the open-set visual speech recognition problem by …

Cited by 88 Related articles All 12 versions

[PDF] psu.edu

Detecting offensive language in social media to protect adolescent online safety

Y Chen, Y Zhou, S Zhu, H Xu - … security, risk and trust and 2012 …, 2012 - ieeexplore.ieee.org

Since the textual contents on online social media are highly unstructured, informal, and often
misspelled, existing research on message-level offensive language detection cannot …

Cited by 852 Related articles All 10 versions

[PDF] researchgate.net

The TORGO database of acoustic and articulatory speech from speakers with dysarthria

F Rudzicz, AK Namasivayam, T Wolff - Language resources and …, 2012 - Springer

This paper describes the acquisition of a new database of dysarthric speech in terms of
aligned acoustics and articulatory data. This database currently includes data from seven …

Cited by 345 Related articles All 18 versions

[PDF] researchgate.net

Context-aware driver behavior detection system in intelligent transportation systems

S Al-Sultan, AH Al-Bayatti… - IEEE transactions on …, 2013 - ieeexplore.ieee.org

Vehicular ad hoc networks (VANETs) have emerged as an application of mobile ad hoc
networks (MANETs), which use dedicated short-range communication (DSRC) to allow …

Cited by 260 Related articles All 4 versions

Subword modeling for automatic speech recognition: Past, present, and emerging approaches

K Livescu, E Fosler-Lussier… - IEEE Signal Processing …, 2012 - ieeexplore.ieee.org

Modern automatic speech recognition systems handle large vocabularies of words, making
it infeasible to collect enough repetitions of each word to train individual word models …

Cited by 60 Related articles All 3 versions

[PDF] arxiv.org

Adapt-and-adjust: Overcoming the long-tail problem of multilingual speech recognition

GI Winata, G Wang, C Xiong, S Hoi - arXiv preprint arXiv:2012.01687, 2020 - arxiv.org

One crucial challenge of real-world multilingual speech recognition is the long-tailed
distribution problem, where some resource-rich languages like English have abundant …

Cited by 51 Related articles All 7 versions

[PDF] arxiv.org

Selftalk: A self-supervised commutative training diagram to comprehend 3d talking faces

Z Peng, Y Luo, Y Shi, H Xu, X Zhu, H Liu, J He… - Proceedings of the 31st …, 2023 - dl.acm.org

Speech-driven 3D face animation technique, extending its applications to various
multimedia fields. Previous research has generated promising realistic lip movements and …

Cited by 15 Related articles All 3 versions

[PDF] psu.edu

Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains

R Arora, K Livescu - 2013 IEEE International Conference on …, 2013 - ieeexplore.ieee.org

Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised
learning of acoustic features when a second view (eg, articulatory measurements) is …

Cited by 103 Related articles All 13 versions

[HTML] frontiersin.org

[HTML] Perception and hierarchical dynamics

SJ Kiebel, J Daunizeau, KJ Friston - Frontiers in neuroinformatics, 2009 - frontiersin.org

In this paper, we suggest that perception could be modeled by assuming that sensory input
is generated by a hierarchy of attractors in a dynamic system. We describe a mathematical …

Cited by 118 Related articles All 13 versions