Survey on automatic lip-reading in the era of deep learning
A Fernandez-Lopez, FM Sukno - Image and Vision Computing, 2018 - Elsevier
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …
Deep learning-based automated lip-reading: A survey
A survey on automated lip-reading approaches is presented in this paper with the main
focus being on deep learning related methodologies which have proven to be more fruitful …
Lip reading sentences using deep learning with only visual cues
In this paper, a neural network-based lip reading system is proposed. The system is lexicon-
free and uses purely visual cues. With only a limited number of visemes as classes to …
Lipformer: learning to lipread unseen speakers based on visual-landmark transformers
F Xue, Y Li, D Liu, Y Xie, L Wu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Lipreading refers to understanding and further translating the speech of a video speaker into
textual outputs. State-of-the-art lipreading methods excel in interpreting overlap speakers, i.e. …
A survey of research on lipreading technology
M Hao, M Mamut, N Yadikar, A Aysa, K Ubul - IEEE Access, 2020 - ieeexplore.ieee.org
Although automatic speech recognition (ASR) technology is mature, there are still some
unsolved problems, such as how to accurately identify what the speaker is saying in a noisy …
Review on research progress of machine lip reading
G Pu, H Wang - The Visual Computer, 2023 - Springer
Machine lip reading recognizes text content through the speaker's lip motion
information. Lip reading has significant research and application value. With the continuous …
CATNet: Cross-modal fusion for audio–visual speech recognition
X Wang, J Mi, B Li, Y Zhao, J Meng - Pattern Recognition Letters, 2024 - Elsevier
Automatic speech recognition (ASR) is a typical pattern recognition technology that converts
human speeches into texts. With the aid of advanced deep learning models, the …
Research on a Lip Reading Algorithm Based on Efficient-GhostNet
G Zhang, Y Lu - Electronics, 2023 - mdpi.com
Lip reading technology refers to the analysis of the visual information of the speaker's mouth
movements to recognize the content of the speaker's speech. As one of the important …
A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation
Body language (BL) refers to the non-verbal communication expressed through physical
movements, gestures, facial expressions, and postures. It is a form of communication that …
Multi-Scale Hybrid Fusion Network for Mandarin Audio-Visual Speech Recognition
J Wang, Z Guo, C Yang, X Li… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Compared to feature or decision fusion, hybrid fusion can beneficially improve audio-visual
speech recognition accuracy. Existing works are mainly prone to design the multi-modality …