Survey on automatic lip-reading in the era of deep learning

A Fernandez-Lopez, FM Sukno - Image and Vision Computing, 2018 - Elsevier
In the last few years, there has been an increasing interest in developing systems for
Automatic Lip-Reading (ALR). Similarly to other computer vision applications, methods …

Deep learning-based automated lip-reading: A survey

S Fenghour, D Chen, K Guo, B Li, P Xiao - IEEE Access, 2021 - ieeexplore.ieee.org
A survey on automated lip-reading approaches is presented in this paper with the main
focus being on deep learning related methodologies which have proven to be more fruitful …

Lip reading sentences using deep learning with only visual cues

S Fenghour, D Chen, K Guo, P Xiao - IEEE Access, 2020 - ieeexplore.ieee.org
In this paper, a neural network-based lip reading system is proposed. The system is lexicon-
free and uses purely visual cues. With only a limited number of visemes as classes to …

Lipformer: learning to lipread unseen speakers based on visual-landmark transformers

F Xue, Y Li, D Liu, Y Xie, L Wu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Lipreading refers to understanding and further translating the speech of a video speaker into
textual outputs. State-of-the-art lipreading methods excel in interpreting overlap speakers, ie …

A survey of research on lipreading technology

M Hao, M Mamut, N Yadikar, A Aysa, K Ubul - IEEE Access, 2020 - ieeexplore.ieee.org
Although automatic speech recognition (ASR) technology is mature, there are still some
unsolved problems, such as how to accurately identify what the speaker is saying in a noisy …

Review on research progress of machine lip reading

G Pu, H Wang - The Visual Computer, 2023 - Springer
Abstract Machine lip reading recognizes text content through the speaker's lip motion
information. Lip reading has significant research and application value. With the continuous …

CATNet: Cross-modal fusion for audio–visual speech recognition

X Wang, J Mi, B Li, Y Zhao, J Meng - Pattern Recognition Letters, 2024 - Elsevier
Automatic speech recognition (ASR) is a typical pattern recognition technology that converts
human speeches into texts. With the aid of advanced deep learning models, the …

[HTML][HTML] Research on a Lip Reading Algorithm Based on Efficient-GhostNet

G Zhang, Y Lu - Electronics, 2023 - mdpi.com
Lip reading technology refers to the analysis of the visual information of the speaker's mouth
movements to recognize the content of the speaker's speech. As one of the important …

A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation

L Liu, L Gao, W Lei, F Ma, X Lin, J Wang - arXiv preprint arXiv:2308.08849, 2023 - arxiv.org
Body language (BL) refers to the non-verbal communication expressed through physical
movements, gestures, facial expressions, and postures. It is a form of communication that …

Multi-Scale Hybrid Fusion Network for Mandarin Audio-Visual Speech Recognition

J Wang, Z Guo, C Yang, X Li… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Compared to feature or decision fusion, hybrid fusion can beneficially improve audio-visual
speech recognition accuracy. Existing works are mainly prone to design the multi-modality …