Multimodal machine learning: A survey and taxonomy

T Baltrušaitis, C Ahuja… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Our experience of the world is multimodal: we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …

Recent advances in the automatic recognition of audiovisual speech

G Potamianos, C Neti, G Gravier, A Garg… - Proceedings of the …, 2003 - ieeexplore.ieee.org
Visual speech information from the speaker's mouth region has been successfully shown to
improve noise robustness of automatic speech recognizers, thus promising to extend their …

Audio-visual speech recognition using deep learning

K Noda, Y Yamaguchi, K Nakadai, HG Okuno… - Applied intelligence, 2015 - Springer
Audio-visual speech recognition (AVSR) system is thought to be one of the most promising
solutions for reliable speech recognition, particularly when the audio is corrupted by noise …

Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos

O Koller, NC Camgoz, H Ney… - IEEE transactions on …, 2019 - ieeexplore.ieee.org
In this work we present a new approach to the field of weakly supervised learning in the
video domain. Our method is relevant to sequence learning problems which can be split up …

Robust automatic speech recognition with missing and unreliable acoustic data

M Cooke, P Green, L Josifovski, A Vizinho - Speech communication, 2001 - Elsevier
Human speech perception is robust in the face of a wide variety of distortions, both
experimentally applied and naturally occurring. In these conditions, state-of-the-art …

Audio-visual automatic speech recognition: An overview

G Potamianos, C Neti, J Luettin… - Issues in visual and audio …, 2004 - academia.edu
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Adaptive cache compression for high-performance processors

AR Alameldeen, DA Wood - ACM SIGARCH Computer Architecture …, 2004 - dl.acm.org
Modern processors use two or more levels of cache memories to bridge the rising disparity
between processor and memory speeds. Compression can improve cache performance by …

Automatic facial expression recognition using facial animation parameters and multistream HMMs

PS Aleksic, AK Katsaggelos - IEEE Transactions on Information …, 2006 - ieeexplore.ieee.org
The performance of an automatic facial expression recognition system can be significantly
improved by modeling the reliability of different streams of facial expression information …

Audio visual speech recognition

C Neti, G Potamianos, J Luettin, I Matthews, H Glotin… - 2000 - infoscience.epfl.ch
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …

Robust speaker recognition in noisy conditions

J Ming, TJ Hazen, JR Glass… - IEEE Transactions on …, 2007 - ieeexplore.ieee.org
This paper investigates the problem of speaker identification and verification in noisy
conditions, assuming that speech signals are corrupted by environmental noise, but …