Multimodal machine learning: A survey and taxonomy
T Baltrušaitis, C Ahuja… - IEEE transactions on …, 2018 - ieeexplore.ieee.org
Our experience of the world is multimodal: we see objects, hear sounds, feel texture, smell
odors, and taste flavors. Modality refers to the way in which something happens or is …
Recent advances in the automatic recognition of audiovisual speech
Visual speech information from the speaker's mouth region has been successfully shown to
improve noise robustness of automatic speech recognizers, thus promising to extend their …
Audio-visual speech recognition using deep learning
Audio-visual speech recognition (AVSR) system is thought to be one of the most promising
solutions for reliable speech recognition, particularly when the audio is corrupted by noise …
Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos
In this work we present a new approach to the field of weakly supervised learning in the
video domain. Our method is relevant to sequence learning problems which can be split up …
Robust automatic speech recognition with missing and unreliable acoustic data
Human speech perception is robust in the face of a wide variety of distortions, both
experimentally applied and naturally occurring. In these conditions, state-of-the-art …
Audio-visual automatic speech recognition: An overview
G Potamianos, C Neti, J Luettin… - Issues in visual and audio …, 2004 - academia.edu
We have made significant progress in automatic speech recognition (ASR) for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …
Adaptive cache compression for high-performance processors
AR Alameldeen, DA Wood - ACM SIGARCH Computer Architecture …, 2004 - dl.acm.org
Modern processors use two or more levels of cache memories to bridge the rising disparity
between processor and memory speeds. Compression can improve cache performance by …
Automatic facial expression recognition using facial animation parameters and multistream HMMs
PS Aleksic, AK Katsaggelos - IEEE Transactions on Information …, 2006 - ieeexplore.ieee.org
The performance of an automatic facial expression recognition system can be significantly
improved by modeling the reliability of different streams of facial expression information …
Audio visual speech recognition
We have made significant progress in automatic speech recognition ASR for well-defined
applications like dictation and medium vocabulary transaction processing tasks in relatively …
Robust speaker recognition in noisy conditions
This paper investigates the problem of speaker identification and verification in noisy
conditions, assuming that speech signals are corrupted by environmental noise, but …