Artificial intelligence for multimodal data integration in oncology
In oncology, the patient state is characterized by a whole spectrum of modalities, ranging
from radiology, histology, and genomics to electronic health records. Current artificial …
An overview of deep-learning-based audio-visual speech enhancement and separation
Speech enhancement and speech separation are two related tasks, whose purpose is to
extract either one or more target speech signals, respectively, from a mixture of sounds …
Revisiting skeleton-based action recognition
Human skeleton, as a compact representation of human action, has received increasing
attention in recent years. Many skeleton-based action recognition methods adopt GCNs to …
A survey on deep multimodal learning for computer vision: advances, trends, applications, and datasets
K Bayoudh, R Knani, F Hamdaoui, A Mtibaa - The Visual Computer, 2022 - Springer
The research progress in multimodal learning has grown rapidly over the last decade in
several areas, especially in computer vision. The growing potential of multimodal data …
STAR-Transformer: a spatio-temporal cross attention transformer for human action recognition
In action recognition, although the combination of spatio-temporal videos and skeleton
features can improve the recognition performance, a separate model and balancing feature …
Expansion-squeeze-excitation fusion network for elderly activity recognition
This work focuses on the task of elderly activity recognition, which is a challenging task due
to the existence of individual actions and human-object interactions in elderly activities …
Delivering arbitrary-modal semantic segmentation
Multimodal fusion can make semantic segmentation more robust. However, fusing an
arbitrary number of modalities remains underexplored. To delve into this problem, we create …
Computer vision, IoT and data fusion for crop disease detection using machine learning: A survey and ongoing research
Crop diseases constitute a serious issue in agriculture, affecting both quality and quantity of
agriculture production. Disease control has been a research object in many scientific and …
Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks
We hypothesize that due to the greedy nature of learning in multi-modal deep neural
networks, these models tend to rely on just one modality while under-fitting the other …
Provable dynamic fusion for low-quality multimodal data
The inherent challenge of multimodal fusion is to precisely capture the cross-modal
correlation and flexibly conduct cross-modal interaction. To fully release the value of each …