MultiBench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov
Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Quantifying & modeling multimodal interactions: An information decomposition framework

PP Liang, Y Cheng, X Fan, CK Ling… - Advances in …, 2024 - proceedings.neurips.cc
The recent explosion of interest in multimodal applications has resulted in a wide selection
of datasets and methods for representing and integrating information from different …

Review of bioinspired vision-tactile fusion perception (VTFP): From humans to humanoids

B He, Q Miao, Y Zhou, Z Wang, G Li… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Humanoid robots are designed and expected to resemble humans in structure and
behavior, showing increasing application potential in various fields. Like their biological …

An overview of differentiable particle filters for data-adaptive sequential Bayesian inference

X Chen, Y Li - arXiv preprint arXiv:2302.09639, 2023 - arxiv.org
By approximating posterior distributions with weighted samples, particle filters (PFs) provide
an efficient mechanism for solving non-linear sequential state estimation problems. While …
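Since this snippet summarizes the core mechanism (approximating the posterior with weighted samples), the sketch below shows a plain bootstrap particle filter on a toy nonlinear model for orientation only; the dynamics, noise levels, and particle count are assumed for illustration and this is not the differentiable variants the paper surveys.

```python
# Minimal bootstrap particle filter sketch on an assumed 1-D nonlinear model.
import numpy as np

rng = np.random.default_rng(0)

def transition(x):
    # assumed nonlinear state dynamics
    return 0.5 * x + 25.0 * x / (1.0 + x**2)

def observe(x):
    # assumed observation model
    return x**2 / 20.0

T, N = 50, 500                 # time steps, number of particles
proc_std, obs_std = 1.0, 1.0   # assumed noise levels

# simulate a ground-truth trajectory and noisy observations
x_true = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x_true[t] = transition(x_true[t - 1]) + proc_std * rng.standard_normal()
    y[t] = observe(x_true[t]) + obs_std * rng.standard_normal()

# filter loop: propagate particles, reweight by likelihood, resample
particles = rng.standard_normal(N)
estimates = np.zeros(T)
for t in range(1, T):
    particles = transition(particles) + proc_std * rng.standard_normal(N)
    log_w = -0.5 * ((y[t] - observe(particles)) / obs_std) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    estimates[t] = np.sum(w * particles)   # posterior-mean estimate
    idx = rng.choice(N, size=N, p=w)       # multinomial resampling
    particles = particles[idx]

print("RMSE:", np.sqrt(np.mean((estimates - x_true) ** 2)))
```

The resampling step here is the usual non-differentiable bottleneck that the surveyed differentiable particle filters address with relaxed or learned alternatives.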

See, hear, and feel: Smart sensory fusion for robotic manipulation

H Li, Y Zhang, J Zhu, S Wang, MA Lee, H Xu… - arXiv preprint arXiv …, 2022 - arxiv.org
Humans use all of their senses to accomplish different tasks in everyday activities. In
contrast, existing work on robotic manipulation mostly relies on one, or occasionally two …

Vision-force-fused curriculum learning for robotic contact-rich assembly tasks

P Jin, Y Lin, Y Song, T Li, W Yang - Frontiers in Neurorobotics, 2023 - frontiersin.org
Contact-rich robotic manipulation tasks such as assembly are widely studied due to their
close relevance to social and manufacturing industries. Although the task is highly related …

Learning vision-based pursuit-evasion robot policies

A Bajcsy, A Loquercio, A Kumar… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
Learning strategic robot behavior—like that required in pursuit-evasion interactions—under
real-world constraints is extremely challenging. It requires exploiting the dynamics of the …

High-modality multimodal transformer: Quantifying modality & interaction heterogeneity for high-modality representation learning

PP Liang, Y Lyu, X Fan, J Tsaw, Y Liu, S Mo… - arXiv preprint arXiv …, 2022 - arxiv.org
Many real-world problems are inherently multimodal, from spoken language, gestures, and
paralinguistics humans use to communicate, to force, proprioception, and visual sensors on …

Enhancing state estimation in robots: A data-driven approach with differentiable ensemble Kalman filters

X Liu, G Clark, J Campbell, Y Zhou… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
This paper introduces a novel state estimation framework for robots using differentiable
ensemble Kalman filters (DEnKF). DEnKF is a reformulation of the traditional ensemble …
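Because the snippet only states that DEnKF reformulates the traditional ensemble Kalman filter, the sketch below shows a standard stochastic EnKF on an assumed linear toy system for context; the DEnKF in the paper replaces hand-specified models with learned, differentiable components, which this sketch does not attempt.

```python
# Minimal stochastic ensemble Kalman filter (EnKF) sketch on an assumed
# constant-velocity system with position-only observations.
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # assumed dynamics
H = np.array([[1.0, 0.0]])               # assumed observation matrix
Q = 0.05 * np.eye(2)                     # assumed process noise covariance
R = np.array([[0.5]])                    # assumed observation noise covariance

T, N = 40, 100                           # time steps, ensemble size
x_true = np.array([0.0, 1.0])
ensemble = rng.multivariate_normal(x_true, np.eye(2), size=N)  # shape (N, 2)

for t in range(T):
    # simulate truth and a noisy observation
    x_true = A @ x_true + rng.multivariate_normal(np.zeros(2), Q)
    y = H @ x_true + rng.multivariate_normal(np.zeros(1), R)

    # forecast: propagate every ensemble member through the dynamics
    ensemble = ensemble @ A.T + rng.multivariate_normal(np.zeros(2), Q, size=N)

    # analysis: Kalman gain from the ensemble covariance, perturbed observations
    P = np.cov(ensemble.T)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    y_pert = y + rng.multivariate_normal(np.zeros(1), R, size=N)
    ensemble = ensemble + (y_pert - ensemble @ H.T) @ K.T

print("final estimate:", ensemble.mean(axis=0), "truth:", x_true)
```

Every operation above is differentiable matrix arithmetic, which is what makes the ensemble formulation a natural starting point for the data-driven, end-to-end trained filters the paper describes.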

Intra- and Inter-Modal Curriculum for Multimodal Learning

Y Zhou, X Wang, H Chen, X Duan, W Zhu - Proceedings of the 31st ACM …, 2023 - dl.acm.org
Multimodal learning has been widely studied and applied due to its improvement over
previous unimodal tasks and its effectiveness on emerging multimodal challenges …