Making sense of vision and touch: Self-supervised learning of multimodal representations...

J Ibarz, J Tan, C Finn, M Kalakrishnan… - … Journal of Robotics …, 2021 - journals.sagepub.com

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously
acquiring complex behaviors from low-level sensor observations. Although a large portion of …

Cited by 667 Related articles All 7 versions

[PDF] arxiv.org

Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

PP Liang, A Zadeh, LP Morency - arXiv preprint arXiv:2209.03430, 2022 - arxiv.org

Multimodal machine learning is a vibrant multi-disciplinary research field that aims to design
computer agents with intelligent capabilities such as understanding, reasoning, and learning …

Cited by 147 Related articles All 2 versions

[PDF] thecvf.com

Revisiting self-supervised visual representation learning

A Kolesnikov, X Zhai, L Beyer - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Unsupervised visual representation learning remains a largely unsolved problem in
computer vision research. Among a big body of recently proposed approaches for …

Cited by 903 Related articles All 10 versions

[PDF] arxiv.org

Digit: A novel design for a low-cost compact high-resolution tactile sensor with application to in-hand manipulation

M Lambeta, PW Chou, S Tian, B Yang… - IEEE Robotics and …, 2020 - ieeexplore.ieee.org

Despite decades of research, general purpose in-hand manipulation remains one of the
unsolved challenges of robotics. One of the contributing factors that limit current robotic …

Cited by 506 Related articles All 7 versions

[PDF] arxiv.org

Calvin: A benchmark for language-conditioned policy learning for long-horizon robot manipulation tasks

O Mees, L Hermann, E Rosete-Beas… - IEEE Robotics and …, 2022 - ieeexplore.ieee.org

General-purpose robots coexisting with humans in their environment must learn to relate
human language to their perceptions and actions to be useful in a range of daily tasks …

Cited by 203 Related articles All 5 versions

[PDF] arxiv.org

Vision-based robotic grasping from object localization, object pose estimation to grasp estimation for parallel grippers: a review

G Du, K Wang, S Lian, K Zhao - Artificial Intelligence Review, 2021 - Springer

This paper presents a comprehensive survey on vision-based robotic grasping. We
conclude three key tasks during vision-based robotic grasping, which are object localization …

Cited by 471 Related articles All 7 versions

[HTML] nih.gov

Multibench: Multiscale benchmarks for multimodal representation learning

PP Liang, Y Lyu, X Fan, Z Wu, Y Cheng… - Advances in neural …, 2021 - ncbi.nlm.nih.gov

Learning multimodal representations involves integrating information from multiple
heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world …

Cited by 166 Related articles All 9 versions

[PDF] mdpi.com

Variable compliance control for robotic peg-in-hole assembly: A deep-reinforcement-learning approach

CC Beltran-Hernandez, D Petit, IG Ramirez-Alpizar… - Applied Sciences, 2020 - mdpi.com

Featured Application: Assembly tasks with industrial robot manipulators. Abstract: Industrial
robot manipulators are playing a significant role in modern manufacturing industries …

Cited by 181 Related articles All 12 versions

[PDF] jmlr.org

A review of robot learning for manipulation: Challenges, representations, and algorithms

O Kroemer, S Niekum, G Konidaris - Journal of machine learning research, 2021 - jmlr.org

A key challenge in intelligent robotics is creating robots that are capable of directly
interacting with the world around them to achieve their goals. The last decade has seen …

Cited by 435 Related articles All 18 versions

[PDF] neurips.cc

Incomplete multimodality-diffused emotion recognition

Y Wang, Y Li, Z Cui - Advances in Neural Information …, 2023 - proceedings.neurips.cc

Human multimodal emotion recognition (MER) aims to perceive and understand human
emotions via various heterogeneous modalities, such as language, vision, and acoustic …

Cited by 29 Related articles All 4 versions