GraspDiffusion: Synthesizing Realistic Whole-Body Hand-Object Interaction

P Kwon, H Joo - arXiv preprint arXiv:2410.13911, 2024 - arxiv.org
Recent generative models can synthesize high-quality images but often fail to generate
humans interacting with objects using their hands. This arises mostly from the model's …

Multi-Modal Diffusion for Hand-Object Grasp Generation

J Cao, J Liu, K Kitani, Y Zhou - arXiv preprint arXiv:2409.04560, 2024 - arxiv.org
In this work, we focus on generating hand grasp over objects. Compared to previous works
of generating hand poses with a given object, we aim to allow the generalization of both …

3D Whole-body Grasp Synthesis with Directional Controllability

G Paschalidis, R Wilschut, D Antić, O Taheri… - arXiv preprint arXiv …, 2024 - arxiv.org
Synthesizing 3D whole-bodies that realistically grasp objects is useful for animation, mixed
reality, and robotics. This is challenging, because the hands and body need to look natural …

MADiff: Motion-Aware Mamba Diffusion Models for Hand Trajectory Prediction on Egocentric Videos

J Ma, X Chen, W Bao, J Xu, H Wang - arXiv preprint arXiv:2409.02638, 2024 - arxiv.org
Understanding human intentions and actions through egocentric videos is important on the
path to embodied artificial intelligence. As a branch of egocentric vision techniques, hand …

Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

J Ma, J Xu, X Chen, H Wang - arXiv preprint arXiv:2405.04370, 2024 - arxiv.org
Understanding how humans would behave during hand-object interaction is vital for
applications in service robot manipulation and extended reality. To achieve this, some …

InterDyn: Controllable Interactive Dynamics with Video Diffusion Models

R Akkerman, H Feng, MJ Black, D Tzionas… - arXiv preprint arXiv …, 2024 - arxiv.org
Predicting the dynamics of interacting objects is essential for both humans and intelligent
systems. However, existing approaches are limited to simplified, toy settings and lack …

ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping

Y Pang, R Shao, J Zhang, H Tu, Y Liu, B Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce ManiVideo, a novel method for generating consistent and
temporally coherent bimanual hand-object manipulation videos from given motion …

GraspDiff: Grasping Generation for Hand-Object Interaction With Multimodal Guided Diffusion

B Zuo, Z Zhao, W Sun, X Yuan, Z Yu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Grasping generation holds significant importance in both robotics and AI-generated content.
While pure network paradigms based on VAEs or GANs ensure diversity in outcomes, they …

Human Action Anticipation: A Survey

B Lai, S Toyer, T Nagarajan, R Girdhar, S Zha… - arXiv preprint arXiv …, 2024 - arxiv.org
Predicting future human behavior is an increasingly popular topic in computer vision, driven
by the interest in applications such as autonomous vehicles, digital assistants and human …

HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness

Z Xue, M Luo, C Chen, K Grauman - arXiv preprint arXiv:2406.07754, 2024 - arxiv.org
We study the problem of precisely swapping objects in videos, with a focus on those
interacted with by hands, given one user-provided reference object image. Despite the great …