AffordanceLLM: Grounding affordance from vision language models
Affordance grounding refers to the task of finding the area of an object with which one can
interact. It is a fundamental but challenging task as a successful solution requires the …
Locate: Localize and transfer object parts for weakly supervised affordance grounding
Humans excel at acquiring knowledge through observation. For example, we can learn to
use new tools by watching demonstrations. This skill is fundamental for intelligent systems to …
Grounding 3D object affordance from 2D interactions in images
Grounding 3D object affordance seeks to locate objects' "action possibilities" regions in the
3D space, which serves as a link between perception and operation for embodied agents …
One-shot open affordance learning with foundation models
We introduce One-shot Open Affordance Learning (OOAL) where a model is trained
with just one example per base object category but is expected to identify novel objects and …
What does CLIP know about peeling a banana?
Humans show an innate capability to identify tools to support specific actions. The
association between object parts and the actions they facilitate is usually named …
Weakly Supervised Multimodal Affordance Grounding for Egocentric Images
To enhance the interaction between intelligent systems and the environment, locating the
affordance regions of objects is crucial. These regions correspond to specific areas that …
Self-Explainable Affordance Learning with Embodied Caption
In the field of visual affordance learning, previous methods mainly used abundant images or
videos that delineate human behavior patterns to identify action possibility regions for object …
Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding
3D Object Affordance Grounding aims to predict the functional regions on a 3D object and
has laid the foundation for a wide range of applications in robotics. Recent advances tackle …
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding
Affordance denotes the potential interactions inherent in objects. The perception of
affordance can enable intelligent agents to navigate and interact with new environments …
EgoChoir: Capturing 3D Human-Object Interaction Regions from Egocentric Views
Understanding egocentric human-object interaction (HOI) is a fundamental aspect of human-
centric perception, facilitating applications like AR/VR and embodied AI. For the egocentric …