LEMON: Learning 3D Human-Object Interaction Relation from 2D Images
Learning 3D human-object interaction relation is pivotal to embodied AI and interaction
modeling. Most existing methods approach the goal by learning to predict isolated …
Grounded affordance from exocentric view
Affordance grounding aims to locate objects' “action possibilities” regions, an essential step
toward embodied intelligence. Due to the diversity of interactive affordance, i.e., the …
Mambapupil: Bidirectional selective recurrent model for event-based eye tracking
Event-based eye tracking has shown great promise with the high temporal resolution and
low redundancy provided by the event camera. However, the diversity and abruptness of eye …
Tackling Event-Based Lip-Reading by Exploring Multigrained Spatiotemporal Clues
G Tan, Z Wan, Y Wang, Y Cao… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Automatic lip-reading (ALR) is the task of recognizing words based on visual information
obtained from the speaker's lip movements. In this study, we introduce event cameras, a …
Hierarchical home action understanding with implicit and explicit prior knowledge
Existing investigations on action understanding have made noteworthy advancements by
treating activities as holistic events occurring in videos. However, these investigations have …
Branches mutual promotion for end-to-end weakly supervised semantic segmentation
End-to-end weakly supervised semantic segmentation (E2E-WSSS) aims at optimizing a
segmentation model in a single-stage training process based on only image annotations …
Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation
Recent attention has been devoted to the pursuit of learning semantic segmentation models
exclusively from image tags, a paradigm known as image-level Weakly Supervised …
Adaptive Zone Learning for Weakly Supervised Object Localization
Weakly supervised object localization (WSOL) stands as a pivotal endeavor within the realm
of computer vision, entailing the location of objects utilizing merely image-level labels …
Foreground–background separation transformer for weakly supervised surface defect detection
X Jiang, J Feng, F Yan, Y Lu, Q Fa, W Zhang… - Journal of Intelligent …, 2024 - Springer
In industrial scenarios, weakly supervised pixel-level defect detection methods leverage
image-level labels for training, significantly reducing the effort required for manual …
PCSformer: Pair-wise Cross-scale Sub-prototypes mining with CNN-transformers for weakly supervised semantic segmentation
C Liu, Y Shen, Q Xiao, G Li - Neurocomputing, 2024 - Elsevier
Generating initial seeds is an important step in weakly supervised semantic segmentation
(WSSS). Our approach concentrates on generating and refining initial seeds. The …