Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
This paper considers the problem of Multi-Hop Video Question Answering (MH-VidQA) in
long-form egocentric videos. This task requires not only answering visual questions, but also …
ActionVOS: Actions as Prompts for Video Object Segmentation
Delving into the realm of egocentric vision, the advancement of referring video object
segmentation (RVOS) stands as pivotal in understanding human activities. However …
AMEGO: Active Memory from long EGOcentric videos
Egocentric videos provide a unique perspective into individuals' daily experiences, yet their
unstructured nature presents challenges for perception. In this paper, we introduce AMEGO …